BIROn - Birkbeck Institutional Research Online

    Text classification with kernels on the multinomial manifold

    Zhang, Dell and Chen, X. and Lee, W.S. (2005) Text classification with kernels on the multinomial manifold. In: Baeza-Yates, R.A. and Ziviani, N. and Marchionini, G. and Moffat, A. and Tait, J. (eds.) SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, pp. 266-273. ISBN 9781595930347.

    Full text not available from this repository.

    Abstract

    Support Vector Machines (SVMs) have been very successful in text classification. However, the standard kernels commonly used in SVMs ignore the intrinsic geometric structure of text data. It is natural to assume that documents lie on the multinomial manifold, which is the simplex of multinomial models furnished with the Riemannian structure induced by the Fisher information metric. We prove that the Negative Geodesic Distance (NGD) on the multinomial manifold is conditionally positive definite (cpd), and can therefore be used as a kernel in SVMs. Experiments show the NGD kernel on the multinomial manifold to be effective for text classification, significantly outperforming standard kernels on the ambient Euclidean space.
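
    The kernel described in the abstract can be sketched in a few lines. The sketch below assumes the standard closed form for the Fisher geodesic distance on the simplex, d(p, q) = 2 arccos(sum_i sqrt(p_i q_i)), and assumes that documents are mapped to the simplex by L1-normalizing raw term counts; the record does not state the embedding, and the function names (to_simplex, geodesic_distance, ngd_kernel, gram_matrix) are illustrative rather than taken from the paper.

```python
import math


def to_simplex(counts):
    """Map raw term counts to a point on the simplex (assumed L1 normalization)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("document has no terms")
    return [c / total for c in counts]


def geodesic_distance(p, q):
    """Fisher geodesic distance between two multinomial parameter vectors."""
    inner = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp to [0, 1] so rounding error cannot push acos out of its domain.
    inner = min(1.0, max(0.0, inner))
    return 2.0 * math.acos(inner)


def ngd_kernel(p, q):
    """Negative Geodesic Distance kernel value for two simplex points."""
    return -geodesic_distance(p, q)


def gram_matrix(docs):
    """Pairwise NGD kernel values for a list of term-count vectors."""
    points = [to_simplex(d) for d in docs]
    return [[ngd_kernel(p, q) for q in points] for p in points]


if __name__ == "__main__":
    # Three toy documents over a three-term vocabulary (made-up counts).
    docs = [[2, 1, 0], [0, 0, 3], [4, 2, 0]]
    for row in gram_matrix(docs):
        print(["%.4f" % value for value in row])
```

    A Gram matrix built this way could be passed to any SVM implementation that accepts precomputed kernels. Documents that share no terms sit at the maximum distance pi, so their kernel value is -pi, while documents with identical term proportions have a kernel value of 0.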

    Metadata

    Item Type: Book Section
    School: School of Business, Economics & Informatics > Computer Science and Information Systems
    Depositing User: Sarah Hall
    Date Deposited: 15 Nov 2021 14:15
    Last Modified: 15 Nov 2021 14:15
    URI: https://eprints.bbk.ac.uk/id/eprint/46726
