BIROn - Birkbeck Institutional Research Online

    Text classification with kernels on the multinomial manifold

    Zhang, Dell and Chen, X. and Lee, W.S. (2005) Text classification with kernels on the multinomial manifold. In: Baeza-Yates, R.A. and Ziviani, N. and Marchionini, G. and Moffat, A. and Tait, J. (eds.) SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, pp. 266-273. ISBN 9781595930347.

    Full text not available from this repository.


    Support Vector Machines (SVMs) have been very successful in text classification. However, the standard kernels commonly used in SVMs ignore the intrinsic geometric structure of text data. It is natural to assume that documents lie on the multinomial manifold, which is the simplex of multinomial models equipped with the Riemannian structure induced by the Fisher information metric. We prove that the Negative Geodesic Distance (NGD) on the multinomial manifold is conditionally positive definite (cpd), and can therefore be used as a kernel in SVMs. Experiments show that the NGD kernel on the multinomial manifold is effective for text classification, significantly outperforming standard kernels on the ambient Euclidean space.
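To make the abstract's central object concrete, the following is a minimal sketch of a negative geodesic distance kernel, not the authors' implementation. It relies on the standard closed form for the Fisher geodesic distance between two points p and q on the multinomial simplex, d(p, q) = 2 arccos(sum_i sqrt(p_i q_i)). The function names and the mapping from raw term counts to a multinomial point by plain L1 normalisation are assumptions made for illustration; this record does not say which document representation or smoothing the paper uses.

```python
import math


def to_multinomial(counts):
    """Map raw term counts to a point on the simplex (assumed: plain L1 normalisation)."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("document must contain at least one term")
    return [c / total for c in counts]


def geodesic_distance(p, q):
    """Fisher geodesic distance on the simplex: 2 * arccos(sum_i sqrt(p_i * q_i))."""
    inner = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    # Clamp to [0, 1] so rounding error cannot push acos out of its domain.
    inner = min(1.0, max(0.0, inner))
    return 2.0 * math.acos(inner)


def ngd_kernel(p, q):
    """Negative geodesic distance between two simplex points."""
    return -geodesic_distance(p, q)


def gram_matrix(docs):
    """Pairwise NGD kernel values for a list of term-count vectors."""
    points = [to_multinomial(d) for d in docs]
    return [[ngd_kernel(a, b) for b in points] for a in points]


if __name__ == "__main__":
    # Hypothetical toy corpus: three documents over a three-term vocabulary.
    docs = [[3, 1, 0], [1, 3, 0], [0, 0, 4]]
    for row in gram_matrix(docs):
        print([round(v, 4) for v in row])
```

A Gram matrix built this way could be passed to an SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. Whether that matches the experimental setup in the paper is not stated in this record.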


    Item Type: Book Section
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
    Depositing User: Sarah Hall
    Date Deposited: 15 Nov 2021 14:15
    Last Modified: 09 Aug 2023 12:52

