BIROn - Birkbeck Institutional Research Online

    Semantic, hierarchical, online clustering of web search results

    Zhang, Dell and Dong, Y. (2004) Semantic, hierarchical, online clustering of web search results. In: Yu, J.X. and Lin, X. and Lu, H. and Zhang, Y. (eds.) APWeb 2004: Advanced Web Technologies and Applications. Lecture Notes in Computer Science 3007. Springer, pp. 69-78. ISBN 9783540213710.

    Full text not available from this repository.

    Abstract

    We propose a Semantic, Hierarchical, Online Clustering (SHOC) approach to automatically organizing Web search results into groups. SHOC combines two novel techniques, key phrase discovery and orthogonal clustering, to generate clusters that are both reasonable and readable. Moreover, SHOC works for multiple languages: not only English but also oriental languages such as Chinese. The main contributions of this paper are as follows. (1) The benefits of using key phrases as Web document features are discussed, and a key phrase discovery algorithm based on suffix arrays is presented. This algorithm is effective and efficient regardless of the size of the language's alphabet. (2) The concept of orthogonal clustering is proposed for general clustering problems, and it is rigorously proved that the matrix Singular Value Decomposition (SVD) provides a solution to orthogonal clustering. Orthogonal clustering has a solid mathematical foundation and many advantages over traditional heuristic clustering algorithms.
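
As a rough illustration of the suffix-array idea mentioned in the abstract, the sketch below finds word sequences that repeat in a token list. It is not the paper's algorithm: the function names `build_suffix_array` and `repeated_phrases`, the parameters `min_len` and `min_count`, and the naive sorting-based construction are all assumptions made for this example, and the paper's own criteria for what counts as a key phrase are not reproduced here.

```python
def build_suffix_array(tokens):
    """Return suffix start positions, sorted by the suffix each one begins.

    Naive construction for illustration only (hypothetical helper);
    practical systems use faster suffix-array algorithms.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def repeated_phrases(tokens, min_len=2, min_count=2):
    """Return {phrase: count} for token n-grams seen at least min_count times."""
    sa = build_suffix_array(tokens)

    # lcp[k] = number of leading tokens shared by suffixes sa[k-1] and sa[k]
    lcp = [0] * len(sa)
    for k in range(1, len(sa)):
        a, b = tokens[sa[k - 1]:], tokens[sa[k]:]
        n = 0
        while n < len(a) and n < len(b) and a[n] == b[n]:
            n += 1
        lcp[k] = n

    counts = {}
    for length in range(min_len, max(lcp, default=0) + 1):
        k = 1
        while k < len(sa):
            if lcp[k] >= length:
                # A run of adjacent suffixes sharing `length` tokens is one phrase.
                start = k - 1
                while k < len(sa) and lcp[k] >= length:
                    k += 1
                occurrences = k - start
                if occurrences >= min_count:
                    phrase = tuple(tokens[sa[start]:sa[start] + length])
                    counts[phrase] = occurrences
            else:
                k += 1
    return counts


if __name__ == "__main__":
    text = ("suffix array search results suffix array "
            "clustering search results")
    print(repeated_phrases(text.split()))
```

Because the functions operate on any list of comparable items, passing a list of characters instead of words (for example `list(some_chinese_text)`) would work without a word segmenter, which is one way to read the abstract's claim about alphabet size; that reading is an assumption, not a statement from the paper.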

    Metadata

    Item Type: Book Section
    School: School of Business, Economics & Informatics > Computer Science and Information Systems
    Depositing User: Sarah Hall
    Date Deposited: 15 Nov 2021 14:39
    Last Modified: 15 Nov 2021 14:39
    URI: https://eprints.bbk.ac.uk/id/eprint/46728

