BIROn - Birkbeck Institutional Research Online

    Experience of using SVM for the triage task in TREC 2004 genomics track

    Zhang, Dell and Lee, W.S. (2004) Experience of using SVM for the triage task in TREC 2004 genomics track. In: Voorhees, E.M. and Buckland, L.P. (eds.) TREC 2004: Proceedings of the Thirteenth Text REtrieval Conference. NIST Special Publication 500. The National Institute of Standards and Technology.

    Full text not available from this repository.


    This paper reports our knowledge-ignorant machine learning approach to the triage task in the TREC 2004 Genomics Track, which is essentially a text categorization problem. We applied a Support Vector Machine (SVM) and found that information-gain-based feature selection is helpful. Although we achieved decent performance in leave-one-out cross-validation experiments, the evaluation results on the test data turned out to be surprisingly poor. Further experiments revealed a chasm between the training and test data distributions. It seems that more aggressive feature selection can partially alleviate the trouble caused by this distribution change.
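
The abstract does not say how information gain was computed, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes each document is a set of tokens with a binary relevance label, and scores a term by how much knowing its presence or absence reduces the entropy of the labels. The function names (`entropy`, `information_gain`, `select_features`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Reduction in label entropy from knowing whether `term` occurs."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    conditional = (len(present) / total) * entropy(present) \
        + (len(absent) / total) * entropy(absent)
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Return the k terms with the highest information gain.

    Ties are broken alphabetically because the vocabulary is sorted first
    and Python's sort is stable.
    """
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab,
                    key=lambda t: information_gain(docs, labels, t),
                    reverse=True)
    return ranked[:k]


# Toy data (assumed for illustration): 1 = relevant, 0 = not relevant.
docs = [
    {"gene", "mouse", "mutation"},
    {"gene", "mouse", "phenotype"},
    {"protein", "assay"},
    {"gene", "assay"},
]
labels = [1, 1, 0, 0]

top_terms = select_features(docs, labels, k=2)
```

In this sketch, a smaller `k` corresponds to more aggressive feature selection: fewer terms are kept for the classifier to rely on.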


    Item Type: Book Section
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
    Depositing User: Sarah Hall
    Date Deposited: 15 Nov 2021 15:31
    Last Modified: 09 Aug 2023 12:52

