BIROn - Birkbeck Institutional Research Online

    An empirical evaluation of hierarchical feature selection methods for classification in Bioinformatics datasets with gene ontology-based features

    Wan, Cen and Freitas, A. (2017) An empirical evaluation of hierarchical feature selection methods for classification in Bioinformatics datasets with gene ontology-based features. Artificial Intelligence Review, ISSN 0269-2821.

    AIReview.pdf - Author's Accepted Manuscript


    Abstract

    Hierarchical feature selection is a new research area in machine learning/data mining, which consists of performing feature selection by exploiting dependency relationships among hierarchically structured features. This paper evaluates four hierarchical feature selection methods, i.e., HIP, MR, SHSEL and GTD, used together with four types of lazy learning-based classifiers, i.e., Naïve Bayes, Tree Augmented Naïve Bayes, Bayesian Network Augmented Naïve Bayes and k-Nearest Neighbors classifiers. These four hierarchical feature selection methods are compared with each other and with a well-known “flat” feature selection method, i.e., Correlation-based Feature Selection. The adopted bioinformatics datasets consist of aging-related genes used as instances and Gene Ontology terms used as hierarchical features. The experimental results reveal that the HIP (Select Hierarchical Information Preserving Features) method performs best overall, in terms of predictive accuracy and robustness when coping with data where the instances’ classes have a substantially imbalanced distribution. This paper also reports a list of the Gene Ontology terms that were most often selected by the HIP method.

    Metadata

    Item Type: Article
    Additional Information: The final publication is available at Springer via the link above.
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
    Research Centres and Institutes: Bioinformatics, Bloomsbury Centre for (Closed)
    Depositing User: Cen Wan
    Date Deposited: 31 Jan 2020 09:13
    Last Modified: 09 Aug 2023 12:47
    URI: https://eprints.bbk.ac.uk/id/eprint/30710
