BIROn - Birkbeck Institutional Research Online

    Protein function prediction is improved by creating synthetic feature samples with generative adversarial networks

    Wan, Cen and Jones, D.T. (2020) Protein function prediction is improved by creating synthetic feature samples with generative adversarial networks. Nature Machine Intelligence 2, pp. 540-550. ISSN 2522-5839.

    FFPredGAN.pdf - Author's Accepted Manuscript


    Abstract

    Protein function prediction is a challenging but important task in bioinformatics. Many prediction methods have been developed, but their accuracy is still limited by the small number of available training samples. It is therefore valuable to develop a data augmentation method that generates high-quality synthetic samples to improve prediction accuracy. In this work, we propose a novel generative adversarial network-based method, FFPred-GAN, that accurately learns the high-dimensional distributions of protein sequence-based biophysical features and generates high-quality synthetic protein feature samples. The experimental results suggest that augmenting the original training protein feature samples with these synthetic samples improves prediction accuracy for all three domains of the Gene Ontology.
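
    The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the FFPred-GAN implementation: it assumes the synthetic feature vectors have already been produced by some trained generator for one class, and only shows how they might be appended to the real training set. The function name augment_training_set, the three-feature toy vectors, and the 0/1 labels are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

FeatureVector = Sequence[float]


def augment_training_set(
    real_features: List[FeatureVector],
    real_labels: List[int],
    synthetic_features: List[FeatureVector],
    synthetic_label: int = 1,
) -> Tuple[List[FeatureVector], List[int]]:
    """Return the real training set extended with synthetic samples.

    Assumption: every synthetic vector comes from a generator trained on real
    samples of the class `synthetic_label`, so it is given that label.
    """
    if len(real_features) != len(real_labels):
        raise ValueError("real_features and real_labels differ in length")
    if not real_features:
        raise ValueError("need at least one real sample")
    n_dims = len(real_features[0])
    for vec in list(real_features) + list(synthetic_features):
        if len(vec) != n_dims:
            raise ValueError("all feature vectors must have the same length")
    # Real samples first, synthetic samples appended; the inputs are not modified.
    features = list(real_features) + list(synthetic_features)
    labels = list(real_labels) + [synthetic_label] * len(synthetic_features)
    return features, labels


# Toy data (hypothetical): two real proteins with three features each,
# one positive and one negative, plus two synthetic positive samples.
real_X = [[0.1, 0.5, 0.9], [0.8, 0.2, 0.3]]
real_y = [1, 0]
synthetic_X = [[0.15, 0.45, 0.85], [0.05, 0.55, 0.95]]

aug_X, aug_y = augment_training_set(real_X, real_y, synthetic_X)
print(len(aug_X), aug_y)  # prints: 4 [1, 0, 1, 1]
```

    A downstream classifier would then be trained on the augmented set instead of the original one. How the synthetic samples are generated and selected is left out of this sketch.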

    Metadata

    Item Type: Article
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences
    Depositing User: Cen Wan
    Date Deposited: 02 Dec 2020 17:33
    Last Modified: 09 Aug 2023 12:48
    URI: https://eprints.bbk.ac.uk/id/eprint/32666

