BIROn - Birkbeck Institutional Research Online

    Learning to see the wood for the trees: machine learning, decision trees and the classification of isolated theropod teeth

    Wills, S. and Underwood, Charlie J. and Barrett, P. (2021) Learning to see the wood for the trees: machine learning, decision trees and the classification of isolated theropod teeth. Palaeontology 64 (1), pp. 75-99. ISSN 0031-0239.

    40914.pdf - Author's Accepted Manuscript



    Taxonomic identification of fossils based on morphometric data traditionally relies on the use of standard linear models to classify such data. Machine learning and decision trees offer powerful alternative approaches to this problem but are not widely used in palaeontology. Here, we apply these techniques to published morphometric data of isolated theropod teeth in order to explore their utility in tackling taxonomic problems. We chose two published datasets consisting of 886 teeth from 14 taxa and 3020 teeth from 17 taxa, respectively, each with five morphometric variables per tooth. We also explored the effects that missing data have on the final classification accuracy. Our results suggest that machine learning and decision trees yield superior classification results over a wide range of data permutations, with decision trees achieving accuracies of 96% in classifying test data in some cases. Missing data, or attempts to generate synthetic data to compensate for them, seriously degrade the predictive accuracy of all classifiers. Our analyses also indicate that ensemble classifiers combining different classification techniques, together with the examination of posterior probabilities, are a useful aid in checking final class assignments. The application of such techniques to isolated theropod teeth demonstrates that simple morphometric data can be used to yield statistically robust taxonomic classifications and that lower classification accuracy is more likely to reflect preservational limitations of the data or poor application of the methods.


    Item Type: Article
    Additional Information: This is the peer reviewed version of the article, which has been published in final form at the link above. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.
    Keyword(s) / Subject(s): machine learning, discriminant analysis, decision trees, classification, Theropoda, teeth
    School: Birkbeck Faculties and Schools > Faculty of Science > School of Natural Sciences
    Research Centres and Institutes: Earth and Planetary Sciences, Institute of
    Depositing User: Charles Underwood
    Date Deposited: 07 Oct 2020 11:03
    Last Modified: 02 Aug 2023 18:04

