Bayesian performance comparison of text classifiers
Zhang, Dell and Wang, Jun and Yilmaz, Emine and Wang, Xiaoling and Yuxin, Zhou (2016) Bayesian performance comparison of text classifiers. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR). New York, U.S.: Association for Computing Machinery, pp. 15-24. ISBN 9781450340694.
Abstract
How can we know whether one classifier is really better than the other? In the area of text classification, since the publication of Yang and Liu's seminal SIGIR-1999 paper, it has become standard practice for researchers to apply null-hypothesis significance testing (NHST) to their experimental results in order to establish the superiority of a classifier. However, such a frequentist approach has a number of inherent deficiencies and limitations, e.g., the inability to accept the null hypothesis (that the two classifiers perform equally well) and the difficulty of comparing commonly used multivariate performance measures like F1 scores instead of accuracy. In this paper, we propose a novel Bayesian approach to the performance comparison of text classifiers, and argue its advantages over the traditional frequentist approach based on tests such as the t-test. In contrast to the existing probabilistic model for F1 scores, which is unpaired, our proposed model takes the correlation between classifiers into account and thus achieves greater statistical power. Using several typical text classification algorithms and a benchmark dataset, we demonstrate that our approach provides rich information about the difference between two classifiers' performances.
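To make the idea of a paired Bayesian comparison of F1 scores concrete, the following is a minimal sketch and not the authors' model. It assumes binary classification, a symmetric Dirichlet prior over the eight joint outcomes (true label, prediction of classifier A, prediction of classifier B), and a "region of practical equivalence" half-width `rope`; the function names, the prior, the `rope` value, and the counts are all hypothetical choices made for illustration. Because both classifiers are scored on the same sampled joint distribution, their correlation is carried through to the posterior of the F1 difference, and the posterior mass inside the equivalence region gives a direct way to support "the two classifiers perform equally well".

```python
import random

# The 8 joint outcomes (true label y, prediction of A, prediction of B), each in {0, 1}.
CELLS = [(y, a, b) for y in (0, 1) for a in (0, 1) for b in (0, 1)]


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalised Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def f1(tp, fp, fn):
    """F1 score from true-positive, false-positive and false-negative mass."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


def posterior_f1_difference(counts, n_samples=5000, prior=1.0, rope=0.01, seed=0):
    """Monte Carlo posterior of F1(A) - F1(B) under a Dirichlet-multinomial model.

    counts maps (y, a, b) to the number of test documents with that joint outcome.
    prior and rope are illustrative assumptions, not values taken from the paper.
    """
    rng = random.Random(seed)
    alphas = [counts.get(cell, 0) + prior for cell in CELLS]
    deltas = []
    for _ in range(n_samples):
        p = dict(zip(CELLS, sample_dirichlet(alphas, rng)))
        # Marginalise the joint outcome probabilities for each classifier.
        tp_a = sum(v for (y, a, b), v in p.items() if y == 1 and a == 1)
        fn_a = sum(v for (y, a, b), v in p.items() if y == 1 and a == 0)
        fp_a = sum(v for (y, a, b), v in p.items() if y == 0 and a == 1)
        tp_b = sum(v for (y, a, b), v in p.items() if y == 1 and b == 1)
        fn_b = sum(v for (y, a, b), v in p.items() if y == 1 and b == 0)
        fp_b = sum(v for (y, a, b), v in p.items() if y == 0 and b == 1)
        deltas.append(f1(tp_a, fp_a, fn_a) - f1(tp_b, fp_b, fn_b))
    n = len(deltas)
    return {
        "mean": sum(deltas) / n,
        "p_a_better": sum(d > rope for d in deltas) / n,
        "p_equivalent": sum(-rope <= d <= rope for d in deltas) / n,
        "p_b_better": sum(d < -rope for d in deltas) / n,
    }


# Made-up counts for 300 test documents, purely for illustration.
counts = {
    (1, 1, 1): 60, (1, 1, 0): 30, (1, 0, 1): 5, (1, 0, 0): 5,
    (0, 0, 0): 180, (0, 1, 0): 5, (0, 0, 1): 15, (0, 1, 1): 5,
}
result = posterior_f1_difference(counts)
print(result)
```

The three probabilities returned partition the posterior, so a large `p_equivalent` would count as evidence for practical equivalence, something a failure to reject the null hypothesis under NHST cannot provide.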
Metadata
Item Type: | Book Section |
---|---|
Additional Information: | SIGIR '16 The 39th International ACM SIGIR conference on research and development in Information Retrieval; Pisa, Italy — July 17 - 21, 2016 |
Keyword(s) / Subject(s): | Text Classification, Performance Evaluation, Hypothesis Testing, Bayesian Inference |
School: | Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences |
Research Centres and Institutes: | Birkbeck Knowledge Lab, Birkbeck Institute for Data Analytics |
Depositing User: | Dell Zhang |
Date Deposited: | 23 Aug 2016 10:21 |
Last Modified: | 09 Aug 2023 12:37 |
URI: | https://eprints.bbk.ac.uk/id/eprint/14878 |