---
layout: post
title: "The methodology of literary taxonomy: HARKing and p-hacking"
categories: [literature, methodology]
tags: [literature, methodology]
published: True
---

A central anxiety for literary studies in the era of scientific dominance pertains to the extent to which groupings, taxonomies and classifications are methodologically derived and how far they help us to understand literary production. We all invariably use and create such classifications as terminological shorthands. But literary taxonomies are generated _post hoc_ – formulated in the light of observation, rather (usually) than being hypothesized and then confirmed by observation. This was recently described to me by one of my scientific colleagues as HARKing: [Hypothesizing After Results are Known](http://dx.doi.org/10.1207/s15327957pspr0203_4). The logic here runs that a hypothesis should not be formulated by recourse to the data against which it will be tested, since this can only ever lead to the hypothesis being confirmed.

The other related methodological “flaw”, at least so far as those versed in scientific methods would see it, is that commonalities between texts are created by _ex post facto_ subgroup analyses. Rather than, say, positing a causal relationship that might give predictive force to measurable stylometric and thematic contents across all works, classifications are first read out of a corpus and then the data are dredged to select only works that exhibit such characteristics. In other words, again, any “hypothesis” or theory here contains all the data that could also confirm it; a type of circular “p-hacking”, as the practice is known. (The toy sketch at the end of this post illustrates the circularity.)

But literary studies is not science. These methods might not pass muster in a laboratory or a clinical trial, but they have resulted in startling critical insights and fruitful groupings of texts. I think of fields well known to me and the descriptions, say, of historiographic metafiction that can usefully refer to clusters of works. This is probably because, although statistical methods can be applied to literary works, literary works are unique and non-repeatable, and the goal of literary criticism is not, in every case, to determine a causal sequence between material conditions and production. The one-time classification of literary works from a single dataset is not always (or indeed usually) meant to underwrite future prediction but to understand past production profitably.

Indeed, criticisms of a limited corpus aside, an accurately drawn taxonomy would already have used the entire available dataset and would, therefore, be using the only source that could either confirm or deny its truth. Accusations of HARKing and p-hacking are only valid within sampling or predictive environments and so do not frequently apply to the work of literary studies.

And yet, a voice bugs me: we do sample in literary studies. As ever, there is always too much to read. We resort to case studies to emblematically demonstrate a broader point. As computational, quantified and scientistic approaches to literary study continue to gain traction, I suspect that this methodological debate will only grow louder.
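
For the statistically curious, here is a minimal sketch of that circularity in Python. Every detail is invented for illustration: the 200 “works”, the single random “stylometric” feature and the 0.7 cutoff are assumptions, not measurements from any real corpus. The “movement” is defined by selecting the high-scoring works, so “testing” whether the movement scores highly on that same feature cannot fail.

```python
# Toy illustration of circular subgroup selection ("p-hacking"/HARKing).
# All numbers here are invented for illustration; nothing is drawn from
# a real corpus or a real study.

import random
import statistics

random.seed(0)

# 200 "works", each with one purely random feature score in [0, 1]
# (think: the scaled frequency of some stylistic marker).
corpus = [random.random() for _ in range(200)]

# HARKing step: look at the data first, then define the "movement"
# as exactly the works that score highly on the feature.
movement = [w for w in corpus if w > 0.7]
others = [w for w in corpus if w <= 0.7]

# "Testing" the hypothesis on the same data that generated it: the
# difference in means is guaranteed, because the grouping was
# constructed from the very feature being compared.
print(f"movement mean: {statistics.mean(movement):.3f}")
print(f"others mean:   {statistics.mean(others):.3f}")
```

With this seed the “movement” mean comes out around 0.85 against roughly 0.35 for the rest: an apparently striking grouping conjured entirely from noise, which is precisely the force of the objection sketched above.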