---
layout: post
status: publish
published: true
title: Typesetting JATS bibliographies using CSL and Zotero
wordpress_id: 3139
wordpress_url: https://www.martineve.com/?p=3139
date: !binary |-
  MjAxNC0wNi0yMiAwODowODozMyArMDIwMA==
date_gmt: !binary |-
  MjAxNC0wNi0yMiAwNzowODozMyArMDIwMA==
categories:
- Technology
- Open Access
- Academia
tags:
- XML
- OA
- JATS
comments: []
---

One of the hardest parts of typesetting articles for scholarly publication in the JATS standard, especially when using homemade tools, is the bibliography. JATS (and its NLM predecessors) expects references to be broken down into their constituent components where possible in order to be semantically rich. For example:

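{% highlight xml %}
<ref>
  <element-citation>
    <person-group person-group-type="author">
      <name><surname>Tyler</surname><given-names>Royall</given-names></name>
    </person-group>
    <article-title>The Contrast</article-title>
    <source>The Norton Anthology of American Literature, Vol. A: Beginnings to 1820</source>
    <person-group person-group-type="editor">
      <name><surname>Franklin</surname><given-names>Wayne</given-names></name>
      <name><surname>Gura</surname><given-names>Philip F.</given-names></name>
      <name><surname>Krupat</surname><given-names>Arnold</given-names></name>
      <name><surname>Baym</surname><given-names>Nina</given-names></name>
    </person-group>
    <publisher-name>W. W. Norton &amp; Company</publisher-name>
    <publisher-loc>New York</publisher-loc>
    <fpage>765</fpage>
    <lpage>805</lpage>
  </element-citation>
</ref>
{% endhighlight %}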

This is all very well, but it also creates a problem: how do we get from the author's plaintext citation to this structured format? Parsing references is hard. Very hard. My closest effort so far has been a cascading regular expression engine, meCite, to which anybody is welcome to contribute. I do intend to do more on this at some point.
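To give a flavour of what "cascading" means here, the following is a minimal sketch of the general idea and not meCite's actual code: a list of patterns is tried from most to least specific, and the first one that matches wins. The two patterns, the field names and the `parse_reference` helper are all hypothetical and only cover two narrowly formatted citation styles.

```python
import re
from typing import Dict, Optional

# Hypothetical cascade: the most specific pattern comes first.
CASCADE = [
    # e.g. Smith, J. (2010) 'An example article', Journal of Examples, 12(3), pp. 45-67.
    ("journal", re.compile(
        r"^(?P<surname>[^,]+),\s*(?P<given>[^.(]+?)\.?\s*"
        r"\((?P<year>\d{4})\)\s*"
        r"'(?P<title>[^']+)',\s*"
        r"(?P<source>[^,]+),\s*"
        r"(?P<volume>\d+)\((?P<issue>\d+)\),\s*"
        r"pp\.\s*(?P<fpage>\d+)-(?P<lpage>\d+)\.?$"
    )),
    # e.g. Smith, Jane. An Example Book. London: Example Press, 2010.
    ("book", re.compile(
        r"^(?P<surname>[^,]+),\s*(?P<given>[^.]+)\.\s*"
        r"(?P<source>[^.]+)\.\s*"
        r"(?P<loc>[^:]+):\s*(?P<publisher>[^,]+),\s*"
        r"(?P<year>\d{4})\.?$"
    )),
]


def parse_reference(text: str) -> Optional[Dict[str, str]]:
    """Return the fields of the first matching pattern, or None if nothing matches."""
    cleaned = text.strip()
    for kind, pattern in CASCADE:
        match = pattern.match(cleaned)
        if match:
            fields = match.groupdict()
            fields["type"] = kind
            return fields
    # Nothing matched: the caller has to keep the citation as unstructured text.
    return None
```

The weakness is obvious from the sketch: every new citation style, and every author who deviates slightly from a style, needs another pattern in the cascade.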

Late last year, however, Martin Fenner was investigating whether CSL could be used to generate a JATS bibliography. His work was in the pandoc-JATS repository. That work stopped following a discussion on the xbiblio mailing list, where it was decided that CSL was not ideal for generating structured XML.

This may be true. However, there is a lack of viable alternatives for typesetting references. Furthermore, Zotero and Mendeley (both of which use CSL to generate their citations) have vast, publicly available databases of the scholarly literature. If we could use CSL to generate valid JATS XML, this would substantially reduce the time needed to typeset a JATS bibliography. To that end, I have taken on maintenance of a fork of Martin's original efforts. Last night, with the first commits, I fixed DOI display, added book chapter support, added support for editors and changed the book title field to the correct "source" implementation. My fork can be found at the JATS-CSL repo.
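To make the target of that mapping concrete, here is a rough Python sketch of the same transformation done by hand. It is not the CSL style from the repository. It assumes the reference is available as a CSL-JSON-like dictionary (the kind of record Zotero can export), and the `csl_to_element_citation` function and its choice of JATS elements are my own illustration of the fields mentioned above: authors, editors, chapter title, "source", publisher, pages and DOI.

```python
import xml.etree.ElementTree as ET


def csl_to_element_citation(item: dict) -> ET.Element:
    """Build a JATS <element-citation> from a CSL-JSON-like dict (illustrative only)."""
    citation = ET.Element("element-citation")

    def add_people(role: str) -> None:
        people = item.get(role, [])
        if not people:
            return
        group = ET.SubElement(citation, "person-group", {"person-group-type": role})
        for person in people:
            name = ET.SubElement(group, "name")
            ET.SubElement(name, "surname").text = person.get("family", "")
            ET.SubElement(name, "given-names").text = person.get("given", "")

    add_people("author")

    title = item.get("title")
    container = item.get("container-title")
    if container:
        # Chapter in a book: the chapter is the title, the book is the source.
        if title:
            ET.SubElement(citation, "article-title").text = title
        ET.SubElement(citation, "source").text = container
    elif title:
        # Whole book: the book title itself belongs in <source>.
        ET.SubElement(citation, "source").text = title

    add_people("editor")

    if item.get("publisher"):
        ET.SubElement(citation, "publisher-name").text = item["publisher"]
    if item.get("publisher-place"):
        ET.SubElement(citation, "publisher-loc").text = item["publisher-place"]

    page = item.get("page", "")
    if page:
        # Accept either a hyphen or an en dash between first and last page.
        first, _, last = page.replace("\u2013", "-").partition("-")
        ET.SubElement(citation, "fpage").text = first.strip()
        if last.strip():
            ET.SubElement(citation, "lpage").text = last.strip()

    if item.get("DOI"):
        ET.SubElement(citation, "pub-id", {"pub-id-type": "doi"}).text = item["DOI"]

    return citation


example = {
    "type": "chapter",
    "title": "The Contrast",
    "container-title": "The Norton Anthology of American Literature, Vol. A: Beginnings to 1820",
    "author": [{"family": "Tyler", "given": "Royall"}],
    "editor": [
        {"family": "Franklin", "given": "Wayne"},
        {"family": "Gura", "given": "Philip F."},
        {"family": "Krupat", "given": "Arnold"},
        {"family": "Baym", "given": "Nina"},
    ],
    "publisher": "W. W. Norton & Company",
    "publisher-place": "New York",
    "page": "765-805",
}

print(ET.tostring(csl_to_element_citation(example), encoding="unicode"))
```

Doing this inside a CSL style instead means the same field-to-element decisions are expressed as CSL macros and layout rules, which is what lets Zotero produce the XML directly.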

While the approach may not be recommended, it is far better than nothing and I'll push it as far as I can in an effort to save some time!