---
layout: post
status: publish
published: true
title: Crude, but helpful, typesetting script from meXml
wordpress_id: 2440
wordpress_url: https://www.martineve.com/?p=2440
date: !binary |-
  MjAxMi0xMC0xOSAxOTowODowOCArMDIwMA==
date_gmt: !binary |-
  MjAxMi0xMC0xOSAxODowODowOCArMDIwMA==
categories:
- Technology
- Open Access
- Academia
- Output
tags:
- OA
- tools
comments: []
---

In my quest to create a set of free and open tools for platinum, scholar-run OA journals, I've just committed a crude, provisional script to my meXml git repository that assists with typesetting into pseudo-NLM format.

A few notes. First of all, what does it do? The script parses the markup output by the wysihtml5 tool and converts it into near-as-damnit the format I need for typesetting. The idea is that I paste a LibreOffice document (with endnotes, not footnotes) into the tool and grab the markup it returns. This Python script then takes that markup one step further, into the format that I need to generate galleys for Orbit with the /tools/gengalleys.sh tool. I still have to clean up the markup by hand, but this has cut the time it takes to typeset an article from about 6-8 hours to about 2 hours.
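To give a rough idea of the kind of conversion involved, here is a minimal sketch. It is not the script in the meXml repository: the tag mapping, the class name `NlmSketchConverter` and the function `convert` are all assumptions made up for illustration. It only renames a handful of inline HTML tags to their NLM-style equivalents (`<italic>`, `<bold>`) and drops any tag it does not recognise while keeping the text.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape

# Hypothetical mapping from editor HTML tags to NLM-style element names.
# The real script's mapping may differ; this is for illustration only.
TAG_MAP = {
    "p": "p",
    "i": "italic",
    "em": "italic",
    "b": "bold",
    "strong": "bold",
}


class NlmSketchConverter(HTMLParser):
    """Rename known tags, drop unknown ones, and keep all text content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        mapped = TAG_MAP.get(tag)
        if mapped:
            # Attributes (class, style, ...) are deliberately discarded.
            self.parts.append(f"<{mapped}>")

    def handle_endtag(self, tag):
        mapped = TAG_MAP.get(tag)
        if mapped:
            self.parts.append(f"</{mapped}>")

    def handle_data(self, data):
        # Re-escape &, < and > so the output stays well-formed XML.
        self.parts.append(escape(data))


def convert(html):
    """Convert a fragment of editor HTML into pseudo-NLM markup."""
    parser = NlmSketchConverter()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    print(convert('<p>Some <em>emphasised</em> and <b>bold</b> text.</p>'))
```

A sketch like this leaves the hard parts (endnotes, sections, references) untouched, which is exactly where the manual clean-up time goes.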

Is this the best way to do it? Almost certainly not. Why? Because the proper way would be to create a set of LibreOffice styles (or whatever mechanism it uses) and then get LibreOffice itself to output directly in the format that's needed. It's also the case that my implementation (at present) breaks compatibility with NLM standards. This is something that I aim to fix gradually; I needed to break it at the time to Just Get It Done (trademark).

Is the script complete? Nope. It's pretty sloppy coding, too. Again, this is a dirty hack that I've committed in case it helps somebody.

Disclaimers done: enjoy!