Levene, Mark and Wood, P.T. (2002) XML structure compression. CEUR Workshop Proceedings 702, pp. 56-69. ISSN 1613-0073.
Abstract
XML is becoming the universal language for communicating information on the Web and has gained wide acceptance through its standardisation. As such, XML plays an important enabling role for dynamic computation over the Web. Compression of XML documents is crucial in this process because, in their raw form, they often contain a sizable amount of redundancy. Several XML compression algorithms have been proposed, but none make use of the DTD when it is available. Here we present a novel compression algorithm for XML documents that conform to a given DTD; the algorithm separates the document’s structure from its data, taking advantage of the regular structure of XML elements. Our approach seems promising, as we are able to show that it minimises the length of the encoding under the assumption that document elements are independent of each other. Our presentation is a preliminary investigation; it remains to carry out experiments to validate our approach on real data.
Metadata
Item Type: | Article |
---|---|
School: | Birkbeck Faculties and Schools > Faculty of Science > School of Computing and Mathematical Sciences |
Depositing User: | Sarah Hall |
Date Deposited: | 01 Jun 2021 17:39 |
Last Modified: | 09 Aug 2023 12:51 |
URI: | https://eprints.bbk.ac.uk/id/eprint/44565 |