Brown, C.T. and Moritz, D. and O'Brien, M.P. and Reidl, Felix and Reiter, T. and Sullivan, B.D. (2020) Exploring neighborhoods in large metagenome assembly graphs reveals hidden sequence diversity. Genome Biology 21 (164), ISSN 1474-760X.
Text
main.pdf - Author's Accepted Manuscript. Restricted to Repository staff only (845kB).
Text
32137a.pdf - Published Version of Record. Available under License Creative Commons Attribution (4MB).
Abstract
Genomes computationally inferred from large metagenomic data sets are often incomplete and may be missing functionally important content and strain variation. We introduce an information retrieval system for large metagenomic data sets that exploits the sparsity of DNA assembly graphs to efficiently extract subgraphs surrounding an inferred genome. We apply this system to recover missing content from genome bins and show that substantial genomic sequence variation is present in a real metagenome. Our software implementation is available at https://github.com/spacegraphcats/spacegraphcats under the 3-Clause BSD License.
Metadata
| Item Type: | Article |
|---|---|
| School: | Birkbeck Faculties and Schools > Faculty of Business and Law > Birkbeck Business School |
| Depositing User: | Felix Reidl |
| Date Deposited: | 29 Jun 2020 09:25 |
| Last Modified: | 02 Aug 2023 18:00 |
| URI: | https://eprints.bbk.ac.uk/id/eprint/32137 |