Nov 7, 2019

Mash Screen: high-throughput sequence containment estimation for genome discovery

Genome Biology
Brian D. Ondov, Adam M. Phillippy

Abstract

The MinHash algorithm has proven effective for rapidly estimating the resemblance of two genomes or metagenomes. However, this method cannot reliably estimate the containment of a genome within a metagenome. Here, we describe an online algorithm capable of measuring the containment of genomes and proteomes within either assembled or unassembled sequencing read sets. We describe several use cases, including contamination screening and retrospective analysis of metagenomes for novel genome discovery. Using this tool, we provide containment estimates for every NCBI RefSeq genome within every SRA metagenome and demonstrate the identification of a novel polyomavirus species from a public metagenome.
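
The difference between resemblance and containment mentioned above can be illustrated with exact k-mer sets. The following is a minimal sketch under stated assumptions, not Mash Screen's implementation: it uses no MinHash sketching, and the function names, toy sequences, and the choice of k = 4 are all hypothetical and chosen only for illustration.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def resemblance(a, b):
    """Jaccard index of two k-mer sets: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def containment(a, b):
    """Fraction of A's k-mers that also occur in B: |A & B| / |A|."""
    return len(a & b) / len(a) if a else 0.0


# Toy data (assumed for illustration): a short "genome" embedded
# in a longer "metagenome" that also holds unrelated sequence.
genome = "ACGTTGCAAGCT"
metagenome = "TTTTCCCCGGGG" + genome + "AATTCCGGAATT"

k = 4  # illustrative k-mer size, far smaller than in real use
genome_kmers = kmers(genome, k)
metagenome_kmers = kmers(metagenome, k)

r = resemblance(genome_kmers, metagenome_kmers)
c = containment(genome_kmers, metagenome_kmers)
print(f"resemblance = {r:.3f}, containment = {c:.3f}")
```

Because the genome is fully present in the metagenome, every genome k-mer is found and the containment is 1, while the resemblance is pulled down by the unrelated k-mers in the larger set. This is why a resemblance score alone can understate the presence of a genome within a metagenome.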


Mentioned in this Paper

Steroid receptor RNA activator
Genome
Metagenome
Ncbi Taxonomy
Proteome
Screening Generic
Species
Retrospective Studies
Polyomavirus
Drug Discovery
