Jan 31, 2014

SHEAR: sample heterogeneity estimation and assembly by reference

BMC Genomics
Sean R Landman, Vipin Kumar

Abstract

Personal genome assembly is a critical process when studying tumor genomes and other highly divergent sequences. The accuracy of downstream analyses, such as RNA-seq and ChIP-seq, can be greatly enhanced by using personal genomic sequences rather than standard references. Unfortunately, reads sequenced from these types of samples often contain a heterogeneous mix of subpopulations with different variants, making assembly extremely difficult using existing assembly tools. To address these challenges, we developed SHEAR (Sample Heterogeneity Estimation and Assembly by Reference; http://vk.cs.umn.edu/SHEAR), a tool that predicts structural variants (SVs), accounts for heterogeneous variants by estimating their representative percentages, and generates personal genomic sequences to be used for downstream analysis. By making use of structural variant detection algorithms, SHEAR offers improved performance in the form of a stronger ability to handle difficult structural variant types and better computational efficiency. We compare against the leading competing approach using a variety of simulated scenarios as well as real tumor cell line data with known heterogeneous variants. SHEAR is shown to successfully estimate heterogeneity percentages in both ca...


Mentioned in this Paper

Twitter Messaging
Tumor Cells, Uncertain Whether Benign or Malignant
Prostatic Neoplasms
Fluctuation
Repetitive Region
Genome
Tumor Tissue Sample
Sequence Determinations, RNA
Inversion Mutation Abnormality
Genome Assembly Sequence
