Mar 20, 2008

Velvet: algorithms for de novo short read assembly using de Bruijn graphs

Genome Research
Daniel R Zerbino, Ewan Birney

Abstract

We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25-50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.
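To make the two central terms in the abstract concrete, here is a minimal Python sketch of a k-mer de Bruijn graph and of the N50 statistic. It is an illustration under simplifying assumptions, not Velvet's implementation: it ignores reverse complements, sequencing errors, and read pairs, and the names `build_de_bruijn` and `n50` are hypothetical helpers introduced here.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each k-mer to the set of k-mers that directly follow it in a read.

    Consecutive k-mers overlap by k - 1 characters, which is the defining
    property of a de Bruijn graph over k-mers.
    """
    graph = defaultdict(set)
    for read in reads:
        # A read of length n yields n - k edges between adjacent k-mers.
        for i in range(len(read) - k):
            graph[read[i:i + k]].add(read[i + 1:i + k + 1])
    return dict(graph)


def n50(contig_lengths):
    """Return the length L such that contigs of length >= L together
    cover at least half of the total assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # no contigs


# Toy input: two overlapping reads (made-up sequences, for illustration only).
reads = ["ACGTAC", "GTACGG"]
graph = build_de_bruijn(reads, k=3)
for kmer in sorted(graph):
    print(kmer, "->", sorted(graph[kmer]))
```

In this toy input the k-mer `ACG` has two successors (`CGT` and `CGG`), which is the kind of branch an assembler has to resolve before it can report a single contig.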

Mentioned in this Paper

Severe Acute Respiratory Syndrome
In Silico
Genome
Streptococcus
Bacterial Artificial Chromosomes
Genomics
Sequence Determinations, DNA
Prokaryote
Genome, Bacterial
Genome, Human
