Analysis of strand-specific RNA-seq data using machine learning reveals the structures of transcription units in Clostridium thermocellum

Nucleic Acids Research
Wen-Chi Chou, Ying Xu

Abstract

Identifying the transcription units (TUs) encoded in a bacterial genome is essential to elucidating the organism's transcriptional regulation. To gain a detailed understanding of the dynamically composed TU structures, we used four strand-specific RNA-seq (ssRNA-seq) datasets collected under two experimental conditions to derive the genomic TU organization of Clostridium thermocellum with a machine-learning approach. Our method accurately predicted the genomic boundaries of individual TUs based on two sets of parameters measuring RNA-seq expression patterns across the genome: expression-level continuity and variance. A total of 2590 distinct TUs were predicted from the four RNA-seq datasets, 44% of which contain multiple genes. We assessed the prediction method on an independent set of RNA-seq data with longer reads, and the evaluation confirmed the high quality of the predicted TUs. Functional enrichment analyses on a selected subset of the predicted TUs revealed interesting biology. To demonstrate the generality of the method, we also applied it to RNA-seq data collected on Escherichia coli and achieved high prediction accuracies. The TU prediction program named SeqTU is pub...
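The abstract names two families of features, expression-level continuity and variance, without giving their exact definitions. The Python sketch below is one plausible way to compute such features from per-base strand-specific read depth around a single intergenic gap. It is an illustration under stated assumptions, not SeqTU's actual implementation: the function name `gap_features`, the three returned keys, and the specific formulas are all hypothetical.

```python
from statistics import mean, pvariance


def gap_features(cov_a, cov_gap, cov_b):
    """Summarize same-strand read depth around one intergenic gap.

    cov_a, cov_gap, cov_b are lists of per-base read depth over the
    upstream gene, the intergenic region, and the downstream gene.
    All definitions here are illustrative assumptions, not SeqTU's.
    """
    if not (cov_a and cov_gap and cov_b):
        raise ValueError("all three coverage lists must be non-empty")

    # Continuity (assumed definition): fraction of gap bases that are
    # covered by at least one read.
    continuity = sum(1 for depth in cov_gap if depth > 0) / len(cov_gap)

    # Relative gap depth (assumed): mean gap depth over mean flank depth.
    flank_mean = mean(cov_a + cov_b)
    gap_ratio = mean(cov_gap) / flank_mean if flank_mean > 0 else 0.0

    # Variance (assumed): population variance of depth over the whole
    # span, divided by the squared mean so it is scale-free.
    span = cov_a + cov_gap + cov_b
    span_mean = mean(span)
    norm_var = pvariance(span) / span_mean ** 2 if span_mean > 0 else 0.0

    return {"continuity": continuity, "gap_ratio": gap_ratio, "norm_var": norm_var}


# Two toy gene pairs: evenly covered versus a sharp drop after gene A.
same_tu_like = gap_features([10, 12, 11, 9], [8, 9, 10], [11, 10, 12, 9])
split_like = gap_features([10, 12, 11, 9], [0, 0, 1], [1, 0, 2, 1])
```

Feature vectors like these, one per adjacent same-strand gene pair, could then be passed to a binary classifier that decides whether the two genes share a TU. The page lists libSVM among the software mentioned, which suggests a support vector machine, but the truncated abstract does not confirm the exact setup.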

Datasets Mentioned

SRP002548
SRX315218

Methods Mentioned

RNA-seq
ssRNA-seq

Software Mentioned

DOOR
GSMapper
FastQC
libSVM
DSN
SeqClean
Burrows-Wheeler Aligner (BWA)
DAVID
