An integrated mass-spectrometry pipeline identifies novel protein coding-regions in the human genome.

PloS One
Danny A Bitton, Crispin J Miller

Abstract

Most protein mass spectrometry (MS) experiments rely on searches against a database of known or predicted proteins, which limits their usefulness as a gene discovery tool. Using a search against an in silico translation of the entire human genome, combined with a series of annotation filters, we identified 346 putative novel peptides [False Discovery Rate (FDR) < 5%] in an MS dataset derived from two human breast epithelial cell lines. A subset of these was then successfully validated by a different MS technique. Two of these correspond to novel isoforms of Heterogeneous Ribonuclear Proteins, while the rest correspond to novel loci. MS technology can therefore be used for ab initio gene discovery in human data and, because it rests on different underlying assumptions, it identifies protein-coding genes not found by other techniques. As MS technology continues to evolve, such approaches will become increasingly powerful.
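To illustrate what an in silico translation of a genomic sequence involves, here is a minimal Python sketch. It is not the authors' pipeline: the abstract does not say how the translation was performed, so the six-frame approach, the standard genetic code, the function names, and the `min_length` cutoff are all assumptions chosen for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# standard nuclear code is the relevant one for this illustration).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate both strands in all three reading frames.

    Keys are hypothetical frame labels: '+1', '+2', '+3', '-1', '-2', '-3'.
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def open_reading_segments(protein, min_length=7):
    """Split a translated frame at stop codons ('*') and keep the segments
    of at least min_length residues (an illustrative cutoff)."""
    return [seg for seg in protein.split("*") if len(seg) >= min_length]


if __name__ == "__main__":
    for frame, protein in six_frame_translation("ATGGCCAAATAG").items():
        print(frame, protein)
```

In a real search, the stop-to-stop segments from every frame would be written to a sequence database and passed to an MS search engine. Matches would then still need filtering against existing annotation, as the abstract describes.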

Datasets Mentioned

GSE19154

Methods Mentioned

in silico methods
chip
Electrophoresis
Reverse Transcription PCR

Software Mentioned

R package
ProteinPilot
BioConductor
Ensembl
BioPerl Ensembl API script
exonmap
Ensembl API
Sequest
fastacmd
Perl script
