Combining machine learning and homology-based approaches to accurately predict subcellular localization in Arabidopsis.

Plant Physiology
Rakesh Kaundal, Patrick X Zhao

Abstract

A complete map of the Arabidopsis (Arabidopsis thaliana) proteome is clearly a major goal for the plant research community in terms of determining the function and regulation of each encoded protein. Developing genome-wide prediction tools such as for localizing gene products at the subcellular level will substantially advance Arabidopsis gene annotation. To this end, we performed a comprehensive study in Arabidopsis and created an integrative support vector machine-based localization predictor called AtSubP (for Arabidopsis subcellular localization predictor) that is based on the combinatorial presence of diverse protein features, such as its amino acid composition, sequence-order effects, terminal information, Position-Specific Scoring Matrix, and similarity search-based Position-Specific Iterated-Basic Local Alignment Search Tool information. When used to predict seven subcellular compartments through a 5-fold cross-validation test, our hybrid-based best classifier achieved an overall sensitivity of 91% with high-confidence precision and Matthews correlation coefficient values of 90.9% and 0.89, respectively. Benchmarking AtSubP on two independent data sets, one from Swiss-Prot and another containing green fluorescent protei...
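As a rough illustration of two quantities named in the abstract, the sketch below computes an amino acid composition feature vector and a Matthews correlation coefficient from confusion-matrix counts. It is not the AtSubP implementation: the function names, the restriction to the 20 standard residues, and the binary (one compartment versus the rest) form of the coefficient are assumptions made for this example.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids; other symbols (e.g. X, B, Z) are ignored
# here, which is an assumption of this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def matthews_corrcoef(tp, tn, fp, fn):
    """Binary Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    features = amino_acid_composition("MKTAYIAKQR")  # hypothetical sequence
    print(len(features), round(sum(features), 6))
    print(matthews_corrcoef(tp=50, tn=50, fp=0, fn=0))
```

A composition vector of this kind could serve as one input block to a classifier such as a support vector machine, alongside the other feature types the abstract lists.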


