Statistical principle-based approach for recognizing and normalizing microRNAs described in scientific literature

Database : the Journal of Biological Databases and Curation
Hong-Jie Dai, Wen-Lian Hsu

Abstract

Detecting microRNA (miRNA) mentions in scientific literature lets researchers find relevant literature with queries formulated using miRNA information. Because most published biological studies present signal transduction pathways or genetic regulatory information in figure captions, extracting miRNA mentions from both the main text and the figure captions of a manuscript is useful for aggregate and comparative analysis of published studies. In this study, we present a statistical principle-based miRNA recognition and normalization method that identifies miRNAs and links them to identifiers in the Rfam database. As one of the core components of the text mining pipeline of the database miRTarBase, the proposed method combines the advantages of previous approaches based on patterns, dictionaries and supervised learning, and provides an integrated solution to the problem of miRNA identification. Furthermore, the knowledge learned from the training data is organized in a human-interpretable manner, making it possible to understand why the system considers a span of text to be a miRNA mention, and the represented knowledge can be further complemented by domain expert...
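
The two tasks named in the abstract, recognition (finding miRNA mentions) and normalization (mapping them to database identifiers), can be illustrated with a minimal sketch. This is not the statistical principle-based method the paper proposes; it shows only the simpler pattern-plus-dictionary idea the abstract cites as prior work. The regular expression, the function names, and the `EXAMPLE:` identifiers are all hypothetical placeholders, not real Rfam accessions.

```python
import re

# Hypothetical pattern for common miRNA name shapes, e.g. "miR-21",
# "hsa-miR-155-5p", "let-7a", "microRNA-122". Not taken from the paper.
MIRNA_PATTERN = re.compile(
    r"\b(?:(?P<species>[a-z]{3})-)?"          # optional species prefix, e.g. "hsa-"
    r"(?P<family>miR|miRNA|microRNA|let|lin)"  # family keyword
    r"-?(?P<number>\d+)"                       # numeric part
    r"(?P<variant>[a-z])?"                     # optional letter suffix, e.g. "a"
    r"(?:-(?P<arm>[35]p))?"                    # optional arm, "-3p" or "-5p"
    r"(?!\w)",                                 # must not run into more word characters
    re.IGNORECASE,
)

# Placeholder lexicon: canonical name -> made-up identifier (assumption).
EXAMPLE_LEXICON = {
    "miR-21": "EXAMPLE:0001",
    "let-7a": "EXAMPLE:0002",
}


def canonical_name(match):
    """Collapse spelling variants of one mention into a single canonical key."""
    family = match.group("family").lower()
    prefix = family if family in ("let", "lin") else "miR"
    name = f"{prefix}-{match.group('number')}{(match.group('variant') or '').lower()}"
    if match.group("arm"):
        name += "-" + match.group("arm").lower()
    return name


def find_mirnas(text, lexicon=EXAMPLE_LEXICON):
    """Return (surface form, canonical name, identifier or None) per mention."""
    results = []
    for match in MIRNA_PATTERN.finditer(text):
        name = canonical_name(match)
        results.append((match.group(0), name, lexicon.get(name)))
    return results


if __name__ == "__main__":
    sentence = "Expression of hsa-miR-21 and microRNA-155-5p increased, while let-7a fell."
    for surface, name, identifier in find_mirnas(sentence):
        print(surface, name, identifier)
```

A purely pattern-based recognizer like this one misses unusual spellings and cannot explain or rank ambiguous spans, which is the kind of gap that combining it with learned knowledge is meant to address.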

Software Mentioned

TBL
BioCreative
BeCalm
MiRNA
InfoMap
Information Map (InfoMap)
SPBA
MedPost
miRWalk
