Extraction of Protein-Protein Interaction from Scientific Articles by Predicting Dominant Keywords

BioMed Research International
Shun Koyabu, Takenao Ohkawa

Abstract

A machine learning approach is useful for automatically extracting protein-protein interaction information from scientific articles. A classifier is trained on data represented by several features and decides whether a protein pair in a sentence interacts. Specific keywords that are directly related to interaction, such as "bind" or "interact", play an important role in training classifiers. We call such a keyword a dominant keyword because it affects the capability of the classifier. Identifying dominant keywords is important, but whether a keyword is dominant depends on the context in which it occurs. We therefore propose a method for predicting whether a keyword is dominant for each instance. In this method, a keyword that yields imbalanced classification results is initially assumed to be dominant. Classifiers are then trained separately on the instances with and without the assumed dominant keywords. The validity of the assumption is evaluated from the classification results of these classifiers, and the assumption is updated according to the evaluation. Repeating this process increases the prediction accuracy of the dominant keyword. Our experimenta...
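
The iterative procedure outlined in the abstract can be illustrated with a minimal Python sketch. This is one possible reading of the abstract, not the authors' implementation. Several pieces are assumptions made only for illustration: the toy token-vote classifier (`train`, `predict`), the use of label skew as a stand-in for "imbalanced classification results" (`is_imbalanced`, `threshold`), the update rule that flips an assumption only when exactly one of the two classifiers is correct, and the toy sentences themselves.

```python
from collections import Counter

def train(instances):
    """Toy classifier (assumption): each token votes +1 per positive
    instance and -1 per negative instance it appears in."""
    weights = Counter()
    for tokens, label in instances:
        for t in set(tokens):
            weights[t] += 1 if label else -1
    return weights

def predict(weights, tokens):
    """Predict 'interaction' when the summed token votes are positive."""
    return sum(weights[t] for t in set(tokens)) > 0

def is_imbalanced(instances, keyword, threshold=0.75):
    """Assumed proxy: the labels of instances containing the keyword
    are skewed toward one class by at least `threshold`."""
    labels = [label for tokens, label in instances if keyword in tokens]
    if not labels:
        return False
    pos_rate = sum(labels) / len(labels)
    return max(pos_rate, 1 - pos_rate) >= threshold

def predict_dominant(instances, keyword, max_iter=10, threshold=0.75):
    """Return one flag per instance: is `keyword` assumed dominant there?"""
    has_kw = [keyword in tokens for tokens, _ in instances]
    if not is_imbalanced(instances, keyword, threshold):
        return [False] * len(instances)
    dominant = list(has_kw)  # initial tentative assumption
    for _ in range(max_iter):
        # Train separate classifiers on instances with / without the assumption.
        with_dk = [i for i, d in zip(instances, dominant) if d]
        without_dk = [i for i, d in zip(instances, dominant) if not d]
        clf_dk, clf_other = train(with_dk), train(without_dk)
        updated = []
        for (tokens, label), kw, d in zip(instances, has_kw, dominant):
            if not kw:
                updated.append(False)
                continue
            ok_dk = predict(clf_dk, tokens) == label
            ok_other = predict(clf_other, tokens) == label
            # Flip the assumption only when exactly one classifier is right.
            if ok_dk != ok_other:
                updated.append(ok_dk)
            else:
                updated.append(d)
        if updated == dominant:  # assumptions are stable; stop iterating
            break
        dominant = updated
    return dominant

# Hypothetical, hand-made instances: (tokens, has_interaction)
toy = [
    (["protA", "bind", "protB"], True),
    (["protA", "bind", "protB", "directly"], True),
    (["protA", "bind", "protB", "strongly"], True),
    (["protA", "bind", "protB", "tightly"], True),
    (["protA", "not", "bind", "protB"], False),
    (["protA", "and", "protB", "expressed"], False),
    (["protA", "and", "protB", "measured"], False),
]
result = predict_dominant(toy, "bind")
print(result)
```

On this toy data, "bind" stays flagged as dominant in the four positive sentences and is dropped for the negated sentence, which mirrors the abstract's point that dominance depends on the context of each instance.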

Methods Mentioned

feature extraction

Software Mentioned

AImed
DK
BioInfer
MC
