Apr 28, 2020

L-Regulon: A novel "soft-curation" approach supported by a semantic enriched reading for RegulonDB literature.

BioRxiv : the Preprint Server for Biology
O. W. Lithgow-Serrano, Julio Collado-Vides


Manual curation is a bottleneck in processing the vast amount of knowledge present in the scientific literature and making it available in computational resources, e.g., structured databases. Furthermore, the extraction of content is by necessity limited to the pre-defined concepts, features and relationships that conform to the model inherent in any knowledgebase. These pre-defined elements contrast with the rich knowledge that natural language is capable of conveying. Here we present a novel experiment in what we call "soft curation", supported by a robust natural language processing tool, tuned ad hoc, that quantifies semantic similarity across all sentences of a given corpus of literature. This underlying machinery supports novel ways to navigate and read within individual papers as well as across the papers of a corpus. As a first proof-of-principle experiment, we applied this approach to more than 100 collections of papers, selected from RegulonDB, that support knowledge of the regulation of transcription initiation in E. coli K-12, resulting in L-Regulon (L for "linguistic") version 1.0. Furthermore, we have initiated the mapping of RegulonDB curated promoters to their evidence sentence i...
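To make the idea of quantifying semantic similarity across the sentences of a corpus concrete, here is a minimal sketch. It is not the authors' method: it swaps in plain bag-of-words cosine similarity, which captures only word overlap, and the function names (`tokenize`, `cosine_similarity`, `most_similar`) and the sample sentences are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two sentences."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def most_similar(query, sentences, top_n=3):
    """Rank the other sentences of a corpus by similarity to `query`."""
    scored = [(cosine_similarity(query, s), s) for s in sentences if s != query]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_n]


# Hypothetical sentences, made up for this sketch (not taken from RegulonDB).
corpus = [
    "the regulator activates the promoter",
    "the regulator represses the promoter",
    "cells were grown in minimal medium",
]

for score, sentence in most_similar(corpus[0], corpus):
    print(f"{score:.2f}  {sentence}")
```

Ranking every sentence against a query sentence is what would let a reader jump from one statement to related statements in the same paper or in other papers of the corpus. A real system would presumably replace the word-overlap score with a similarity measure that also recognizes paraphrases.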



