Nov 13, 2014

ENVIRONMENTS and EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life

BioRxiv : the Preprint Server for Biology
Evangelos Pafilis, Lars Juhl Jensen

Abstract

Summary: The association of organisms with their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology terms in text. We evaluate the accuracy of the tagger on a new manually curated corpus of 600 Encyclopedia of Life (EOL) species pages. We use the tagger to associate taxa with environments by tagging EOL text content monthly, and we integrate the results into the EOL to disseminate them to a broad audience of users.

Availability and implementation: The software and the corpus are available under the open-source BSD and the CC-BY-NC-SA 3.0 licenses, respectively, at <http://environments.hcmr.gr>.

Contact: pafilis{at}hcmr.gr; lars.juhl.jensen{at}cpr.ku.dk
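To make the idea of a dictionary-based tagger concrete, the following is a minimal Python sketch of longest-match dictionary lookup over tokenized text. It is an illustration only and not the ENVIRONMENTS implementation: the function name `tag_environments`, the toy dictionary, and the placeholder identifiers are all assumptions introduced here, and the real tagger's matching rules and dictionary are not described in this abstract.

```python
import re

# Hypothetical toy dictionary: lowercase term -> placeholder identifier.
# These identifiers are illustrative only, not real Environment Ontology IDs.
TOY_DICTIONARY = {
    "coral reef": "ENVO:placeholder-1",
    "reef": "ENVO:placeholder-2",
    "freshwater lake": "ENVO:placeholder-3",
    "forest": "ENVO:placeholder-4",
}

TOKEN_RE = re.compile(r"\w+")


def tag_environments(text, dictionary=TOY_DICTIONARY):
    """Return (start, end, matched_text, identifier) tuples.

    Matching is case-insensitive and greedy: at each token position the
    longest dictionary term (counted in tokens) wins. Simplification:
    tokens are joined with single spaces, so punctuation between tokens
    is ignored when comparing against dictionary terms.
    """
    tokens = [(m.start(), m.end(), m.group().lower())
              for m in TOKEN_RE.finditer(text)]
    # Longest term length in tokens; 0 if the dictionary is empty.
    max_len = max((len(term.split()) for term in dictionary), default=0)

    matches = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tok for _, _, tok in tokens[i:i + n])
            if candidate in dictionary:
                start, end = tokens[i][0], tokens[i + n - 1][1]
                matches.append((start, end, text[start:end],
                                dictionary[candidate]))
                i += n  # skip past the matched span
                matched = True
                break
        if not matched:
            i += 1
    return matches
```

In this sketch, calling `tag_environments` on a species description returns character offsets alongside each matched term, which is the kind of structured annotation that could then be attached to a taxon page. Preferring the longest match means "coral reef" is tagged as one term rather than as the shorter "reef".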


