Doublet method for very fast autocoding

BMC Medical Informatics and Decision Making
Jules J Berman

Abstract

Autocoding (or automatic concept indexing) occurs when a software program extracts terms contained within text and maps them to a standard list of concepts contained in a nomenclature. The purpose of autocoding is to provide a way of organizing large documents by the concepts represented in the text. Because textual data accumulates rapidly in biomedical institutions, the computational methods used to autocode text must be very fast. The purpose of this paper is to describe the doublet method, a new algorithm for very fast autocoding. An autocoder was written that transforms plain text into overlapping word doublets (e.g. "The ciliary body produces aqueous humor" becomes "The ciliary, ciliary body, body produces, produces aqueous, aqueous humor"). Each doublet is checked against an index of doublets extracted from a standard nomenclature. Each matching doublet is assigned the numeric code specific to that doublet in the nomenclature. Text doublets that do not match the index cannot be part of a valid nomenclature term. Runs of matching doublets from the text are concatenated and matched against nomenclature terms (also represented as runs of doublets). The doublet autocoder was compared f...
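
The steps described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's implementation (the software listed for the paper is Perl). The function names, the dict-based nomenclature, and the codes "C001" to "C003" are hypothetical. The sketch also assumes lowercased, whitespace-delimited words with no punctuation handling, and it returns the term's code rather than assigning a numeric code to each doublet.

```python
def doublets(text):
    """Split text into overlapping word pairs (doublets)."""
    words = text.lower().split()  # assumption: simple whitespace tokenization
    return [f"{a} {b}" for a, b in zip(words, words[1:])]


def build_index(nomenclature):
    """Build the doublet index from a hypothetical dict of term -> code.

    Returns the set of all doublets found in the nomenclature, plus a map
    from each term's run of doublets to its (term, code) pair.
    """
    doublet_index = set()
    term_by_run = {}
    for term, code in nomenclature.items():
        run = tuple(doublets(term))
        if run:  # single-word terms produce no doublets and are skipped here
            doublet_index.update(run)
            term_by_run[run] = (term, code)
    return doublet_index, term_by_run


def autocode(text, doublet_index, term_by_run):
    """Return the (term, code) pairs whose doublet runs occur in the text."""
    found = []
    run = []
    # The trailing None is a sentinel that flushes the final run.
    for d in doublets(text) + [None]:
        if d is not None and d in doublet_index:
            run.append(d)  # extend the current run of matching doublets
            continue
        # A non-matching doublet ends the run; check each contiguous sub-run.
        for i in range(len(run)):
            for j in range(i + 1, len(run) + 1):
                hit = term_by_run.get(tuple(run[i:j]))
                if hit:
                    found.append(hit)
        run = []
    return found


# Hypothetical three-term nomenclature for demonstration.
nomenclature = {
    "ciliary body": "C001",
    "aqueous humor": "C002",
    "squamous cell carcinoma": "C003",
}
index, runs = build_index(nomenclature)
print(autocode("The ciliary body produces aqueous humor", index, runs))
```

Because any doublet missing from the index ends a run immediately, most of the text is dismissed with a single set lookup per doublet, and only short runs are compared against full nomenclature terms.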



Software Mentioned

Neoself
Unix
doubcode
Perl
Active
Linux
doubcomp
