Presented at the 2000 APIII Conference


UMLS CONCORDANCE FOR HUMAN EMBRYOLOGY

Baltimore VA Medical Center
Baltimore, Maryland
G. William Moore, MD, PhD

Gladys L. G. Alonsozana, MD (1,2), G. William Moore, MD, PhD (1,2,3), Grover M. Hutchins, MD (3)

(1) Departments of Pathology, Baltimore VA Medical Center
(2) University of Maryland School of Medicine
(3) The Johns Hopkins Medical Institutions, Baltimore, MD.

Background: The Unified Medical Language System (UMLS) of the U.S. National Library of Medicine is the world's largest system of medical concepts, with over 700,000 concept unique identifiers (CUIs) and 1.5 million synonyms in the 2000 edition. Human embryology concepts are used in pathology to describe the pathogenesis and morphology of many neoplasms and congenital malformations. We sought to determine how inclusive the UMLS is for concepts in human embryology.

Design: The complete text of Streeter's Developmental Horizons in Human Embryos, as well as related texts, was optically scanned and converted into plain-text files. A frequency distribution of single words, as well as multiple-word terms (collocations), was obtained by the Barrier Word Method, a method employed in automated Medline indexing. Exact matches were made to the synonym field in the UMLS Metathesaurus, and additional approximate matches were obtained manually.
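
The term-extraction and exact-matching steps described in the Design can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the barrier-word list, the sample sentence, the function names, and the placeholder identifiers in the synonym table are all hypothetical, and the small dictionary only stands in for the synonym field of the UMLS Metathesaurus.

```python
import re
from collections import Counter

# Illustrative barrier (stop) words only; the list used in the study is not given.
BARRIER_WORDS = {"the", "of", "and", "a", "an", "in", "is", "are",
                 "to", "by", "with", "from", "at", "on"}


def barrier_word_terms(text):
    """Return candidate terms: maximal runs of consecutive non-barrier words.

    Punctuation and barrier words both act as term boundaries.
    """
    terms = []
    # Split on punctuation first so that no term spans a clause boundary.
    for segment in re.split(r"[^\w\s-]+", text.lower()):
        run = []
        for word in segment.split():
            if word in BARRIER_WORDS:
                if run:
                    terms.append(" ".join(run))
                run = []
            else:
                run.append(word)
        if run:
            terms.append(" ".join(run))
    return terms


def exact_matches(term_counts, synonym_to_cui):
    """Keep only the terms that appear verbatim in the synonym table."""
    return {t: synonym_to_cui[t] for t in term_counts if t in synonym_to_cui}


# Hypothetical sample text and synonym table (placeholder identifiers, not real CUIs).
sample = ("Closure of the neural tube and of the otic vesicle; "
          "the neural tube is in the embryo.")
synonyms = {"neural tube": "CUI-PLACEHOLDER-1",
            "otic vesicle": "CUI-PLACEHOLDER-2"}

counts = Counter(barrier_word_terms(sample))
matched = exact_matches(counts, synonyms)
unmatched = sorted(t for t in counts if t not in matched)
```

Terms left in `unmatched` correspond to the candidates that, in the study, would have gone on to manual approximate matching.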

Results: The input text was 1.26 MB, consisting of 110,314 words, 9,087 of them distinct. There were 5,323 (4.8%) misspellings, i.e., optical scanning errors. Words ranged in frequency from 10,394 occurrences of 'the' to 4,776 words occurring only once, for an average of 12.1 occurrences per distinct word (110,314/9,087). There were 401 collocations with exact or approximate UMLS matches. Among correctly spelled words, there were 48,758 (46.4%) exact matches to a UMLS synonym and 46,250 (44.0%) additional, approximate matches to UMLS CUIs, leaving 9.5% unmatched.
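
The reported percentages can be cross-checked with a few lines of arithmetic. The sketch below assumes that the match percentages are taken over correctly spelled word occurrences (total words minus misspellings); the abstract does not state the denominator explicitly, so that choice is an inference, and the variable names are illustrative.

```python
# Counts as reported in the Results.
total_words = 110_314
distinct_words = 9_087
misspelled = 5_323
exact = 48_758
approximate = 46_250

# Assumed denominator for the match percentages (not stated in the abstract).
correctly_spelled = total_words - misspelled  # 104,991


def pct(part, whole):
    """Percentage of `whole` represented by `part`, unrounded."""
    return 100.0 * part / whole


mean_occurrences = total_words / distinct_words
misspelled_pct = pct(misspelled, total_words)
exact_pct = pct(exact, correctly_spelled)
approximate_pct = pct(approximate, correctly_spelled)
unmatched = correctly_spelled - exact - approximate
unmatched_pct = pct(unmatched, correctly_spelled)
matched_pct = exact_pct + approximate_pct
```

Under this assumption the computed values agree with the reported figures to within rounding, which supports reading "correctly spelled words" as the denominator.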

Conclusion: The results suggest that the UMLS is a highly inclusive concept system for human embryology, with 90.5% of concepts in classic embryology references matched exactly or approximately. However, the UMLS is synonym-poor, and many synonyms must be added manually to accommodate embryology free text. With that caveat, the UMLS appears to be a sufficiently rich concept system for inter-institutional exchange of embryology data.

Related URL: http://www.netautopsy.org/apep00em.htm