Presented at the 2000 APIII Conference
UMLS CONCORDANCE FOR A COMPREHENSIVE PATHOLOGY TEXT
Baltimore VA Medical Center, Baltimore, Maryland
G. William Moore, MD, PhD
John H. Sinard, MD, PhD1, G. William Moore, MD, PhD2,3,4
1Departments of Pathology, Yale University School of Medicine, New Haven, CT
2Baltimore VA Medical Center
3University of Maryland School of Medicine
4The Johns Hopkins Medical Institutions, Baltimore, MD
Background: The Unified Medical Language System (UMLS) of the U.S. National Library of Medicine is the world's largest system of medical concepts. Its year 2000 edition contains over 700,000 concept unique identifiers (CUIs), 1.5 million synonyms, and partial translations into more than twenty languages. We sought to determine how inclusive the UMLS is for concepts in pathology by building a concordance to a popular review text in pathology.
Design: Using the Barrier Word Method, a technique employed in automated Medline indexing, we obtained a frequency distribution of single words and multiple-word terms (collocations) from the electronic text of Sinard's Outlines in Pathology. Exact matches were made against the synonym field of the UMLS Metathesaurus, and additional approximate matches were obtained manually.
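The design can be illustrated with a minimal sketch. It assumes that the Barrier Word Method cuts text at punctuation and at common function words ("barriers") and treats each remaining run of words as a candidate term. The barrier list, the synonym table, and the identifiers CUI_A and CUI_B are hypothetical placeholders, not UMLS content, and the names collocations and match_terms are introduced only for this illustration.

```python
import re
from collections import Counter

# Hypothetical barrier list for illustration; a real list would be much longer.
BARRIER_WORDS = {"of", "the", "a", "an", "and", "or", "in", "with",
                 "is", "are", "to", "by", "for"}

# Toy stand-in for the Metathesaurus synonym field.
# The identifiers are placeholders, not real CUIs.
TOY_SYNONYMS = {
    "squamous cell carcinoma": "CUI_A",
    "lung": "CUI_B",
}


def collocations(text):
    """Return runs of consecutive non-barrier words as candidate terms."""
    terms = []
    # Punctuation also acts as a barrier, so split on it first.
    for segment in re.split(r"[^a-z0-9\s'-]+", text.lower()):
        run = []
        for word in segment.split():
            if word in BARRIER_WORDS:
                if run:
                    terms.append(" ".join(run))
                run = []
            else:
                run.append(word)
        if run:
            terms.append(" ".join(run))
    return terms


def match_terms(text, synonyms):
    """Count candidate terms and separate exact matches from the rest."""
    counts = Counter(collocations(text))
    matched = {term: synonyms[term] for term in counts if term in synonyms}
    unmatched = [term for term in counts if term not in synonyms]
    return counts, matched, unmatched


if __name__ == "__main__":
    sample = "Squamous cell carcinoma of the lung is common."
    counts, matched, unmatched = match_terms(sample, TOY_SYNONYMS)
    print(counts)
    print(matched)
    print(unmatched)
```

In this sketch, the terms left in unmatched correspond to the ones that would go on to manual, approximate matching.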
Results: The input text was 951 KB and contained 120,677 words, of which 11,240 were distinct. Word frequency ranged from 4,037 occurrences of 'of' down to 4,512 words that occurred only once, for an average of 10.7 occurrences per distinct word (120,677/11,240). There were 3,520 distinct collocations with exact or approximate UMLS matches. There were 77,498 (64.2%) exact matches to a UMLS synonym and 33,348 (27.6%) additional, approximate matches to UMLS CUIs, leaving 8.1% of concepts unmatched.
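The reported figures can be checked with a short calculation. This sketch assumes that each percentage is taken over the total count of 120,677 words, which is an inference from the numbers rather than something the abstract states. The variable names are illustrative.

```python
# Counts as reported in the Results.
total_words = 120_677
distinct_words = 11_240
exact_matches = 77_498
approximate_matches = 33_348

# Mean occurrences per distinct word.
mean_occurrences = total_words / distinct_words

# Assumption: whatever is neither an exact nor an approximate match is unmatched.
unmatched = total_words - exact_matches - approximate_matches


def pct(count):
    """Percentage of the total word count, rounded to one decimal place."""
    return round(100 * count / total_words, 1)


if __name__ == "__main__":
    print(round(mean_occurrences, 1))
    print(pct(exact_matches), pct(approximate_matches), pct(unmatched))
    print(pct(exact_matches + approximate_matches))
```

Under that assumption the calculation reproduces 10.7, 64.2%, 27.6%, 8.1% and 91.9%. The combined 91.9% comes from the unrounded sum, which is why it differs by 0.1 from adding the two rounded percentages.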
Conclusion: The results suggest that the UMLS is a highly inclusive concept system for human pathology, matching 91.9% of concepts in a comprehensive pathology outline either exactly or approximately. However, the UMLS is synonym-poor, and many synonyms must be added manually to accommodate pathology free text. Overall, the UMLS appears to be a sufficiently rich concept system for inter-institutional exchange of pathology data.
Related URL: http://www.netautopsy.org/apep00op.htm
