APIII - Advancing Practice, Instruction & Innovation Through Informatics

Marriott City Center, Pittsburgh, PA | September 20 - 23, 2009

Presented at the 1999 APIII Conference


TURKISH LANGUAGE ANNOTATION OF AN INTERNET PATHOLOGY IMAGE ARCHIVE

Baltimore Veteran's Administration Medical Center
Pathology and Laboratory Medicine Service
Baltimore, Maryland
G. William Moore, MD, PhD

G. William Moore, MD, PhD (1,2,3); Enver Vardar, MD (4); Yener S. Erozan, MD (3); Fatih Durmusoglu, MD (5)
(1) Pathology and Laboratory Medicine Service, Veterans Affairs Maryland Health Care System, Baltimore, Maryland
(2) Department of Pathology, University of Maryland School of Medicine, Baltimore, Maryland
(3) Department of Pathology, The Johns Hopkins Medical Institutions, Baltimore, Maryland
(4) Department of Pathology, Sosyal Sigortalar Kurumu, Izmir, Turkey
(5) Department of Obstetrics and Gynecology, Marmara University School of Medicine, Istanbul, Turkey

Background: Anatomic pathology images in a large archive must be recoverable both by pathologic diagnosis and by descriptive content. The Image Archive of The Johns Hopkins Autopsy Resource website (JHAR-IA) consists of over five thousand uncopyrighted anatomic pathology images from the Armed Forces Institute of Pathology Electronic Fascicles (AFIP-EF). The images have been computer-indexed in the Unified Medical Language System (UMLS), based upon corresponding English-language image-legends. For Turkish speakers who use English as a second language, it is helpful to annotate these image-legends in Turkish, so that images may be recalled by Turkish keywords.

Design: All words and UMLS concepts in the pathology image-legends of the AFIP-EF posted on the JHAR-IA were mapped to Turkish words or phrases. Turkish is an Altaic language, linguistically unrelated to English, but written in the Roman alphabet with six special characters. Simple noun-phrases, e.g., CARCINOMA OF KIDNEY, were translated into Turkish with the word-rearrangements and noun-inflections required by the rules of Turkish syntax. Indexing software was written in the M language (formerly MUMPS), and display software was written in the Practical Extraction and Report Language (Perl).
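The word-rearrangement and noun-inflection step for a phrase such as CARCINOMA OF KIDNEY can be sketched as follows. This is a minimal illustration in Python, not the authors' M or Perl software: the lexicon entries, the function names, and the restriction to "X OF Y" phrases are all assumptions made for the example. It places the modifier noun first and attaches the third-person possessive suffix to the head noun, choosing the suffix vowel by vowel harmony. Consonant softening, irregular loanwords, and longer phrases are deliberately left out.

```python
# Hypothetical mini-lexicon for illustration; not taken from the JHAR-IA tables.
LEXICON = {
    "CARCINOMA": "karsinom",
    "ADENOMA": "adenom",
    "KIDNEY": "böbrek",
    "LIVER": "karaciğer",
}

VOWELS = "aeıioöuü"

# Four-way vowel harmony: last vowel of the stem -> vowel used in the suffix.
HARMONY = {
    "a": "ı", "ı": "ı",
    "e": "i", "i": "i",
    "o": "u", "u": "u",
    "ö": "ü", "ü": "ü",
}


def possessive_suffix(stem: str) -> str:
    """Return the third-person possessive suffix for a Turkish noun stem.

    Simplified: ignores consonant softening and loanword exceptions.
    """
    last_vowel = next(ch for ch in reversed(stem) if ch in VOWELS)
    vowel = HARMONY[last_vowel]
    # Stems ending in a vowel take a buffer consonant "s" before the suffix vowel.
    return "s" + vowel if stem[-1] in VOWELS else vowel


def translate_of_phrase(phrase: str) -> str:
    """Translate an English 'HEAD OF MODIFIER' phrase into a Turkish noun compound."""
    head, separator, modifier = phrase.partition(" OF ")
    if not separator:
        raise ValueError("expected a phrase of the form 'X OF Y'")
    turkish_head = LEXICON[head]
    turkish_modifier = LEXICON[modifier]
    # Turkish order is modifier first, then the inflected head noun.
    return f"{turkish_modifier} {turkish_head}{possessive_suffix(turkish_head)}"


print(translate_of_phrase("CARCINOMA OF KIDNEY"))
```

Under these assumptions, the head noun "karsinom" ends in a consonant and its last vowel is "o", so the suffix is "u" and the phrase is rendered with the modifier "böbrek" placed before the inflected head.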

Results: There were 5,465 pathology images posted on the JHAR-IA, with image-legends containing 5,364 distinct words and pointing to 3,016 distinct UMLS concepts, ranging in frequency from 5,465 occurrences of four UMLS terms to one occurrence apiece of 875 UMLS terms. Each word from the image-legends was translated as a Turkish annotation. There were 1,992 UMLS terms (66%) that were noun-phrases, prepositional phrases, or other elementary grammatical constructions that could be computer-translated into grammatically correct Turkish.

Conclusion: English is the dominant language of the Internet, but non-native English speakers may be assisted in finding images by means of non-English keywords. The Johns Hopkins Autopsy Resource Image Archive website may be queried on the Internet with either English or Turkish query-words, and displays bilingual annotations.
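Querying one archive with keywords in either language can be pictured as a single inverted index in which each image is filed under both its English legend words and their Turkish annotations. The sketch below is a hypothetical Python illustration, not the JHAR-IA implementation: the image identifiers, legends, annotation table, and function names are invented for the example, and lowercasing uses the default Python rules, which do not handle the Turkish dotted and dotless I correctly.

```python
from collections import defaultdict

# Hypothetical image legends and word annotations, for illustration only.
IMAGES = {
    "img001": "CARCINOMA OF KIDNEY",
    "img002": "ADENOMA OF LIVER",
    "img003": "CARCINOMA OF LIVER",
}
ANNOTATIONS = {
    "CARCINOMA": "karsinom",
    "ADENOMA": "adenom",
    "KIDNEY": "böbrek",
    "LIVER": "karaciğer",
}


def build_index(images, annotations):
    """Map each English legend word and its Turkish annotation to image ids."""
    index = defaultdict(set)
    for image_id, legend in images.items():
        for word in legend.split():
            index[word.lower()].add(image_id)
            turkish = annotations.get(word)
            if turkish is not None:
                index[turkish.lower()].add(image_id)
    return index


def search(index, keyword):
    """Return the sorted image ids filed under an English or Turkish keyword."""
    return sorted(index.get(keyword.lower(), ()))


index = build_index(IMAGES, ANNOTATIONS)
print(search(index, "kidney"))
print(search(index, "karaciğer"))
```

Because both vocabularies share one index, a Turkish keyword retrieves the same images as its English counterpart without a separate translation step at query time.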

Related URL: http://www.netautopsy.org/apep99tk.htm
