2005 Scientific Session Abstracts
Registry Case Finding Engine (CaFE): An Informatics Tool to Identify Cancer Patients in Electronic Pathology Reports
David A. Hanauer, MD, MS (hanauer@umich.edu) 1, 4; Arul M. Chinnaiyan, MD, PhD 2, 3, 4
Department of Pediatrics 1, Pathology 2 and Urology 3, Comprehensive Cancer Center 4, University of Michigan Medical School, Ann Arbor, Michigan
Context : Identifying patients with cancer is a vital task for any institution required to maintain a tumor registry. At our institution, over 90% of cases are identified by manually reading printed free-text pathology reports, which is very labor- and time-intensive.
Technology : A JavaServer Pages (JSP) based application was created to search electronic pathology reports. Three word/phrase lists were created to aid in the search: (1) words/phrases to ignore completely, (2) negative words/phrases of interest, which are highlighted in green, and (3) positive words/phrases of interest, which are highlighted in red.
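The three-list search described above can be sketched as follows. This is a minimal illustration in Python, not the authors' JSP code: the phrase lists, the function name `annotate`, the order in which the lists are applied, and the rule that a report is "of interest" when any positive phrase survives are all assumptions, since the abstract does not publish them.

```python
import re

# Hypothetical example phrases; the abstract does not list the real ones.
IGNORE_PHRASES = ["history of carcinoma"]
NEGATIVE_PHRASES = ["no evidence of malignancy", "negative for carcinoma", "benign"]
POSITIVE_PHRASES = ["carcinoma", "malignant", "lymphoma"]


def _pattern(phrases):
    """Build one case-insensitive regex matching any phrase as whole words."""
    # Longest phrases first, so a multi-word phrase wins over a shorter one.
    ordered = sorted(phrases, key=len, reverse=True)
    alternatives = "|".join(r"\b" + re.escape(p) + r"\b" for p in ordered)
    return re.compile(alternatives, re.IGNORECASE)


def annotate(report):
    """Return the negative and positive phrases found in one report.

    Assumed order: (1) drop ignored phrases, (2) record and mask negative
    phrases, (3) record positive phrases in whatever text remains.
    """
    text = _pattern(IGNORE_PHRASES).sub(" ", report)

    negative_re = _pattern(NEGATIVE_PHRASES)
    negatives = [m.group(0).lower() for m in negative_re.finditer(text)]
    # Mask negatives so "negative for carcinoma" does not count as "carcinoma".
    text = negative_re.sub(" ", text)

    positives = [m.group(0).lower() for m in _pattern(POSITIVE_PHRASES).finditer(text)]

    # Assumed flagging rule: any surviving positive phrase marks the report.
    return {"negative": negatives, "positive": positives, "of_interest": bool(positives)}
```

In a display layer, the `negative` matches would be the ones rendered in green and the `positive` matches in red. Masking the negative phrases before the positive search is one plausible way to keep a phrase such as "negative for carcinoma" from triggering a positive hit.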
Design : Over 1,000 pathology reports were manually reviewed by our cancer registry managers, and those of interest to the registry were marked. These same reports were re-analyzed using the Registry CaFE. The time required by CaFE and its accuracy were then compared to those of manual curation by the registry managers.
Results : The registry managers read a total of 1,023 pathology reports and marked 272 (27%) as being of interest. CaFE analyzed the same reports and marked 356 (35%) as being of interest. Taking the managers' review as the reference standard, the sensitivity of the program was 100% and the specificity was 89%. The registry managers required nearly a minute per pathology report, whereas CaFE could annotate each report in a fraction of a second.
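The reported sensitivity and specificity can be checked from the counts given above. The sketch below is a reconstruction, not the authors' analysis: it assumes the managers' review is the reference standard and that 100% sensitivity means every one of the 272 manager-marked reports was also among the 356 marked by CaFE. The individual confusion-matrix cells are inferred from those assumptions and are not stated in the abstract.

```python
# Counts taken from the Results text.
total_reports = 1023
manager_marked = 272   # reference-standard positives
cafe_marked = 356      # reports flagged by CaFE

# Assumption: 100% sensitivity, so CaFE missed none of the manager-marked reports.
true_pos = manager_marked
false_neg = 0

# Inferred cells of the confusion matrix.
false_pos = cafe_marked - true_pos                      # flagged, but not of interest
true_neg = total_reports - manager_marked - false_pos   # correctly left unflagged

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)

print(f"false positives: {false_pos}, true negatives: {true_neg}")
print(f"sensitivity: {sensitivity:.1%}, specificity: {specificity:.1%}")
```

Under these assumptions CaFE flagged 84 reports that the managers did not mark, out of 751 reports not of interest, which rounds to the 89% specificity reported.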
Conclusion : A simple text mining algorithm that searches pathology reports for specific words/phrases can significantly reduce the time required to identify new registry cases. This informatics tool may help improve the efficiency of tumor registries across the country.
