The Development of Synoptic Reports from Free Text Content of Archival Pathology Reports in the Anatomic Pathology Laboratory Information System.
John Gilbertson MD; Case Western Reserve University; Doug Hartman MD; Case Western Reserve University; Rajnish Gupta MS; Case Western Reserve University; Robert Lanese MS; Case Comprehensive Cancer Center; John Turnbull PhD; Case Comprehensive Cancer Center; Ashokkumar A. Patel MD; Case Western Reserve University
Content:
Efforts are underway to improve the structure of the information in archival pathology reports so that computer queries can retrieve information used in cancer research. Synoptic reporting is one such structured approach. There is, however, a large body of archived free-text reports that will never be manually transformed into synoptic form. This study will target defined cohorts of legacy cases, comparing focused structured data re-entry with the ability of a computerized system to extract key data items for cancer checklists.
Technology:
SNOMED codes for vocabulary. College of American Pathologists (CAP) Cancer Checklists. Perl tools (CPAN Engine), xPert Pathology (mTuitive, Centerville, MA), CoPath Plus (Cerner DHT, Waltham, MA).
Design:
We have investigated automated methods to create synoptic reports from free-text archival reports. Although these methods are not perfect, they are useful in greatly narrowing the search for specific information related to cancer cases. The methods combine keyword searching with SNOMED codes. We have scored specimen type, tumor site, tumor size, histologic type, and histologic grading. The initial phase of this project will focus only on data represented in pathology reports for primary pancreatic cancer cases and the synoptic data represented in the CAP pancreas cancer checklists.
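As an illustration only, the combination of code-based case selection and keyword searching described above might be sketched in Python as follows. This is not the system used in the study: the code sets are hypothetical placeholders rather than real SNOMED codes, the regular expressions are assumed patterns, and taking the largest stated measurement as the tumor size is a simplifying assumption.

```python
import re

# Hypothetical placeholder codes (NOT real SNOMED codes); a real system
# would load topography and morphology codes from the laboratory's tables.
PANCREAS_SITE_CODES = {"T-PANCREAS"}
MALIGNANT_MORPH_CODES = {"M-ADENOCARCINOMA"}

# Assumed keyword patterns for two checklist items.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(cm|mm)\b", re.IGNORECASE)
GRADE_RE = re.compile(
    r"\b(well|moderately|poorly)\s+differentiated\b", re.IGNORECASE
)


def is_candidate(codes):
    """Select a report if it carries both a site code and a morphology code."""
    codes = set(codes)
    return bool(codes & PANCREAS_SITE_CODES) and bool(codes & MALIGNANT_MORPH_CODES)


def extract_fields(text):
    """Pull tumor size (in cm) and histologic grade out of free text."""
    fields = {"tumor_size_cm": None, "histologic_grade": None}

    # Normalize every measurement to centimeters.
    sizes = []
    for value, unit in SIZE_RE.findall(text):
        cm = float(value) / 10 if unit.lower() == "mm" else float(value)
        sizes.append(cm)
    if sizes:
        # Assumption: the largest stated measurement is the tumor size.
        fields["tumor_size_cm"] = max(sizes)

    match = GRADE_RE.search(text)
    if match:
        fields["histologic_grade"] = match.group(1).lower() + " differentiated"
    return fields
```

A report lacking a measurement or a grade phrase simply yields `None` for that field, which mirrors the situation in which a key data element is not documented in the report.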
Results:
We have developed and tested a computer system that processed all 29,747 pathology reports from 2005 at Case Western Reserve University and automatically selected those with primary pancreatic cancer (23 in all). The correct scoring is as follows: specimen type (13/23 -- 57%), tumor site (13/23 -- 57%), tumor size (14/23 -- 61%), histologic type (18/23 -- 78%), and histologic grading (20/23 -- 87%). Cases were missed by the automated scoring because of transcription and/or formatting errors, as well as a lack of documented information for key data elements within the pathology report; these cases will also be discussed.
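Per-field scoring of this kind can be thought of as comparing each automatically extracted value with a manually abstracted reference value for the same case. The sketch below assumes that setup. The function name, the field names, and the dictionary layout are hypothetical, and it uses no data from the study.

```python
def score_fields(extracted, reference, fields):
    """Count, per field, the cases where the automated value matches the reference.

    extracted / reference: dicts mapping case id -> {field name: value}.
    Returns {field name: (number correct, number of reference cases)}.
    """
    tallies = {field: 0 for field in fields}
    for case_id, ref_values in reference.items():
        auto_values = extracted.get(case_id, {})
        for field in fields:
            auto = auto_values.get(field)
            # A missing (None) automated value is scored as a miss.
            if auto is not None and auto == ref_values.get(field):
                tallies[field] += 1
    total = len(reference)
    return {field: (tallies[field], total) for field in fields}
```

Returning the raw counts alongside the denominator keeps the fraction visible next to any percentage derived from it, in the same form as the results are stated above.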
Conclusion:
We have tabulated the results of individual scores and will discuss the effect of different coding vocabularies and other strategies. We will further process all 2005 reports against all of the other existing CAP cancer checklists.
