The Anatomic Pathology Laboratory Information System and Paraffin Archive as a Tissue Banking Engine.
Ashokkumar A. Patel MD, Case Western Reserve University; Rajnish Gupta MS, Case Western Reserve University; John Gilbertson MD, Case Western Reserve University; Robert Lanese MS, Case Comprehensive Cancer Center; John Turnbull PhD, Case Comprehensive Cancer Center
Content:
Technical advances have allowed increasingly powerful molecular analysis of the formalin-fixed, paraffin-embedded clinical tissue samples routinely archived by pathology departments. To further enhance our institution's paraffin archive as a translational research resource, we have endeavored to make our laboratory information system (LIS) a more capable source of clinical and pathologic information. A key part of this initiative has been the restructuring of LIS data to allow specimen-based subsections within the large narrative text fields, such as gross description and final diagnosis. This has allowed pathology descriptions and diagnoses to be linked cleanly with histology processing and inventory data at the part, block, and slide level, resulting in an LIS that can act as a massive tissue banking system, with data associated directly with specimens.
Technology:
CoPath Plus (Cerner DHT), PowerBuilder, XML Data warehouse, Perl (CPAN Engine), Eclipse SDK, and Spring Framework.
Design:
Examination of pathology reports in our LIS indicated that most reports at our institution, though narrative, followed consistent patterns, suggesting that legacy reports could be converted to structured data. The narrative text was parsed, transformed into a canonical form, and stored in an XML data warehouse, where each section of the pathology report can be further processed. Researchers will then be able to query de-identified data from the data warehouse through a query interface.
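As an illustration only, the conversion of a narrative report into a canonical XML form might be sketched as follows. This is not the project's actual parser (which the Technology section indicates was built with Perl); it is a minimal Python sketch, and the section header names, the element and attribute names, and the function report_to_xml are all assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical section headers; real reports may use different labels.
SECTION_HEADERS = ["CLINICAL HISTORY", "GROSS DESCRIPTION", "FINAL DIAGNOSIS"]

# Match a known header at the start of a line, followed by a colon.
HEADER_PATTERN = re.compile(
    r"^(%s):[ \t]*" % "|".join(re.escape(h) for h in SECTION_HEADERS),
    re.MULTILINE,
)


def report_to_xml(report_id, text):
    """Split a narrative report on known headers and return an XML element.

    report_id is assumed to be a de-identified key, not a patient identifier.
    """
    root = ET.Element("report", {"id": report_id})
    matches = list(HEADER_PATTERN.finditer(text))
    for i, m in enumerate(matches):
        # A section runs from the end of its header to the next header.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = " ".join(text[m.end():end].split())  # normalize whitespace
        name = m.group(1).lower().replace(" ", "_")
        ET.SubElement(root, "section", {"name": name}).text = body
    return root


sample = (
    "GROSS DESCRIPTION:\n"
    "A. Colon segment.\n"
    "FINAL DIAGNOSIS:\n"
    "A. Adenocarcinoma.\n"
)
root = report_to_xml("R1", sample)
print(ET.tostring(root, encoding="unicode"))
```

Storing each section as its own element is what would let later steps reprocess one section, for example the final diagnosis, without touching the rest of the report.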
Results:
The prototype system includes data from 30,000 pathology reports from 2005 at Case Western Reserve University. The main problem identified was not in the narrative data itself, but in the structure of the reports. In cases with more than one part, the large narrative fields contained information on multiple specimens, making it hard to reliably associate information with a particular specimen in the LIS. To solve this problem, the narrative text was parsed repeatedly to produce consistent specimen identifiers, which were used to create an XML-enforced, specimen-based subsection file. This file was then processed for quality control and linked to other specimen data from histology processing, which made the LIS much more useful as a tissue banking system.
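The specimen-based restructuring described above could look roughly like the following Python sketch. It assumes that each part in a narrative field begins a line with a single capital letter followed by a period, colon, or parenthesis (for example "A."), which may not hold for every report. The function names, the XML element names, and the (part, block_id) inventory tuples are hypothetical and are not taken from the LIS.

```python
import re
import xml.etree.ElementTree as ET

# Assumed convention: a part label is one capital letter plus ".", ":" or ")"
# at the start of a line.
PART_LABEL = re.compile(r"^([A-Z])[.:)]\s+", re.MULTILINE)


def split_by_part(narrative):
    """Return {part_label: text} for a multi-part narrative field."""
    matches = list(PART_LABEL.finditer(narrative))
    parts = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(narrative)
        parts[m.group(1)] = " ".join(narrative[m.end():end].split())
    return parts


def build_specimen_xml(case_id, gross, diagnosis, blocks):
    """Group narrative text and histology blocks under one element per part.

    blocks is a list of (part, block_id) tuples standing in for histology
    inventory data.
    """
    gross_parts = split_by_part(gross)
    dx_parts = split_by_part(diagnosis)
    root = ET.Element("case", {"id": case_id})
    for part in sorted(set(gross_parts) | set(dx_parts)):
        spec = ET.SubElement(root, "specimen", {"part": part})
        ET.SubElement(spec, "gross_description").text = gross_parts.get(part, "")
        ET.SubElement(spec, "final_diagnosis").text = dx_parts.get(part, "")
        for block_part, block_id in blocks:
            if block_part == part:
                ET.SubElement(spec, "block", {"id": block_id})
    return root


gross = (
    "A. Received in formalin is a segment of colon.\n"
    "B. Received in formalin is a lymph node."
)
dx = "A. Colon, resection: adenocarcinoma.\nB. Lymph node: negative."
blocks = [("A", "A1"), ("A", "A2"), ("B", "B1")]
case = build_specimen_xml("C1", gross, dx, blocks)
print(ET.tostring(case, encoding="unicode"))
```

A quality-control pass in this style could, for instance, flag any part that has histology blocks but no matching narrative subsection, or the reverse.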
Conclusion:
We plan to expand the data set to include reports going back as far as 1994 (averaging 30,000 reports/year). By leveraging the LIS, a research database to support tissue banking can be developed in a relatively short time frame (6-9 months) in the environment of a large academic center. Future work will center on integrating the database with existing clinical systems and experimental data sets at Case Comprehensive Cancer Center.
