2006 Scientific Session Abstracts

 

Convergence of College of American Pathologists (CAP) Protocol Model and North American Association of Central Cancer Registries (NAACCR) Elements for the Development and Deployment of Common Data Elements: An Emerging Standard for Hematology and Colorectal Bio-repository

Sambit K. Mohanty MD, DNB  (mohantys2@upmc.edu);  Ashokkar A. Patel, MD; Harpreet Singh, MS;  Michael J. Becich, MD, PhD;  Anil Parwani, MD, PhD; Department of Biomedical Informatics, University of Pittsburgh Cancer Institute, Pittsburgh, PA

Content: The rise of molecular biology and systems biology in medicine is driving development of bio-repositories that can provide highly annotated tissue samples for translational and clinical research. Clinical annotation, the association of clinical, pathologic and outcomes data with tissue samples, is central to the success of these repositories: it allows samples to be better matched to the research question at hand and experimental results to be better understood and verified. To facilitate and standardize clinical annotation in bio-repositories, we have combined two accepted and complementary sets of data standards: the College of American Pathologists (CAP) and World Health Organization (WHO)-compliant synoptic template (for pathology data) and the core NAACCR Cancer Registry elements (for disease description, therapy and follow-up). By combining and extending these common approaches, which are or soon will be mandated at most cancer centers, one can create an easily implemented core set of common data elements (CDEs) for oncology tissue banking.

Goal: The purpose of the project is to develop a core set of clinical tissue annotation data elements for hematolymphoid, colorectal and pancreatic (exocrine as well as endocrine) neoplasms based on the CAP-based, WHO-compliant synoptic template and the NAACCR elements. We will associate these elements with a standard vocabulary system and formalize them using a modeling architecture to enhance both syntactic and semantic interoperability. The system will be implemented in a tissue banking database system.

Technology: We have combined the appropriate synoptic template and NAACCR data elements to form a core set for clinical annotation of hematopoietic and lymphoid neoplasm samples. These elements include information on demographics, clinical history (including therapy), pathology (at both the specimen and block level) and outcome, including recurrence and vital status. The system has been implemented in a database application with a three-tiered architecture. The database is Oracle 9.2.0.1 Enterprise Edition on a SunFire V880 Server running Solaris 2.8; the middle tier is the Oracle Application Server (v 9.0.2) on a Compaq DL360 Server running Win2K. The application uses the Oracle HTTP Server and mod_plsql extensions to generate dynamic pages from the database and deliver them to users.
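The grouping of elements described above (demographics, clinical history, pathology at the specimen and block level, and outcome) can be illustrated with a minimal sketch. This is not the schema of the implemented system: every class and field name below is a hypothetical placeholder chosen for illustration, and the actual CDEs are defined by the CAP synoptic template and the NAACCR elements.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# All class and field names are hypothetical; they only mirror the
# categories of annotation named in the abstract.
@dataclass
class Demographics:
    age_at_diagnosis: int
    sex: str


@dataclass
class ClinicalHistory:
    prior_therapies: List[str] = field(default_factory=list)


@dataclass
class BlockAnnotation:
    block_id: str
    diagnosis: str


@dataclass
class Pathology:
    specimen_id: str
    histologic_type: str
    # Block-level annotation hangs off the specimen-level record.
    blocks: List[BlockAnnotation] = field(default_factory=list)


@dataclass
class Outcome:
    vital_status: str
    recurrence: Optional[bool] = None  # None means not yet recorded


@dataclass
class AnnotatedCase:
    case_id: str
    demographics: Demographics
    history: ClinicalHistory
    pathology: Pathology
    outcome: Outcome

    def block_ids(self) -> List[str]:
        """Return the identifiers of all annotated blocks for this case."""
        return [b.block_id for b in self.pathology.blocks]


# Illustrative record with made-up values.
example_case = AnnotatedCase(
    case_id="CASE-001",
    demographics=Demographics(age_at_diagnosis=61, sex="F"),
    history=ClinicalHistory(prior_therapies=["chemotherapy"]),
    pathology=Pathology(
        specimen_id="SP-1",
        histologic_type="follicular lymphoma",
        blocks=[
            BlockAnnotation(block_id="A1", diagnosis="tumor"),
            BlockAnnotation(block_id="A2", diagnosis="normal"),
        ],
    ),
    outcome=Outcome(vital_status="alive", recurrence=False),
)
```

Keeping block-level records nested under the specimen reflects the statement that pathology is annotated at both levels; a relational implementation would more likely use separate tables linked by keys.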

Results: We have developed the CDEs using vocabulary standards, ontology and semantic modeling methodology. The CDEs recorded for each case are of five different types, including demographic data, clinical history, pathology data with block-level annotation, and clinical outcome data covering treatment, recurrence and vital status. These CDEs have been fully implemented in the Tissue Bank.

Conclusions: The synoptic template and the NAACCR core elements represent widely established data elements that are used (and often mandated) in many cancer centers. In this work we have shown that the two representations can be combined, formalized and “re-used” to create a core set of clinical annotation for banked hematolymphoid, colorectal and pancreatic neoplasm specimens. Because these data elements are collected as part of the normal workflow of a medical center (and because cancer registries continue to update NAACCR elements over time), data sets developed on the basis of these elements can be easily implemented and maintained. The current set of elements is now running in a database system, and we are considering mechanisms to establish these elements as an ISO/IEC-compliant data standard.