2006 Scientific Session Abstracts
Web-based Data Entry and Query Tools to Enhance Translational Research for Colorectal and Pancreatic Neoplasms
Sambit K. Mohanty, MD, DNB, (mohantys2@upmc.edu)1; Ashokkumar A. Patel, MD1; Robert E. Schoen, MD, MPH2; Harpreet Singh, MS1; Rajnish Gupta, MS1; Lynda Dzubinski, BS2; Michael J. Becich, MD, PhD1; Anil V. Parwani, MD, PhD1; 1Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania and 2Division of Gastroenterology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania
Content: The colorectal and pancreatic neoplasm virtual biorepository is a developing bioinformatics-driven system that integrates data from various clinical, pathologic, and molecular systems into one architecture, supported by a set of common data elements, to facilitate basic science, clinical, and translational research. The system is designed for semantic and syntactic interoperability: data elements are developed as metadata (data descriptors) using controlled vocabulary and ontology, which makes the data understandable and sharable for end users and keeps the system flexible. The work also involves creating data entry, data mining, and image storage and analysis tools, as well as a data warehouse for genomic and proteomic data sets.
Technology: This is a Web-based data entry and query tool designed as part of the Organ-Specific Data Warehouse. The annotation warehouse system is built on a three-tiered architecture. The mid-tier is implemented on Oracle's Application Server (v 9.0.2) on a Compaq DL360 server running Win2K with SP. The application uses the Oracle HTTP Server and mod_plsql extensions to generate dynamic pages from the database for users. The database is Oracle 9.2.0.1 Enterprise Edition implemented on a SunFire V880 server running Solaris 2.8.
Design:
1) Development of Common Data Element (CDE):
Common Data Element sets were defined by consensus of the members using the College of American Pathologists (CAP) Protocols and Checklists and the North American Association of Central Cancer Registries (NAACCR) standards. These elements provide a strong backbone of information for the database. In addition, a detailed patient clinical history questionnaire was incorporated into the database to capture standardized, structured data elements.
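As an illustration of what a common data element with a controlled vocabulary might look like, the following is a minimal Python sketch. It is not the system's actual schema: the class name, the field names, and the example element with its permitted values are all hypothetical and are not taken from the CAP checklists or the NAACCR standards.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class CommonDataElement:
    """Metadata descriptor for one data element (hypothetical structure)."""
    name: str
    definition: str
    permitted_values: FrozenSet[str]  # controlled vocabulary for this element

    def validate(self, value: str) -> bool:
        """Return True when the value belongs to the controlled vocabulary."""
        return value in self.permitted_values


# Illustrative element only; the name and value set are assumptions.
HISTOLOGIC_GRADE = CommonDataElement(
    name="histologic_grade",
    definition="Histologic grade of the primary tumor",
    permitted_values=frozenset({"G1", "G2", "G3", "G4", "GX"}),
)


def invalid_fields(record: Dict[str, str],
                   elements: List[CommonDataElement]) -> List[str]:
    """List the elements whose value is missing or outside the vocabulary."""
    return [e.name for e in elements if not e.validate(record.get(e.name, ""))]
```

A data entry form built on such descriptors could reject a free-text value like "high" while accepting "G2", which is one way structured, standardized capture can be enforced.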
2) Data Entry Tool:
The Oracle-based data entry tool is a portable, flexible, and easily mastered Web-based tool for entering data.
3) Data Query Tool:
Researchers can query de-identified data from the warehouse through a point-and-click query environment so that two important things occur:
a) The data elements selected for a given query environment are selected from the warehouse and essentially copied into a data mart.
b) The selected data are then transformed from a relational structure to a dimensionally modeled structure in the data mart. This allows us to create very efficient and secure specialized query environments for our users. The entire system runs in Oracle; the query front ends are rendered in XML from Java Servlets.
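Steps (a) and (b) can be sketched as follows. This is a minimal illustration that uses Python's built-in sqlite3 module in place of Oracle; every table name, column name, and sample row is hypothetical, and the star schema shown is only one possible dimensional model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized (relational) warehouse tables, de-identified.
    CREATE TABLE specimen (specimen_id INTEGER PRIMARY KEY,
                           organ TEXT, collected_year INTEGER);
    CREATE TABLE diagnosis (specimen_id INTEGER REFERENCES specimen,
                            histology TEXT, grade TEXT);
    INSERT INTO specimen VALUES (1, 'colon', 2004), (2, 'pancreas', 2005),
                                (3, 'colon', 2005);
    INSERT INTO diagnosis VALUES (1, 'adenocarcinoma', 'G2'),
                                 (2, 'adenocarcinoma', 'G3'),
                                 (3, 'adenoma', 'GX');

    -- Step (a): copy only the selected elements into the data mart.
    CREATE TABLE mart_stage AS
        SELECT s.specimen_id, s.organ, d.histology
        FROM specimen s JOIN diagnosis d USING (specimen_id);

    -- Step (b): reshape into a star schema (two dimensions, one fact table).
    CREATE TABLE dim_organ (organ_key INTEGER PRIMARY KEY, organ TEXT UNIQUE);
    CREATE TABLE dim_histology (histology_key INTEGER PRIMARY KEY,
                                histology TEXT UNIQUE);
    INSERT INTO dim_organ (organ) SELECT DISTINCT organ FROM mart_stage;
    INSERT INTO dim_histology (histology)
        SELECT DISTINCT histology FROM mart_stage;
    CREATE TABLE fact_specimen AS
        SELECT m.specimen_id, o.organ_key, h.histology_key
        FROM mart_stage m
        JOIN dim_organ o ON o.organ = m.organ
        JOIN dim_histology h ON h.histology = m.histology;
""")


def count_by(conn, organ, histology):
    """Count specimens in the mart for one organ/histology combination."""
    row = conn.execute(
        """SELECT COUNT(*)
           FROM fact_specimen f
           JOIN dim_organ o USING (organ_key)
           JOIN dim_histology h USING (histology_key)
           WHERE o.organ = ? AND h.histology = ?""",
        (organ, histology),
    ).fetchone()
    return row[0]
```

Because the mart holds only the copied, selected elements, a query front end built on it never touches warehouse columns that were not selected (here, `grade` and `collected_year`).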
Results: The database is designed around a normalized subset of standard clinical, pathological, outcomes, and molecular descriptors, so that researchers can use the warehouse to make optimal use of tissue specimens and make them available for translational and clinical cancer research. The query tool accesses the central database through a highly constrained “point and click” interface. The data in the warehouse will eventually allow queries on many of the approved data elements. However, the specificity of the data returned will depend on the user’s profile (Public Viewer, Approved Investigator, or Data Manager). The query tool will allow researchers to tap into a very rich resource of biorepository material for biomarker research.
Conclusion: These biorepositories will eventually act as central resources through which researchers can find highly annotated tissue samples. The resource will at no time have access to patient-identified data, and tissue will not be made available to researchers without approval by the IRB and the Scientific Review Committee. Furthermore, by integrating multimodal (pathological, clinical, and molecular) data sets in the annotated tissue repository, creating query and visualization techniques for these diverse data types, and making this information available, the system will provide users with much better selected and characterized tissues for research.
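One way profile-dependent specificity could work is to filter each returned row against a per-profile list of visible fields. The Python sketch below is an assumption for illustration only: the three profile names come from the abstract, but the field names and the mapping of fields to profiles are hypothetical.

```python
from typing import Any, Dict

# Hypothetical mapping from user profile to the fields that profile may see.
VISIBLE_FIELDS = {
    "public_viewer": {"organ", "histology"},
    "approved_investigator": {"organ", "histology", "grade", "stage"},
    "data_manager": {"organ", "histology", "grade", "stage", "specimen_id"},
}


def filter_for_profile(row: Dict[str, Any], profile: str) -> Dict[str, Any]:
    """Return only the fields of a query result row that the profile may see.

    An unknown profile sees nothing, so the default is to deny access.
    """
    allowed = VISIBLE_FIELDS.get(profile, set())
    return {key: value for key, value in row.items() if key in allowed}
```

Under this scheme the same query would return coarse data to a public viewer and progressively more detailed data to an approved investigator or data manager.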
