EARLY DETECTION RESEARCH NETWORK INFORMATION PILOT SYSTEM

Harpreet Singh, MS
John Gilbertson, MD
Center for Pathology Informatics, University of Pittsburgh Medical Center
Pittsburgh, Pennsylvania

Background: The Early Detection Research Network (EDRN) is a national scientific consortium established by the Cancer Biomarkers Research Group, Division of Cancer Prevention, of the National Cancer Institute to develop, evaluate, and validate biomarkers for earlier cancer detection and risk assessment. EDRN consists of over 18 Biomarker Development Laboratories, 3 Biomarker Validation Laboratories, 9 Clinical and Epidemiologic Centers, and a Data Management and Coordinating Center. In 2001, EDRN began an informatics pilot to share information on tissue samples across the EDRN network. Six sites took part in the project, led by the Fred Hutchinson Cancer Research Center (Seattle, WA) and the Jet Propulsion Laboratory (Pasadena, CA).

Goal: The goal of the project was to create a secure web site from which EDRN investigators could query tissue sample information from across the EDRN network. De-identified data would reside locally, at the individual sites, yet queries would be developed and reported centrally at the web site. To make the project work, several challenges needed to be overcome, including the development of common data elements; tools to map the local data elements to the common data elements; query formation and presentation; and communication between the central server and the multiple local databases running on multiple platforms.
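The mapping of local data elements to common data elements can be illustrated with a minimal Python sketch. It is not the EDRN mapping tool: the field names, the common data element names (ORGAN_SITE, AGE_AT_DIAGNOSIS), the code values, and the function to_common are all hypothetical, chosen only to show the idea of renaming local fields and translating local codes.

```python
# Hypothetical mapping for one site: local field -> (common data element, value map).
# None of these names come from the EDRN common data element specification.
SITE_A_MAPPING = {
    "organ": ("ORGAN_SITE", {"LU": "lung", "PR": "prostate"}),
    "dx_age": ("AGE_AT_DIAGNOSIS", None),  # value is passed through unchanged
}


def to_common(record, mapping):
    """Translate one local record into common data elements.

    Local fields with no entry in the mapping are dropped, so they
    never leave the site.
    """
    common = {}
    for local_name, value in record.items():
        if local_name not in mapping:
            continue
        cde_name, value_map = mapping[local_name]
        if value_map is not None:
            # Translate a local code; fall back to the raw value if unknown.
            value = value_map.get(value, value)
        common[cde_name] = value
    return common


local_record = {"organ": "LU", "dx_age": 63, "internal_id": "A-17"}
common_record = to_common(local_record, SITE_A_MAPPING)
```

Here common_record holds only the two mapped elements; the unmapped internal_id field is left behind at the site.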

Design: The keys to the design were agreement on common data elements, the development of data mapping software to translate local data to the common data elements, and the ability of local sites to maintain their own data systems while at the same time being part of a much larger whole. The central component of our infrastructure is a SQL-based central server at the Data Management and Coordinating Center (DMCC) in Seattle, capable of fetching data from multiple site servers and responding quickly to queries from investigators across the EDRN network. The Object Oriented Data Technology (OODT) software for EDRN consists of a series of product servers. A product server is attached to a specific database at a particular site and translates between the central server and the local server. Each product server accepts a query and adds any matching results from its database to the query in a platform-independent format. The results are then compiled at the central server and presented to the user via a web interface.
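The query flow through the product servers can be sketched in Python as follows. This is an illustration under stated assumptions, not the OODT implementation: the ProductServer class, the run_query function, and the record fields are hypothetical, and JSON stands in for the platform-independent format, which the abstract does not name.

```python
import json


class ProductServer:
    """Hypothetical stand-in for a product server attached to one site's database."""

    def __init__(self, site, records):
        self.site = site
        self.records = records  # assumed already mapped to common data elements

    def handle(self, message):
        """Accept a serialized query, append matching records, and return it."""
        query = json.loads(message)
        for record in self.records:
            if all(record.get(k) == v for k, v in query["criteria"].items()):
                query["results"].append({"site": self.site, **record})
        return json.dumps(query)


def run_query(criteria, servers):
    """Central-server side: pass the query to each product server, then compile."""
    message = json.dumps({"criteria": criteria, "results": []})
    for server in servers:
        message = server.handle(message)
    return json.loads(message)["results"]


servers = [
    ProductServer("site_a", [{"ORGAN_SITE": "lung"}, {"ORGAN_SITE": "prostate"}]),
    ProductServer("site_b", [{"ORGAN_SITE": "lung"}]),
]
results = run_query({"ORGAN_SITE": "lung"}, servers)
```

Because each server only sees and returns a serialized message, the central server needs no knowledge of the database or platform behind any individual site.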

Conclusions: The combination of common data elements, data mapping software, and product server software allows the creation of very large information networks from previously unrelated local data systems. There are plans to extend the pilot to include all EDRN sites.