APIII - Advancing Practice, Instruction & Innovation Through Informatics

Marriott City Center, Pittsburgh, PA | September 20 - 23, 2009

A Grid-Enabled Image Processing Infrastructure for Large Pathology Images

Umit Vedat Catalyurek, PhD; Metin Gurcan, PhD; Olcay Sertel, MS; Joel Saltz, PhD; Vijay Kumar, MS; Ashish Sharma, PhD; Tony Pan, MS; Berkant Barla Cambazoglu, PhD; Jun Kong, MS (all Ohio State University)

Content:

The size of a pathology image is typically on the order of a few gigabytes, while the storage required for an entire tissue can be on the order of terabytes. This makes processing such images computationally challenging for today's computers. Furthermore, pathology workflows may involve transferring these images, or the outputs derived from them, between multiple geographically distributed sites. Hence, efficient and secure mechanisms are required for processing and communicating such images. For this purpose, we have designed and developed a standards-based infrastructure that uses the service-based grid paradigm.

Technology:

The infrastructure leverages the caGrid middleware, funded by the National Cancer Institute's Cancer Biomedical Informatics Grid (caBIG) project.

Design:

The infrastructure interfaces with two kinds of grid services: data services, which expose pathology image datasets, and analytical services, which execute image processing algorithms. A pathology data service manages remote data and algorithm repositories by providing interfaces to traditional database operations. Users can upload their local data and algorithms to a remote grid site, query the content of the remote repository, and retrieve data and algorithms to their local systems. An analytical service acts as a front end for a parallel application that runs MATLAB at the back end on a high-performance compute cluster. The parallel application is executed using the DataCutter software. The service-based infrastructure provides mechanisms for secure and efficient data transfer between services, which makes complex workflows possible. A client with a built-in image viewer has also been developed for interacting with the services.
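The three data service operations described above (upload, query, retrieve) can be pictured with a minimal local sketch. This is an illustration only and not the caGrid API: the class `InMemoryPathologyDataService`, the `RepositoryItem` record, and the metadata fields are hypothetical names, and an in-memory dictionary stands in for the remote repository.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RepositoryItem:
    """One stored object (hypothetical schema): an image or an algorithm."""
    name: str
    kind: str  # "image" or "algorithm"
    payload: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


class InMemoryPathologyDataService:
    """Local stand-in for a remote data service; NOT the caGrid interface."""

    def __init__(self) -> None:
        self._items: Dict[str, RepositoryItem] = {}

    def upload(self, item: RepositoryItem) -> None:
        # Store (or overwrite) an item under its name.
        self._items[item.name] = item

    def query(self, kind: str, **metadata: str) -> List[str]:
        # Return the sorted names of items of the given kind whose
        # metadata matches every supplied key/value pair.
        return sorted(
            name
            for name, item in self._items.items()
            if item.kind == kind
            and all(item.metadata.get(k) == v for k, v in metadata.items())
        )

    def retrieve(self, name: str) -> RepositoryItem:
        # Fetch a stored item back to the caller; raises KeyError if absent.
        return self._items[name]
```

A client would upload an image with descriptive metadata, query by that metadata to find matching names, and then retrieve the items it needs. In the real infrastructure each of these calls would cross the network under the grid's security mechanisms, which this sketch does not model.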

Results:

The infrastructure has been used successfully for parallel processing of neuroblastoma images. It can also be used to build complex workflows in which analytical services and data services interact: pathology images stored at a data service can be transferred to an analytical service and processed remotely.
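As a rough picture of how one large image might be processed in parallel, the sketch below splits an image into tiles and analyzes the tiles concurrently. Everything here is an assumption made for illustration: the abstract does not state how the work is partitioned, the actual system runs MATLAB under DataCutter on a compute cluster rather than Python threads on one machine, and `mean_intensity` is a placeholder for a real image analysis step. Python threads also do not speed up CPU-bound work, so the sketch shows only the structure of the decomposition.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

# (row_start, row_end, col_start, col_end), end indices exclusive
Tile = Tuple[int, int, int, int]
Image = Sequence[Sequence[float]]


def tile_bounds(height: int, width: int, tile_size: int) -> List[Tile]:
    """Cover a height x width image with tiles, clipping at the edges."""
    return [
        (r, min(r + tile_size, height), c, min(c + tile_size, width))
        for r in range(0, height, tile_size)
        for c in range(0, width, tile_size)
    ]


def mean_intensity(image: Image, tile: Tile) -> float:
    """Placeholder per-tile analysis: average pixel value in the tile."""
    r0, r1, c0, c1 = tile
    values = [image[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(values) / len(values)


def process_in_parallel(image: Image, tile_size: int, workers: int = 4) -> List[float]:
    """Analyze every tile concurrently; results follow tile order."""
    tiles = tile_bounds(len(image), len(image[0]), tile_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda t: mean_intensity(image, t), tiles))
```

Because each tile is analyzed independently, the same decomposition could in principle be handed to cluster nodes instead of local threads, with only the per-tile results returned to the caller.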

Conclusion:

This infrastructure offers three advantages. First, it provides standards-based mechanisms for accessing remote pathology image databases. Second, it allows the large amount of computational resources and network bandwidth offered by the grid to be utilized. Third, it preserves the privacy of patient data through the authentication and authorization mechanisms implemented in the infrastructure.
