2005 Scientific Session Abstracts

Region-of-Interest Based Differential Diagnosis Via the Use of Vector Quantization and N-Dimensional Bayesian Voronoi Mapping

Ulysses J. Balis, M.D. (balis@helix.mgh.harvard.edu) Massachusetts General Hospital, Division of Pathology Informatics and Harvard Medical School, Boston, Massachusetts

Context: Recent advances in region-of-interest (ROI) based histopathologic query of established repositories of cataloged image types provide compelling motivation to extend the capabilities of a Vector Quantization (VQ) based inference engine into a fully instantiated differential diagnosis generator.

Technology: The previously reported vector quantization engine was further developed, using C++ version 6.0 (Microsoft Corporation, Redmond, WA). Microscopic images were obtained from a variety of acquisition sources including wide-field imagers and conventional digital cameras.

Design: Ninety-six cases spanning 12 classes of differing diagnoses from four organ systems were scanned to allow for the creation of at least 20 representative fields of interest, at 40x magnification. The value of 20 represents an empirically derived threshold at which VQ vector growth typically exhibits asymptotic behavior, thereby representing a near-full capture of the salient feature set. Vectors from each case were globally tagged with the one or more diagnoses originally reported. No effort was made to spatially constrain component diagnoses to constituent vector subsets or to image locales. The average number of vectors constituting a unique diagnostic class was 755,000, with a standard deviation of 120,000.
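The asymptotic vector growth noted above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a feature vector joins the vocabulary only when no existing entry lies within a fixed Euclidean tolerance; the abstract does not state the actual admission rule, and the names `grow_codebook`, `plateau_index`, and `tolerance` are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def grow_codebook(fields, tolerance):
    """Add vectors field by field; record vocabulary size after each field.

    Assumption: a vector is admitted only if every existing entry is
    farther away than `tolerance` (not a rule taken from the abstract).
    """
    codebook, sizes = [], []
    for field in fields:
        for vector in field:
            if all(squared_distance(vector, c) > tolerance ** 2 for c in codebook):
                codebook.append(vector)
        sizes.append(len(codebook))
    return codebook, sizes


def plateau_index(sizes, max_new=0):
    """Index of the first field that adds at most `max_new` entries, else None."""
    for i in range(1, len(sizes)):
        if sizes[i] - sizes[i - 1] <= max_new:
            return i
    return None
```

Under these assumptions, the field index returned by `plateau_index` plays the role of the empirically derived threshold: once additional fields stop contributing new vectors, the vocabulary is treated as near complete.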

The resultant vector sets were re-sorted in N-space to form optimally clustered N-dimensional Voronoi spaces, thus representing a minimal energy configuration for constitutive spatial elements of uniqueness.
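One common way to obtain such a partition is Lloyd's algorithm, in which each centroid defines a Voronoi cell and the iteration lowers the total within-cell squared distance (the "energy"). The abstract does not name the clustering procedure used, so the following Python sketch is an assumption offered only for illustration; `lloyd_codebook` and its parameters are hypothetical names, and the result is a local rather than guaranteed global optimum.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_index(vector, centroids):
    """Index of the centroid whose Voronoi cell contains `vector`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vector, centroids[i]))


def lloyd_codebook(vectors, initial_centroids, iterations=20):
    """Refine centroids with Lloyd's algorithm (assumed, not from the abstract)."""
    centroids = [list(c) for c in initial_centroids]
    for _ in range(iterations):
        # Assignment step: place each vector in its nearest centroid's cell.
        cells = [[] for _ in centroids]
        for vector in vectors:
            cells[nearest_index(vector, centroids)].append(vector)
        # Update step: move each centroid to the mean of its cell.
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous centroid
                centroids[i] = [sum(axis) / len(cell) for axis in zip(*cell)]
    return centroids
```

The outcome depends on the starting centroids, so in practice several initializations would typically be compared and the lowest-energy configuration retained.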

Twenty additional cases (the unknown set) were subsequently scanned, covering regions that were felt to be diagnostic of one or more of the 12 initial diagnostic classes as well as regions that were felt not to be. Regions of interest were selected from both diagnostic and non-diagnostic fields and applied to the VQ inference engine for identification of the closest Bayesian clusters, representing a putative diagnostic match.
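The matching step can be sketched as a naive Bayes ranking over Voronoi cells. This is an assumption about what "closest Bayesian clusters" entails, not the engine's documented method: each ROI vector is mapped to its nearest centroid, and each diagnosis is scored by its log prior plus the smoothed log likelihood of the occupied cells. The function `rank_diagnoses`, the `cell_counts` layout, and the smoothing constant `alpha` are hypothetical.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_cell(vector, centroids):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)),
               key=lambda i: squared_distance(vector, centroids[i]))


def rank_diagnoses(roi_vectors, centroids, cell_counts, alpha=1.0):
    """Rank diagnoses for an ROI, most probable first.

    cell_counts[dx][i] is the number of training vectors tagged with
    diagnosis dx that fall in cell i (a hypothetical data layout).
    """
    k = len(centroids)
    grand_total = sum(sum(counts) for counts in cell_counts.values())
    cells = [nearest_cell(v, centroids) for v in roi_vectors]
    scores = {}
    for dx, counts in cell_counts.items():
        dx_total = sum(counts)
        score = math.log(dx_total / grand_total)  # log prior P(dx)
        for i in cells:
            # Laplace-smoothed log likelihood P(cell i | dx)
            score += math.log((counts[i] + alpha) / (dx_total + alpha * k))
        scores[dx] = score
    return sorted(scores, key=scores.get, reverse=True)
```

The returned ordering is one way to read a differential diagnosis off the vocabulary: the first entry is the putative match and the remaining entries are the ranked alternatives.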

Results: Eighteen of the twenty cases exhibited diagnostic concordance when diagnostic ROIs were applied to the VQ engine. Concordance degraded to five of eighteen cases when non-diagnostic fields were selected for evaluation, representing realization of the null hypothesis. The average number of constitutive vectors needed to make a match to the established vocabulary was 73, suggesting that diagnostic uniqueness, from a complexity model, is limited to a relatively small set of morphologic features.
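Taking the reported counts at face value, the two concordance rates work out as follows. The short Python snippet below only restates the arithmetic; the helper name `concordance` is hypothetical.

```python
from fractions import Fraction


def concordance(matches, total):
    """Exact proportion of concordant cases."""
    return Fraction(matches, total)


diagnostic = concordance(18, 20)      # diagnostic ROIs
non_diagnostic = concordance(5, 18)   # non-diagnostic fields

print(f"{float(diagnostic):.1%}")      # 90.0%
print(f"{float(non_diagnostic):.1%}")  # 27.8%
```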

Conclusion: VQ methods are suitable for adaptation to the a priori categorization of histological features. It is anticipated that as vocabulary sets are further refined to diagnoses associated with localized spatial vector sets, and not the global vocabulary (as in this study), it will be possible to generate precise, real-time differential diagnosis based on an interactive digital marquee tool.