2006 Scientific Session Abstracts
geWorkbench: An Open-Source Platform for Integrated Genomics
Andrea Califano PhD; Aris Floratos PhD; Manjunath Kustagi PhD; John Watkinson MS (watkin@genomecenter.columbia.edu), Center for Computational Biology and Bioinformatics, Columbia University, New York, NY
Context: A large number of bioinformatics techniques have been developed to serve the needs of biomedical research. The field is moving rapidly, with new and improved approaches appearing frequently. The fast pace of change and the technical sophistication of these approaches create a barrier to adoption for ordinary biologists. The problem is exacerbated by the integrative nature of biomedical research, which often requires combining data from multiple genomic and biomedical databases and applying an array of advanced analysis techniques.
Technology: geWorkbench, the bioinformatics platform of the National Center for the Multi-scale Analysis of Genomic and Cellular Networks (http://magnet.c2b2.columbia.edu), is a stand-alone desktop Java application that provides the user with an integrated suite of genomics tools. It is developed on top of an open-source, extensible framework that enables third parties to contribute improved or alternative tools to the platform.
Design: geWorkbench is a cross-platform Java application framework. Functionality is added to the framework through pluggable components. Typical components are file format filters, analyses, and visualizations. These components are linked by a communication framework for interoperability. Developers can therefore build new functionality without repeating effort on tedious tasks such as reading and writing data or managing the application lifecycle. Components can be selectively plugged into a user's installation, customizing the application for that user's specific needs.
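The component design described above can be illustrated with a minimal Java sketch. It is not geWorkbench's actual API: every name in it (ComponentSketch, EventBus, Component, DataSetLoaded, ResultReady, MeanAnalysis, ResultCollector) is hypothetical, and a simple in-process publish/subscribe bus stands in for the communication framework. The sketch only shows the general idea that an analysis component and a result-consuming component can cooperate without referring to each other directly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Illustrative sketch only; all names are hypothetical, not geWorkbench classes.
public class ComponentSketch {

    /** Hypothetical event announcing that a data set has been loaded. */
    static final class DataSetLoaded {
        final String name;
        final double[] values;
        DataSetLoaded(String name, double[] values) {
            this.name = name;
            this.values = values;
        }
    }

    /** Hypothetical event announcing that an analysis result is available. */
    static final class ResultReady {
        final String description;
        final double value;
        ResultReady(String description, double value) {
            this.description = description;
            this.value = value;
        }
    }

    /** Minimal publish/subscribe bus: handlers are keyed on the exact event class. */
    static final class EventBus {
        private final Map<Class<?>, List<Consumer<Object>>> subscribers = new HashMap<>();

        <T> void subscribe(Class<T> type, Consumer<T> handler) {
            subscribers.computeIfAbsent(type, k -> new ArrayList<>())
                       .add(event -> handler.accept(type.cast(event)));
        }

        void publish(Object event) {
            List<Consumer<Object>> handlers = subscribers.get(event.getClass());
            if (handlers == null) {
                return; // no component is interested in this event type
            }
            // Iterate over a copy so handlers may subscribe while an event is delivered.
            for (Consumer<Object> handler : new ArrayList<>(handlers)) {
                handler.accept(event);
            }
        }
    }

    /** A pluggable component only needs to know how to attach itself to the bus. */
    interface Component {
        void register(EventBus bus);
    }

    /** Analysis component: computes the mean of any data set that is loaded. */
    static final class MeanAnalysis implements Component {
        @Override
        public void register(EventBus bus) {
            bus.subscribe(DataSetLoaded.class, e -> {
                double sum = 0.0;
                for (double v : e.values) {
                    sum += v;
                }
                double mean = e.values.length == 0 ? 0.0 : sum / e.values.length;
                bus.publish(new ResultReady("mean of " + e.name, mean));
            });
        }
    }

    /** Stand-in for a visualization component: it simply collects results. */
    static final class ResultCollector implements Component {
        final List<ResultReady> results = new ArrayList<>();

        @Override
        public void register(EventBus bus) {
            bus.subscribe(ResultReady.class, results::add);
        }
    }

    public static void main(String[] args) {
        EventBus bus = new EventBus();
        ResultCollector collector = new ResultCollector();
        // "Plugging in" a component means registering it with the shared bus.
        Component[] installed = { new MeanAnalysis(), collector };
        for (Component c : installed) {
            c.register(bus);
        }
        bus.publish(new DataSetLoaded("demo", new double[] {1.0, 2.0, 3.0}));
        for (ResultReady r : collector.results) {
            System.out.println(r.description + " = " + r.value);
        }
    }
}
```

In this sketch, customizing an installation amounts to changing which components appear in the `installed` array; a component left out never sees any events, and the remaining components are unaffected.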
Results: Over 40 components have been developed for the framework, covering a wide range of genomics domains. For microarray gene expression analysis, most major file formats and chip types are supported. Many filtering and normalization options are available, and there are links to several annotation sources, including Affymetrix annotations, caBIO pathways and Gene Ontology. Analyses include several differential expression tools, hierarchical clustering, self-organizing maps and regulatory network discovery. Sequence support includes BLAST, pattern discovery, transcription factor mapping, and syntenic region analysis. A wide variety of visualizations accompany these tools. Components to support protein structure visualization and analysis are under development. A complete listing of components as well as detailed project documentation is available at www.geworkbench.org.
The extensibility of the framework allows for the straightforward integration of software from other sources. Many of the components in geWorkbench are the result of integrating quality software from third parties.
Conclusion: geWorkbench enables the delivery of cutting-edge and sophisticated integrated genomics tools to the desktop of regular biologists.
