APIII - Advancing Practice, Instruction & Innovation Through Informatics

Marriott City Center, Pittsburgh, PA | September 20 - 23, 2009

Attaching metadata to medical documents: a proposed standard using RDF with OpenDocument

Mary F Kennedy, MPH, College of American Pathologists; John F Madden, MD, PhD, Duke University

Content:

Computer-processable medical vocabularies are of no practical use without a way to associate the metadata they represent with actual medical documents.

Technology:

The OpenDocument Format (ODF) is a rich, open, XML-based document file standard maintained by the Organization for the Advancement of Structured Information Standards (OASIS). It is widely supported by freeware and commercial office suites and by web-embeddable editing software.

Design:

The OpenDocument Metadata Working group has presented a draft proposal for a flexible and extensible metadata embedding scheme for the upcoming version (v 1.2) of the OpenDocument standard. Metadata will be expressed in Resource Description Framework (RDF). This choice provides a well-characterized semantic model, allows a document's metadata to reference any entity or concept representable as a URI, is especially suitable for incorporating vocabularies expressed in RDF Schema or Web Ontology Language (OWL), and makes it possible to combine terms from multiple vocabularies in the same document's metadata.
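As a rough illustration of the RDF model described above, the following sketch represents a document's metadata as subject-predicate-object triples whose terms come from two vocabularies at once. It uses only the Python standard library. The Dublin Core namespace is real; the `medvocab` namespace, the document URI and the helper names are hypothetical placeholders and are not taken from the draft proposal or from SNOMED CT.

```python
# Minimal sketch: RDF-style triples mixing terms from two vocabularies.
# DC is the Dublin Core element set. EX and DOC are hypothetical placeholders
# standing in for a medical vocabulary and a report URI.
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/medvocab#"
DOC = "http://example.org/reports/report-1.odt"


def triple(subject, predicate, obj, literal=False):
    """Build one triple; `literal` marks the object as a plain string."""
    return (subject, predicate, obj, literal)


def to_ntriples(triples):
    """Serialize triples as N-Triples-style lines, sorted for stable output."""
    lines = []
    for s, p, o, literal in sorted(triples):
        if literal:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = '"%s"' % escaped
        else:
            obj = "<%s>" % o
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)


# One statement uses Dublin Core, the other the hypothetical medical vocabulary.
metadata = {
    triple(DOC, DC + "title", "Surgical pathology report", literal=True),
    triple(DOC, EX + "hasFinding", EX + "Adenocarcinoma"),
}

print(to_ntriples(metadata))
```

Because every term is a URI, adding a third vocabulary would only mean adding triples with another namespace prefix; no schema change is needed.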

Results:

Two embedding mechanisms will be available. A lightweight syntax, similar to the XHTML 2.0 RDF attribute syntax, will make RDF triples extractable from ODF content by attaching them to defined attributes on existing ODF-XML structural elements. An alternate heavyweight syntax will allow users to attach any number of RDF files to an ODF package, indexed by a metadata manifest file that records the relation of the document metadata to external entities. The latter syntax should be of special interest to medical users because it enables a form of Named Graph support: users can record the provenance, trust level or confidentiality of the metadata, and can enumerate alternate, possibly inconsistent interpretations (opinions) about the document's content as separate RDF graphs.
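The Named Graph idea behind the heavyweight syntax can be sketched in a few lines. In this illustration each dictionary key stands for one RDF file in the package, and a separate manifest makes statements about those files rather than about the document. The file paths, the `medvocab` terms and the function names are all hypothetical and do not reproduce the manifest format of the draft.

```python
# Minimal sketch of named graphs: one graph per metadata file, plus a
# manifest that describes the graphs themselves. All names are hypothetical.
EX = "http://example.org/medvocab#"
DOC = "http://example.org/reports/report-1.odt"

# Two pathologists record inconsistent opinions in separate graphs.
graphs = {
    "meta/pathologist-a.rdf": {(DOC, EX + "hasDiagnosis", EX + "Adenocarcinoma")},
    "meta/pathologist-b.rdf": {(DOC, EX + "hasDiagnosis", EX + "Dysplasia")},
}

# Provenance and confidentiality are stated about each graph, not the document.
manifest = {
    ("meta/pathologist-a.rdf", EX + "assertedBy", "Pathologist A"),
    ("meta/pathologist-a.rdf", EX + "confidentiality", "restricted"),
    ("meta/pathologist-b.rdf", EX + "assertedBy", "Pathologist B"),
    ("meta/pathologist-b.rdf", EX + "confidentiality", "normal"),
}


def graphs_where(manifest, predicate, value):
    """Return the names of graphs whose manifest entry matches predicate/value."""
    return sorted(g for g, p, v in manifest if p == predicate and v == value)


def diagnoses_by_source(graphs, manifest):
    """Group diagnoses by who asserted them, keeping opinions separate."""
    result = {}
    for graph_name, predicate, source in manifest:
        if predicate == EX + "assertedBy":
            result[source] = sorted(
                o for _, p, o in graphs[graph_name] if p == EX + "hasDiagnosis"
            )
    return result


print(graphs_where(manifest, EX + "confidentiality", "restricted"))
print(diagnoses_by_source(graphs, manifest))
```

Keeping each opinion in its own graph means conflicting diagnoses never have to be merged into one set of statements, and access rules can be applied per graph.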

Conclusion:

We will illustrate these capabilities with sample medical documents, among them surgical pathology reports marked up with standard medical vocabularies such as SNOMED CT. We will also show how this model allows documents and document sections to carry information about their genre and their contained assertions without the specialized hierarchies of concrete XML structural elements that tend to make other proposed medical document formats less usable by non-specialized software.
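The point about generic structural elements can be made concrete with a small sketch. Here ordinary paragraph elements carry RDFa-like attributes, and a short function recovers triples from them with the Python standard library. The attribute names (`about`, `property`, `content`) are modelled on the XHTML attribute syntax and are an assumption; the final standard may name them differently. The `medvocab` terms and the sample text are hypothetical.

```python
# Minimal sketch: pull triples out of RDFa-like attributes on generic
# paragraph elements. Attribute names and vocabulary terms are assumptions.
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

SAMPLE = """\
<text:section
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <text:p xhtml:about="#dx"
          xhtml:property="http://example.org/medvocab#genre"
          xhtml:content="FinalDiagnosis">Invasive adenocarcinoma.</text:p>
  <text:p xhtml:about="#gross"
          xhtml:property="http://example.org/medvocab#description">Tan nodule.</text:p>
  <text:p>Paragraph with no metadata.</text:p>
</text:section>
"""


def extract_triples(xml_text):
    """Return (subject, predicate, value) for every annotated element."""
    root = ET.fromstring(xml_text)
    triples = []
    for element in root.iter():
        about = element.get("{%s}about" % XHTML)
        prop = element.get("{%s}property" % XHTML)
        if about is None or prop is None:
            continue  # unannotated elements stay ordinary document content
        value = element.get("{%s}content" % XHTML)
        if value is None:
            # Fall back to the element's own text as the literal value.
            value = "".join(element.itertext()).strip()
        triples.append((about, prop, value))
    return triples


for t in extract_triples(SAMPLE):
    print(t)
```

Software that knows nothing about the annotations still sees three plain paragraphs, which is the usability argument made above.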
