Monday, April 23, 2012

Transforming metadata records for small collections into OAIDC for import into DLG union catalog

America's Turning Point is a collaborative project that brings together the Civil War resources of three Georgia institutions--Atlanta History Center, Georgia Historical Society, and the Hargrett Rare Book and Manuscript Library at UGA. Each institution will maintain access to its own finding aids (which will link to the scanned documents), but the project will also bring together descriptive records for all three institutions' digitized materials in the DLG union catalog, http://dlg.galileo.usg.edu.

One of the basic requirements of the NHPRC's Digitizing Historical Records program is that funded projects re-use existing descriptive metadata. All three institutions already have EAD-encoded finding aids and MARC records. These will be used as the basis for the Dublin Core records required by the DLG union catalog.

Many of the collections to be included in the project are small, consisting of a single folder. Dublin Core records for these collections can be created easily by transforming the MARC records into DC. For those in Archivists' Toolkit (AT), which covers AHC and Hargrett, we export MARC21slim XML records from AT, run them through a basic clean-up script, and transform them with a modified version of the MARC21slim2OAIDC XSLT stylesheet from the Library of Congress.

The basic clean-up script takes care of a few simple issues:

  • When exporting MARC records from AT, the root element is <collection>. We want it to be <record>.
  • We create an 856 field and normalize the collection names to correspond with our naming scheme.
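
The first two clean-up steps can be sketched roughly as follows. This is not the script we actually run, only a minimal Python illustration using the standard library. The function name clean_up, the sample record, the example URL, and the 856 indicator values are all assumptions made for the example, and the collection-name normalization is left out because it depends on the local naming scheme.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", MARC_NS)

# Hypothetical, heavily trimmed stand-in for an AT export.
SAMPLE = (
    '<collection xmlns="http://www.loc.gov/MARC21/slim"><record>'
    '<datafield tag="245" ind1="0" ind2="0">'
    '<subfield code="a">Example letter</subfield>'
    '</datafield></record></collection>'
)


def clean_up(marc_xml: str, object_url: str) -> str:
    """Return a single <record> with an 856 field appended."""
    root = ET.fromstring(marc_xml)
    # AT wraps the export in <collection>; promote the first <record> to root.
    if root.tag == f"{{{MARC_NS}}}collection":
        record = root.find(f"{{{MARC_NS}}}record")
        if record is None:
            raise ValueError("no <record> inside <collection>")
    else:
        record = root
    # Append an 856 field holding the URL (indicator values are assumed).
    field = ET.SubElement(
        record,
        f"{{{MARC_NS}}}datafield",
        {"tag": "856", "ind1": "4", "ind2": " "},
    )
    subfield = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": "u"})
    subfield.text = object_url
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(clean_up(SAMPLE, "http://example.org/item/1"))
```

The output is a stand-alone <record> element that can be handed to the stylesheet.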

The modified XSLT stylesheet adds new fields (<dc:contributor>, <dc:publisher>, <dc:date>, and a <dc:description> with sponsorship details). It also creates <dc:rights>, <dc:source>, and <dc:coverage.temporal> fields. Even after transformation into OAIDC, we do a minor amount of tweaking--adding <dc:coverage.spatial> and correcting punctuation in some fields.
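
Part of the post-transformation tweaking could be scripted. The sketch below is an assumption about how one might add a <dc:coverage.spatial> element to an OAIDC record with Python's standard library; it is not how we do it in practice, and the function name add_spatial_coverage and the place value passed in are hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)


def add_spatial_coverage(oaidc_xml: str, place: str) -> str:
    """Append <dc:coverage.spatial> with the given place, unless already present."""
    record = ET.fromstring(oaidc_xml)
    tag = f"{{{DC_NS}}}coverage.spatial"
    # Compare against existing values so that re-running does not duplicate.
    existing = [el.text for el in record if el.tag == tag]
    if place not in existing:
        ET.SubElement(record, tag).text = place
    return ET.tostring(record, encoding="unicode")
```

Running the function twice on the same record leaves a single element, which makes it safe to re-run over a batch.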

Once the OAIDC records have been finalized, we use the DLG's importer to add the records to our union catalog. At the same time, using a local Perl script, we create a tab-delimited file for importing the digital objects into AT. The script captures the following data:

  • DigitalObjectID
  • dateExpression
  • objectType (provided by user)
  • title
  • uri
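
Our script is written in Perl, but the idea can be illustrated with a short Python sketch. The mapping here is an assumption made for the example: dateExpression comes from <dc:date>, title from <dc:title>, uri from <dc:identifier>, and the DigitalObjectID and objectType are supplied by the caller. The function names are hypothetical as well.

```python
import csv
import io
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"
COLUMNS = ["DigitalObjectID", "dateExpression", "objectType", "title", "uri"]


def first_text(record, name):
    """Return the text of the first dc:<name> element, or an empty string."""
    el = record.find(DC + name)
    return (el.text or "").strip() if el is not None else ""


def oaidc_to_row(oaidc_xml, object_id, object_type):
    """Build one row for the import file from a single OAIDC record."""
    record = ET.fromstring(oaidc_xml)
    return {
        "DigitalObjectID": object_id,    # supplied by the caller (assumption)
        "dateExpression": first_text(record, "date"),
        "objectType": object_type,       # provided by the user
        "title": first_text(record, "title"),
        "uri": first_text(record, "identifier"),
    }


def rows_to_tsv(rows):
    """Serialize rows as tab-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Whether AT expects a header row, and in what column order, should be checked against the import template before using anything like this.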
