A Union Database of EAD Finding Aids for Archival Resources
III. METHODOLOGY AND STANDARDS

A. History and Development of Encoded Archival Description

Several years ago, the Library at the University of California at Berkeley received funds from the Department of Education to investigate the desirability and feasibility of developing an encoding standard for electronic versions of archival finding aids. The study was inspired by a recognition that archival repositories wished to expand and enhance network access to information about their holdings beyond that available in MARC catalog records, and that efforts to do so would likely be more successful if they were coordinated and standards-based. In consultation with a number of archivists who had expressed an interest in the Berkeley project, Daniel Pitti, principal investigator, identified a number of requirements that would need to be satisfied by any technique used to deliver expanded and enhanced archival description to network users. These include the ability to present extensive and interrelated descriptive information typically found in archival finding aids; the ability to preserve the hierarchical relationships that exist between levels of descriptive detail; the ability to represent descriptive information that is inherited by one hierarchical level from another; the ability to navigate within a hierarchical information architecture; and, the ability to conduct element-specific indexing and retrieval. After experimenting with a number of techniques, Pitti and his colleagues at Berkeley elected to test the use of Standard Generalized Markup Language (SGML) in encoding archival finding aids. This approach had the advantages of using an international standard (ISO 8879); of being able to meet all of the functional requirements of archival finding aids; and of being supported by a large and growing number of software products that run on a variety of computer platforms. 
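The requirements listed above (preserving hierarchy, inheriting description from a parent level, and element-specific retrieval) can be illustrated with a minimal sketch. The fragment below is invented for illustration: the element names (`c01`, `c02`, `did`, `unittitle`, `unitdate`) follow EAD conventions, but the sample content and the helper `describe` are assumptions, and the sketch uses XML parsing in Python rather than any SGML toolchain used at Berkeley.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical finding-aid fragment. The content is invented.
SAMPLE = """
<dsc>
  <c01 level="series">
    <did><unittitle>Correspondence</unittitle><unitdate>1850-1870</unitdate></did>
    <c02 level="file">
      <did><unittitle>Letters to family</unittitle></did>
    </c02>
    <c02 level="file">
      <did><unittitle>Business letters</unittitle><unitdate>1865</unitdate></did>
    </c02>
  </c01>
</dsc>
"""

def describe(component, inherited_date=None, depth=0, out=None):
    """Walk the component hierarchy, passing the parent's date down to
    children that do not state their own."""
    if out is None:
        out = []
    did = component.find("did")
    title = did.findtext("unittitle")
    date = did.findtext("unitdate") or inherited_date
    out.append((depth, title, date))
    for child in component:
        # Nested components are named c01, c02, ... in this sample.
        if child.tag.startswith("c0"):
            describe(child, date, depth + 1, out)
    return out

root = ET.fromstring(SAMPLE)
rows = describe(root.find("c01"))
for depth, title, date in rows:
    print("  " * depth + f"{title} ({date})")

# Element-specific retrieval: search only within <unittitle> elements.
hits = [t.text for t in root.iter("unittitle") if "letters" in t.text.lower()]
print(hits)
```

In this sketch the first file-level component has no date of its own, so it takes the series date, while the second keeps the date it states explicitly.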
Pitti undertook development of a Document Type Definition (DTD, an SGML structural requirement) for finding aids by analyzing numerous examples forwarded to Berkeley from Duke and other institutions that had responded to requests for cooperation. From this analysis he defined a model finding aid structure that formed the basis of his draft DTD for the Berkeley Finding Aid Project (BFAP). Subsequent meetings, most notably the March 1995 Berkeley Finding Aid Conference (supported in part by the Commission on Preservation and Access), led the library and archival profession to the conclusion that SGML encoding of local and networked online finding aids could simplify, improve, and expand access to archival collections by making it possible to link catalog records to finding aids; by enabling searches among pools of networked finding aids; by allowing keyword retrieval to locate folders or items previously buried in hard copy container lists and indexes; and by creating links to digital surrogates. Further development and refinement of the DTD was undertaken in July 1995 under the auspices of the Bentley Historical Library Research Fellowship Program for Study of Modern Archives. The BFAP model was thoroughly analyzed by a team of experts in both archival description and SGML, chaired by Daniel Pitti. A new DTD emerged and was christened Encoded Archival Description (EAD). Since that time EAD has continued to evolve, with support from the Library of Congress (LC), the Council on Library Resources, the Society of American Archivists (SAA), the Delmas Foundation, and others. At the end of August 1996, EAD emerged in a full beta release, available for extensive testing. On June 23, 1997, nine months after releasing the beta version DTD, the EAD Working Group invited the archival community to submit to the EAD listserv formal comments and suggestions about changes to the beta version DTD.
The initial decisions concerning changes were compiled and the EAD Working Group prepared and released on January 30, 1998, two detailed electronic mail messages to the EAD listserv outlining both the changes that it had agreed to incorporate in the next release of the DTD (Version 1.0) and the proposals that it had declined to enact. The rationale for each decision was provided, and reaction from listserv readers was invited. After notifying the archival community of its decisions, the EAD Working Group set about the task of modifying the beta version DTD and totally revising the existing beta tag library to reflect more accurately the proposed Version 1.0 structure. Delays occurred when the group decided that postponing the release of Version 1.0 might permit greater compatibility with the emerging Extensible Markup Language (XML) standard, which was just entering the final stages of development. As a more content-aware language than HTML, XML offers the potential for forthcoming versions of Web browsers like Netscape and Internet Explorer to display EAD-encoded finding aids in their native SGML without requiring helper applications like Panorama. Although parts of XML and its related standards XSL (Extensible Stylesheet Language) and XLL (Extensible Linking Language) still remain unclear, the EAD Working Group decided that XML development had reached sufficient stability to proceed with releasing Version 1.0 of the EAD DTD at the end of August 1998. Accompanying Version 1.0 of the DTD is a completely revised and updated EAD tag library, which was compiled by Working Group members during Spring and early Summer 1998. The validity of EAD as both a general concept and a specific application has been proved through this testing period in numerous projects and other activities related to EAD implementation and education. 
These have included the NEH-funded American Heritage Virtual Digital Archives Project, in which the libraries of the University of California, Berkeley, Duke University, Stanford University, and the University of Virginia worked to develop a shared database of EAD-encoded finding aids; a University of California system-wide project (UC-EAD); a series of EAD workshops developed by the Research Libraries Group (RLG), which have been given by both RLG and the SAA twenty times over the past twenty-one months throughout the United States, Canada, Great Britain, and Australia; and a number of projects funded through the Library of Congress/Ameritech National Digital Library competition, which are using EAD-based descriptive structures for the presentation of digitized primary source materials through LC's American Memory project. In addition, scores of repositories on several continents have been actively involved in either testing or implementing EAD in various contexts. For more information on EAD see Appendix M: Development of the Encoded Archival Description Document Type Definition and the official EAD website (URL: http://www.loc.gov/ead/).

B. Methodology and Standards for the Virginia Heritage Project

As the technological leader for the Virginia Heritage Project, the University of Virginia Library will use its long experience in delivering library resources over the Internet to assure that the project follows the best available practices and standards. VIVA and its participating institutions are committed to the use of state-of-the-art, standards-compliant information technology. The finding aids selected for the project will be encoded using version 1.0 (or the version current at the time of the award) of the EAD DTD (URL: http://jefferson.village.virginia.edu/ead/).
EAD is widely recognized as the emerging international standard for encoding and delivering finding aids; because it is built from the Standard Generalized Markup Language (SGML), EAD is platform-independent and easy to maintain. The University of Virginia Library has many years of experience in the encoding and delivery of SGML-tagged documents, both in its Electronic Text Center and in its Special Collections Digital Center. (See Appendix A, Sample Finding Aids.) In the Virginia Heritage Project, legacy finding aids (i.e., those not in machine-readable form) will be encoded at multiple sites throughout Virginia and contributed to the union database. The project's participants have agreed to adhere strictly to national and international standards. The VIVA Special Collections Committee took the first step toward an acceptable range of uniform practice for both newly and retrospectively encoded finding aids on June 3 and 4, 1999, when it held a workshop, with Daniel Pitti serving as moderator and facilitator, to develop retrospective conversion guidelines based on the American Heritage and Online Archive of California guidelines. The finding aids will be processed both at the University of Virginia center and by other institutions in several different ways. Those finding aids that exist in a machine-readable format will be tagged using a Mu form, which may be accessed by participants at a secured site over the Internet. Mu (named for the Buddhist term for a null state, neither on nor off) is a set of scripts written in the Perl language to help automate SGML markup. Mu was developed and implemented by the Institute for Advanced Technology in the Humanities (IATH), which is based in the University of Virginia Library. The Mu form technology has been used successfully in a number of SGML encoding applications over the past four years.
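The general idea of form-driven markup can be illustrated with a short sketch. The Python below is not the Mu code, which is written in Perl; the field names, their mapping to EAD elements, and the function `form_to_did` are all hypothetical and chosen only to show how entering data into fixed fields can yield consistently tagged output.

```python
from xml.sax.saxutils import escape

# Hypothetical form fields mapped to EAD element names. The mapping is an
# assumption for illustration; it is not taken from the actual Mu scripts.
FIELD_TO_TAG = [
    ("title", "unittitle"),
    ("dates", "unitdate"),
    ("extent", "physdesc"),
    ("repository", "repository"),
]

def form_to_did(form):
    """Turn a dict of form values into an EAD <did> block.

    Blank fields are skipped, and values are escaped so that characters
    such as '&' or '<' cannot break the markup.
    """
    lines = ["<did>"]
    for field, tag in FIELD_TO_TAG:
        value = form.get(field, "").strip()
        if value:
            lines.append(f"  <{tag}>{escape(value)}</{tag}>")
    lines.append("</did>")
    return "\n".join(lines)

# Invented sample entry; the extent field is deliberately left blank.
entry = {
    "title": "Smith & Jones family papers",
    "dates": "1820-1865",
    "extent": "",
    "repository": "Example Repository",
}
print(form_to_did(entry))
```

Because the person entering data never types a tag, every record produced this way uses the same elements in the same order, which is the kind of regularity a shared encoding guideline depends on.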
In essence, the Mu form is a fill-in-the-blank template into which data may be entered; Mu provides a regularized structure for applying EAD tags, thus assuring adherence to consortial standards and reducing tagging errors. Mu forms can be customized to work with any type of DTD template. This flexibility will allow the Virginia Heritage Project to develop separate data entry forms for each participating institution that take into account any institution-specific structure. Mu allows for multi-user editorial communication through the use of separately encoded remarks; thus, encoded EAD files could be transmitted to or from the University of Virginia processing center with questions and comments about the encoding attached to the files. Mu also supports Internet workgroups through the use of lockfiles: each file is automatically locked upon being saved and can be unlocked only by authorized users. The University of Virginia has a Mu form for EAD applications in development. It will be tested by the Special Collections Department in summer 1999, and additionally by the University of Virginia's Law and Health Sciences special collections, to determine its effectiveness for use by multiple institutions. Conversion of legacy finding aids to machine-readable form will be done at the University of Virginia processing center by a combination of keyboarding and optical character recognition (OCR) scanning. Based on the University of Virginia's previous experience, this method works well for such a complicated project. Local conversion, as opposed to outsourcing to a data entry firm, allows staff in the processing center to adapt and update guides to meet consortial guidelines. Local conversion also allows staff to contact participating institutions and resolve problems quickly. The union database of finding aids will be delivered using the OpenText search engine, which has been licensed by VIVA for use by members of the consortium.
OpenText is a powerful full-text search engine that interacts with Perl scripts to convert SGML-encoded documents to Hypertext Markup Language (HTML), the display encoding of the World Wide Web, on the fly. The University of Virginia is also investigating the use of Extensible Markup Language (XML), which is an extremely simple dialect of SGML. XML is rapidly emerging as the new standard for web delivery with the goal of enabling generic SGML to be served, received, and processed on the Web in the way that is now possible with HTML. XML has been designed for ease of implementation and for interoperability with both SGML and HTML. XML operates in conjunction with the Extensible Stylesheet Language (XSL). XSL stylesheets specify the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary. In other words, XSL stylesheets can be developed to render the Virginia Heritage Project finding aids with a specific look and feel; stylesheets can also allow individual institutions to customize the display of their own finding aids outside of the union database.
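The kind of on-the-fly conversion from encoded finding aid to HTML described above can be sketched in a few lines. This is only an illustration in Python: it does not use OpenText, the project's Perl scripts, or XSL, and the mapping in `DISPLAY_RULES`, the function `to_html`, and the sample record are all assumptions. Swapping in a different rule table is a rough stand-in for how separate stylesheets could give each institution its own presentation.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumed mapping from EAD element names to HTML display elements.
# A different mapping could give each institution its own look and feel.
DISPLAY_RULES = {
    "unittitle": "h2",
    "unitdate": "em",
    "scopecontent": "div",
    "p": "p",
}

def to_html(element):
    """Recursively render an XML element as an HTML string.

    Elements without a display rule are dropped but their content is kept.
    """
    parts = [escape(element.text or "")]
    for child in element:
        parts.append(to_html(child))
        parts.append(escape(child.tail or ""))
    inner = "".join(parts).strip()
    html_tag = DISPLAY_RULES.get(element.tag)
    return f"<{html_tag}>{inner}</{html_tag}>" if html_tag else inner

# Invented sample record using EAD-style element names.
SAMPLE = (
    "<archdesc><did><unittitle>Example family papers</unittitle>"
    "<unitdate>1820-1865</unitdate></did>"
    "<scopecontent><p>Letters &amp; ledgers.</p></scopecontent></archdesc>"
)

page = to_html(ET.fromstring(SAMPLE))
print(page)
```

The descriptive markup stays in the source file; only the rendering rules decide how a title or date appears on screen, so the same encoded finding aid can be displayed in more than one way.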
Last Modified: Tuesday, 23-Feb-2010 14:43:48 EST
URL: http://www2.lib.virginia.edu/small/vhp/neh/part3.html
Site maintained by UVa Special Collections Department mssbks@virginia.edu