14. Preservation metadata

In its broadest sense, preservation metadata could include any contextual information required to provide sustainable access to content. In addition to technical requirements, this might include, for example, information required to authenticate the content. In this broad sense, then, preservation metadata should contain full details about:

  • any non-file-based carriers the content has been held on, including their condition
  • the replay equipment used in the transfer process, and its parameters
  • the capture equipment used, including known rendering software
  • format information on the resultant file, including the digital resolution
  • the operators involved in the process
  • checksum – the digest value that permits verification of the integrity and authenticity of the file
  • details of any secondary information sources.

In practice, metadata is often separated into categories including descriptive, administrative, structural and preservation metadata. Preservation metadata in this specific sense is required to evaluate the technical parameters of a recording, and to draw appropriate conclusions for the management of preservation. A subset of preservation metadata, namely the metadata necessary to faithfully render the primary information, may be considered an indispensable part of an AV document.

It is strongly recommended that metadata be written according to established standards, in as consistent a fashion as possible. Writing metadata in a machine-actionable form (for example using XML schemas) has the further significant advantage of enabling automation of certain preservation and dissemination actions.
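As an illustration of what machine-actionable metadata makes possible, the following minimal sketch writes a handful of transfer details as XML and reads one value back programmatically. The element names (`transferRecord`, `sampleRateHz` and so on) and all field values are hypothetical placeholders chosen for this example; they do not come from any published schema, and a real implementation would follow an established standard.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional


def build_transfer_record(fields: Dict[str, str]) -> str:
    """Serialise a flat set of metadata fields as a small XML document."""
    root = ET.Element("transferRecord")  # hypothetical element name
    for name, value in fields.items():
        child = ET.SubElement(root, name)
        child.text = value
    return ET.tostring(root, encoding="unicode")


def read_field(xml_text: str, name: str) -> Optional[str]:
    """Return the text of the first top-level element called `name`, or None."""
    element = ET.fromstring(xml_text).find(name)
    return None if element is None else element.text


# Illustrative values only; none of these describe a real transfer.
record = build_transfer_record({
    "carrier": "open-reel tape",
    "replayEquipment": "example tape machine",
    "sampleRateHz": "96000",
    "bitDepth": "24",
    "operator": "example operator",
})
```

Because the record is structured, a script can select files by `sampleRateHz` or `carrier` without a person reading each entry, which is the kind of automation the recommendation refers to.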

Comment:

Metadata, often described as “data about data”, is, in the digital environment, a detailed and specific extension of cataloguing practice. However, when associated with digital collections, it is a necessary part of their use and control. A Preservation Metadata Set is a statement of the information required to manage preservation of digital collections. Preservation metadata will be a key component in the preservation and management of any digital collection and must be designed to support future preservation strategies. A vital component of preservation metadata is the checksum or digest of a file, which is essential in monitoring data integrity and verifying authenticity. As such, it may be compared to the fingerprint of a given file.
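The role of the checksum can be sketched in a few lines. The example below computes a digest with Python's standard `hashlib` module and compares it with a previously recorded value. The function names `file_digest` and `is_unchanged`, and the choice of SHA-256 as the default algorithm, are assumptions made for this illustration; the text above does not prescribe a particular algorithm.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256",
                chunk_size: int = 1 << 20) -> str:
    """Compute the hexadecimal digest of a file, reading it in chunks
    so that large audiovisual files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str,
                 algorithm: str = "sha256") -> bool:
    """Return True if the file still matches the digest held in its metadata."""
    return file_digest(path, algorithm) == recorded_digest.lower()
```

In use, the digest would be recorded in the preservation metadata when the file is created, and `is_unchanged` would be run at intervals or after every copy operation; a `False` result indicates that the file content has changed since the digest was recorded.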

The most thorough articulation of preservation metadata is represented by PREMIS (http://www.loc.gov/standards/premis/), the product of an international working group active from 2003 to 2005, and subsequently updated and revised by members of the digital library community. PREMIS is conceptualised around four categories: Object, Event, Agent and Rights.

The Object entity pertains to what is stored and managed in the preservation repository.

The Event entity aggregates information about actions that affect objects in the repository. This information is vital for maintaining the digital provenance of an object, which in turn is important in demonstrating the authenticity of the object.

Agents are actors that have roles in events and in rights statements; they can be people, organisations or software applications.

Issues pertaining to rights or other restrictions arise not only when providing access to content but also when preserving it. Most preservation strategies involve making identical copies and derivative versions of digital objects, and these actions may be limited by copyright law or by other restrictions, e.g., requirements imposed by donors. PREMIS rights metadata aggregates information about restrictions that are directly relevant to preserving objects in the repository.
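The relationships between the four categories described above can be sketched as a small data model. This is a deliberately simplified illustration and not the PREMIS schema: the class names, field names and the helper functions `provenance` and `may_perform` are hypothetical, and the actual PREMIS Data Dictionary defines far more detailed semantic units.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PreservationObject:
    identifier: str
    file_name: str


@dataclass
class Agent:
    identifier: str
    kind: str  # "person", "organisation" or "software"


@dataclass
class Event:
    identifier: str
    event_type: str  # e.g. "transfer" or "fixity check"
    object_id: str   # identifier of the Object the event affected
    agent_ids: List[str] = field(default_factory=list)


@dataclass
class RightsStatement:
    identifier: str
    object_id: str
    permitted_actions: List[str] = field(default_factory=list)


def provenance(object_id: str, events: List[Event]) -> List[str]:
    """List the event types recorded against one object, in recorded order."""
    return [e.event_type for e in events if e.object_id == object_id]


def may_perform(action: str, object_id: str,
                rights: List[RightsStatement]) -> bool:
    """Return True if any rights statement permits the action on the object."""
    return any(r.object_id == object_id and action in r.permitted_actions
               for r in rights)
```

The sketch shows why the entities are kept separate: events and rights statements refer to objects and agents by identifier, so the provenance of an object, or the actions permitted on it, can be assembled by query rather than stored redundantly.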

Metadata can be stored with the resource it describes (e.g., within file formats that support descriptive headers or file wrappers), separate from the resource (e.g., within an external catalogue) or separate but linked to the resource (e.g., a file linked to the digital object in a repository structure). Each strategy has particular benefits and disadvantages. It is possible, and probably desirable, to use these strategies in parallel. The use of standardised wrappers is emerging as a trend in digital preservation of audiovisual material, because of their ability to handle file relationships. Wrappers also make it possible to retain all of a file’s primary information within the digital object.
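The "separate but linked" strategy can be illustrated with a sidecar file that sits beside the resource and is tied to it by a naming rule. The `.metadata.json` suffix, the use of JSON and the function names below are assumptions made for this sketch; a repository would normally define its own linking convention and metadata format.

```python
import json
from pathlib import Path


def sidecar_path(resource: Path) -> Path:
    """Derive the sidecar location from the resource name (hypothetical rule)."""
    return resource.with_name(resource.name + ".metadata.json")


def write_sidecar(resource: Path, metadata: dict) -> Path:
    """Store metadata in a separate file linked to the resource by its name."""
    target = sidecar_path(resource)
    target.write_text(json.dumps(metadata, indent=2, sort_keys=True),
                      encoding="utf-8")
    return target


def read_sidecar(resource: Path) -> dict:
    """Load the metadata that belongs to the given resource."""
    return json.loads(sidecar_path(resource).read_text(encoding="utf-8"))
```

A link that depends only on file names is easily broken when files are renamed or moved, which is one reason for using the strategies in parallel, for instance by also holding the same metadata in a catalogue or within a wrapper.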