A-1.7 Standards

A-1.7.1 Item level and collection level metadata

Item-level metadata applies to an individual item, and should be associated with and linked to the item it describes. Most metadata is applied to individual items. Dublin Core [i] (DC) is one of the simplest and most widely used metadata schemas.
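As a minimal sketch of what an item-level record in simple Dublin Core might look like, the following builds one with Python's standard library. The `dc` namespace URI is the published Dublin Core element set namespace; the `record` wrapper element, the helper name `build_dc_record` and all sample values are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields: dict) -> ET.Element:
    """Wrap simple DC elements in a hypothetical <record> container.

    `fields` maps a DC element name (e.g. "title") to a list of values,
    because every simple DC element is repeatable.
    """
    record = ET.Element("record")  # wrapper name is an assumption
    for name, values in fields.items():
        for value in values:
            child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            child.text = value
    return record


# Illustrative, made-up values for a single digitised image.
item = build_dc_record({
    "title": ["View of Edinburgh Castle"],
    "creator": ["Unknown photographer"],
    "type": ["StillImage"],
    "identifier": ["https://repository.example.org/items/42"],
})
xml_text = ET.tostring(item, encoding="unicode")
print(xml_text)
```

The `identifier` element is what links the description back to the item it describes, which is the association the paragraph above asks for.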
There is no dominant metadata standard for describing collections, although in the last few years there has been substantial progress towards this goal [ii]. NISO has also recently released a set of guidelines for building good digital collections [iii]. UKOLN’s Collection Description Focus offers advice in this area, including a tutorial which points to the role of schemas [iv].

A-1.7.2 Technical and administrative metadata

Technical and administrative metadata are used to facilitate management, tracking, migration and re-use of digital assets. They typically include information on creation, quality control, rights and preservation. Some of the information may be harvested from the file itself while other information will need to be provided by the institution managing the image capture process. Administrative metadata can include critical information with respect to the conditions at the time of the digital capture; this may allow for some element of quality control.

A-1.7.3 Image metadata

A- Images Application Profile

The JISC-funded Images Application Profile [v] (IAP) is a Dublin Core Application Profile for describing images held in institutional repositories. The application profile is based on the IAP model, which was based on the Functional Requirements for Bibliographic Records (FRBR) [vi] entity-relationship model.

A- DIG35 Metadata Standard for Digital Images

The overall goal of the DIG35 initiative was to define a standard set of metadata for digital images that improves the semantic interoperability between devices, services and software. In August 2000 the DIG35 Metadata Specification [vii] was released providing a consistent set of metadata definitions to the imaging industry. The standard set of metadata can be widely implemented across multiple image file formats and provides a uniform underlying construct to support semantic interoperability of metadata between various digital imaging devices. The DIG35 Initiative also educates the industry regarding the importance of metadata usage, preservation and exchangeability.

A- VRA Core

Developed by the US Visual Resources Association, the VRA Core [viii] is a widely used metadata schema for describing images, particularly art or cultural images. VRA Core has been much influenced by the CDWA standard and by DC. The formal definition of VRA Core includes a mapping of its categories to DC.

NISO Metadata for Images in XML [ix] (MIX) is the generally accepted standard for technical metadata for still images. MIX is an XML version of an extensive set of elements (the NISO Data Dictionary [x]) devised by the National Information Standards Organization (NISO) for the detailed technical description of still images. The range of information that can be encoded in MIX is very large, from basic information on file types and sizes, to details of image capture (including capture hardware and image targets), to details of how an image has been processed after capture. Although a MIX file can be very lengthy and complex, almost all of its components (more than in its parent element set) are optional, so a basic record may be very simple. Although still in draft (currently version 0.2), MIX has already established itself as the key standard for this type of metadata.
The NISO Data Dictionary has been designed to facilitate interoperability between systems, services, and software as well as to support the long-term management of and continuing access to digital image collections. The standard only applies to still raster (bitmap) images and does not address other image formats, such as vector or moving picture. The standard can be applied to digital images created through digital photography or scanning, as well as those that have been altered through editing or migration (image transformation).
The Depot support service at EDINA will be applying MIX to provide technical and process metadata for still images. Information from TIFF file headers can be harvested and exported to relevant fields in XML output. Process metadata relating to the cropping & cleaning of images can also be recorded.
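To illustrate how technical values can be harvested from a TIFF file header, here is a minimal sketch that reads the byte order and a few baseline tags from the first image file directory. It is not the Depot's implementation: the function name and the output keys are hypothetical, only single-value SHORT and LONG fields are handled, and mapping the result onto MIX elements is left out.

```python
import struct

# Baseline TIFF tag numbers mapped to hypothetical output keys.
WANTED_TAGS = {
    256: "image_width",
    257: "image_height",
    258: "bits_per_sample",
    259: "compression",
}


def read_tiff_basics(data: bytes) -> dict:
    """Harvest a few technical values from the first IFD of a TIFF."""
    if data[:2] == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("missing TIFF byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")

    result = {"byte_order": "little-endian" if endian == "<" else "big-endian"}
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i          # each IFD entry is 12 bytes
        tag, field_type, count = struct.unpack(endian + "HHI", data[start:start + 8])
        if tag not in WANTED_TAGS or count != 1:
            continue
        if field_type == 3:                      # SHORT: value in first 2 bytes
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
        elif field_type == 4:                    # LONG: value fills all 4 bytes
            (value,) = struct.unpack(endian + "I", data[start + 8:start + 12])
        else:
            continue
        result[WANTED_TAGS[tag]] = value
    return result


# A hand-built header with two IFD entries, standing in for a real file.
sample = (
    b"II" + struct.pack("<HI", 42, 8)            # header: magic 42, IFD at byte 8
    + struct.pack("<H", 2)                       # two entries follow
    + struct.pack("<HHIHH", 256, 3, 1, 640, 0)   # ImageWidth  = 640 (SHORT)
    + struct.pack("<HHII", 257, 4, 1, 480)       # ImageLength = 480 (LONG)
    + struct.pack("<I", 0)                       # no further IFDs
)
tiff_info = read_tiff_basics(sample)
print(tiff_info)
```

Process metadata such as cropping or cleaning cannot be recovered this way and would still have to be recorded by whoever carries out those steps.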

A-1.7.4 Multimedia metadata

A multimedia presentation is a structured collection of elements, such as video and audio clips, images, and documents. The packaging of these elements is represented by means of multimedia metadata. Some usual approaches include METS [xi], IMS Content Packaging [xii] and SCORM [xiii], but the main standard for this purpose is MPEG-21 DID. It defines an XML representation of complex digital objects, based on a data model that defines the container structure of the multimedia packages. SMIL is the W3C standard designed to describe synchronized multimedia compositions.
For the technical and administrative metadata associated with audio and video files, there is no standard comparable to MIX. Several content-related standards exist, but none allows the homogeneous description of multimedia content, service and context of use.

A- Time-based Media Application Profile

The JISC-funded Time-based Media Application Profile to Support Search & Discovery project scoped the area of time-based media in tertiary education to produce a prototype exemplar Application Profile [xiv] to support search, discovery and re-use of Time-based Media in this domain. The time-based media model was based on the Functional Requirements for Bibliographic Records (FRBR) [xv] entity-relationship model.

The Audio Technical Metadata Extension Schema [xvi] includes all key information necessary to make sense of an audio file (including, for instance, its format, bit rates, sampling frequencies, and any compression applied to it). The schema is of minimal size and complexity but functional.
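The kind of information listed above can often be read straight from the file. As a rough sketch, the following uses Python's standard `wave` module to pull sampling frequency, bit depth and a derived bit rate from an uncompressed PCM WAV file. The function name and the dictionary keys are hypothetical and are not element names from the schema; other audio formats would need other tools.

```python
import io
import wave


def describe_wav(stream) -> dict:
    """Collect basic technical properties of a PCM WAV file."""
    with wave.open(stream, "rb") as wav:
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8        # sample width is given in bytes
        rate = wav.getframerate()
        frames = wav.getnframes()
    return {
        "format": "WAV (PCM)",               # the wave module only reads PCM
        "channels": channels,
        "bits_per_sample": bits,
        "sampling_frequency_hz": rate,
        "bit_rate_bps": rate * channels * bits,
        "duration_seconds": frames / rate,
    }


# Build one second of silent 16-bit stereo audio in memory as test input.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(2)
    out.setframerate(44100)
    out.writeframes(b"\x00" * (44100 * 2 * 2))
buffer.seek(0)

audio_info = describe_wav(buffer)
print(audio_info)
```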

Video Technical Metadata Extension Schema [xvii] is a set of 36 elements designed for the Library of Congress’s digital library projects. It concentrates solely on technical metadata and so avoids any potential problems of overlap with schemes for other types of metadata.

A- PBCore

A popular schema for video in the US is PBCore [xviii], produced by public broadcast television services. This includes a set of elements for the technical description of the digital video file itself, including details of file formats, encoding, duration, aspect ratios and details of changes made as it is processed after creation. However, it also includes elements for descriptive metadata and intellectual property information, which may overlap with other schemas such as MODS or PREMIS.

A-1.7.5 Structural metadata

Structural metadata describes the internal structure of digital resources and the relationships between their parts. It is used to enable navigation and presentation.

The METS schema is a de facto standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library expressed using an XML Schema. METS is an established standard for structuring complex digital resources (e.g. publications with multiple pages) and wrapping other sets of metadata. It is often used with MODS.
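To make the structuring role concrete, here is a minimal sketch of a METS skeleton for a publication with several page images: a file section listing the files and a structural map that orders them as pages. The element and attribute names are taken from the METS schema, but the document is not validated, descriptive and administrative sections are omitted, and the URLs, `TYPE` values and the helper `build_mets` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def m(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(page_urls: list) -> ET.Element:
    """Return a bare METS document: one file and one page div per URL."""
    root = ET.Element(m("mets"))
    file_sec = ET.SubElement(root, m("fileSec"))
    file_grp = ET.SubElement(file_sec, m("fileGrp"), {"USE": "master"})
    struct_map = ET.SubElement(root, m("structMap"), {"TYPE": "physical"})
    publication = ET.SubElement(struct_map, m("div"), {"TYPE": "publication"})

    for number, url in enumerate(page_urls, start=1):
        file_id = f"FILE{number:04d}"
        file_el = ET.SubElement(
            file_grp, m("file"), {"ID": file_id, "MIMETYPE": "image/tiff"}
        )
        ET.SubElement(
            file_el, m("FLocat"), {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": url}
        )
        # The structural map points at files by ID rather than by location.
        page = ET.SubElement(
            publication, m("div"), {"TYPE": "page", "ORDER": str(number)}
        )
        ET.SubElement(page, m("fptr"), {"FILEID": file_id})
    return root


pages = [
    "https://repository.example.org/scans/page1.tif",
    "https://repository.example.org/scans/page2.tif",
]
mets_doc = build_mets(pages)
print(ET.tostring(mets_doc, encoding="unicode"))
```

Because the structural map refers to files only through their `ID` values, the sketch also shows why METS depends on identifiers being unique and consistently managed.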
METS defines resource aggregation as: ‘The process of gathering together resources; the results can be used in one or more applications, such as transmission, storage, and delivery to users’ (L’Hours 2007), with three associated standards:
  • Dublin Core Abstract Model.
  • MPEG-21 DIDL.
  • IMS-CP.
Use of METS has a number of benefits, which were outlined by the Paradigm [xix] project as follows:
  • METS is maintained by the Library of Congress and is non-proprietary.
  • Any system capable of handling XML documents can be used to create, store and deliver a METS file, thereby mitigating problems of software obsolescence.
  • It is written in XML, which is robust as an archival medium; also it is an XML Schema extensible to future additions and supporting the use of multiple XML namespaces, which allow different kinds of metadata to be encoded in the same document.
  • METS has the ability to deal with a wide variety of materials, including large and complex digital objects.
  • The possibility of creating multiple structural maps in a METS document means that archivists can also take advantage of its capacity for sorting and reordering records in varied ways for researcher access.
However, METS also has a number of weaknesses, also outlined by the Paradigm project, and commented upon recently in the context of the British Library experience [xx] of using METS:
  • The flexibility of METS can raise interoperability problems. It does not ensure standardisation because it does not operate as a metadata standard, rather as a framework within which metadata can be stored.
  • Some metadata is difficult to use without bespoke development.
  • Whilst using METS Profiles can mitigate these problems and facilitate manual cross-mapping, this still does not allow the automatic transfer of files between systems.
  • At present METS documents largely have to be generated manually.
  • METS relies on the effective use of unique identifiers. This can be difficult to administer.

The MPEG-21 vision [xxi] is to define a multimedia framework that enables the use of multimedia resources across a wide range of networks and devices used by different communities. Like METS, the de facto standard MPEG-21 DIDL (Digital Item Declaration Language) has been proposed as a suitable vehicle to support transfer and dissemination of complex objects for preservation by an external service provider. The MPEG-21 standard only concerns multimedia content (MPEG-21 DIDL) and context description (MPEG-21 DIA) and so does not allow the description of aggregated composition of content and services.

A-1.7.6 Preservation metadata

PREMIS is the de facto standard metadata schema for preservation metadata. PREMIS was developed by an OCLC/RLG working group and is being maintained via the Library of Congress [xxii]. It was particularly influenced by the Open Archival Information System (OAIS), which provides a framework for the long-term preservation of digital (and non-digital) resources.
The PREMIS data dictionary specifies core metadata elements used to support the preservation of digital resources. It can be used for verifying and tracking the provenance, and checking the authenticity and integrity of preserved digital assets. It is being widely adopted as a means of recording information to support the preservation of digital resources. However, PREMIS is currently still undergoing a period of trial use.
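Integrity checking of this kind usually rests on recording a message digest when an object is ingested and recomputing it later. The sketch below shows the idea with Python's `hashlib`. The dictionary keys borrow the names of the PREMIS fixity semantic units `messageDigestAlgorithm` and `messageDigest`, but the structure is a plain dictionary rather than a PREMIS record, the algorithm is named the way `hashlib` expects, and the function names are hypothetical.

```python
import hashlib


def make_fixity(content: bytes, algorithm: str = "sha256") -> dict:
    """Record a digest of the content at ingest time."""
    digest = hashlib.new(algorithm, content).hexdigest()
    # Keys named after PREMIS fixity units; the container is an assumption.
    return {"messageDigestAlgorithm": algorithm, "messageDigest": digest}


def verify_fixity(content: bytes, fixity: dict) -> bool:
    """Recompute the digest and compare it with the recorded value."""
    current = hashlib.new(fixity["messageDigestAlgorithm"], content).hexdigest()
    return current == fixity["messageDigest"]


original = b"example image bytes"
recorded = make_fixity(original)

intact = verify_fixity(original, recorded)              # unchanged content
damaged = verify_fixity(original + b"\x00", recorded)   # one extra byte
print(intact, damaged)
```

A matching digest shows only that the bytes are unchanged since the digest was recorded; provenance and authenticity depend on the other information PREMIS captures, such as events and agents.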

A-1.7.7 Standards for interoperability

An important reason for choosing a standard metadata schema is to be able to interoperate with other collections.

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is the key standard being used to achieve interoperability through harvesting metadata and is used extensively.
OAI-PMH requires generation of XML records with metadata encoded as simple DC (although it is possible to include other schemas as well). These records are placed in a public space on a server where they are available for others to harvest (via the OAI protocol). They can then be incorporated into catalogues or directories. Although the OAI records represent a simplified version of any richer metadata, users can link through to the original collection to view the digital resource. Once there, they will be exposed to the full metadata record and will be able to see the item in context. A practical example of this is the OAIster [xxiii] project at the University of Michigan. OAIster has harvested almost 10 million OAI-PMH records from 700 institutions.
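The harvesting step described above can be sketched as follows: a harvester builds a `ListRecords` request for the `oai_dc` metadata format and then reads simple DC values out of the response. The verb, the `metadataPrefix` parameter and the namespaces come from OAI-PMH 2.0; the base URL, the helper names and the trimmed sample response (which omits required parts such as `responseDate`, `request` and `datestamp`) are assumptions, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the request a harvester would send to a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(response_xml: str) -> list:
    """Extract identifier, titles and links from a ListRecords response."""
    root = ET.fromstring(response_xml)
    records = []
    for record in root.iterfind("oai:ListRecords/oai:record", NS):
        dc_path = "oai:metadata/oai_dc:dc/"
        records.append({
            "oai_identifier": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "titles": [e.text for e in record.iterfind(dc_path + "dc:title", NS)],
            # dc:identifier typically carries the link back to the full record.
            "links": [e.text for e in record.iterfind(dc_path + "dc:identifier", NS)],
        })
    return records


# Trimmed, made-up response standing in for what a repository would return.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:42</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>View of Edinburgh Castle</dc:title>
          <dc:identifier>https://repository.example.org/items/42</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

request_url = list_records_url("https://repository.example.org/oai")
harvested = parse_records(SAMPLE_RESPONSE)
print(request_url)
print(harvested)
```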
Recently there has been some scepticism about the worth of the OAI-PMH protocol [xxiv]. Among the drawbacks raised on mailing lists is that OAI-PMH is not suited to transferring large files, especially where network bandwidth is limited. OAI also has limitations when used for images and museum objects, which has implications for harvesting metadata, searching, and disclosure. An important outcome of the FAIR programme [xxv] was a set of discussion papers exploring these issues. This work concluded that collection description may be more appropriate for some museum objects, as item descriptions may be too similar to be useful.
As with METS, it may require quite a lot of work to generate the OAI records, although collection management software (e.g. digital repository systems) is incorporating OAI functionality. The JISC Technical Standards specify that service components must use OAI-PMH version 2.0 for metadata harvesting.

A- Z39.50

Z39.50 is a standard communications protocol for the search and retrieval of bibliographic metadata in online databases. Z39.50 is used on the Internet to search Copac, the UK Online Public Access Catalogue [xxvi] of library holdings. The Z39.50 client of the Zetoc [xxvii] electronic table of contents service is based on the same technology as the Copac client. SUNCAT, the Serials UNion CATalogue [xxviii], offers a downloading service for its users, which includes a Z39.50 connection that allows records to be downloaded in SUTRS and XML, and also in MARC format for contributing libraries [xxix].
The Z39.50 protocol is commonly used for cross-system search. Z39.50 implementers do not share metadata but map their own search capabilities to a common set of search attributes. A contrasting approach taken by the Open Archives Initiative is for all metadata providers to translate their native metadata to a common core set of elements and expose this for harvesting. A search service provider then gathers the metadata into a consistent central index to allow cross-repository searching regardless of the metadata formats used by participating repositories.
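The harvesting approach can be pictured as a crosswalk followed by indexing. In this minimal sketch each provider's native fields are translated to simple Dublin Core element names, and a central index is then searched without regard to the original formats. The provider names, native field names, crosswalk tables and sample records are all hypothetical.

```python
# Per-provider crosswalks from native field names to simple DC elements.
CROSSWALKS = {
    "museum": {"object_name": "title", "maker": "creator",
               "accession_number": "identifier"},
    "library": {"245a": "title", "100a": "creator"},
}


def to_simple_dc(native: dict, crosswalk: dict) -> dict:
    """Translate one native record; unmapped fields are dropped."""
    dc = {}
    for field, value in native.items():
        target = crosswalk.get(field)
        if target is not None:
            dc.setdefault(target, []).append(value)
    return dc


def build_central_index(providers: dict) -> list:
    """Gather translated records from every provider into one list."""
    index = []
    for provider, records in providers.items():
        for native in records:
            index.append({
                "provider": provider,
                "dc": to_simple_dc(native, CROSSWALKS[provider]),
            })
    return index


def search_titles(index: list, term: str) -> list:
    """Case-insensitive title search across all providers."""
    term = term.lower()
    return [entry for entry in index
            if any(term in title.lower() for title in entry["dc"].get("title", []))]


providers = {
    "museum": [{"object_name": "Castle photograph", "maker": "Unknown",
                "accession_number": "M-42", "storage_bay": "C3"}],
    "library": [{"245a": "Castle guidebook", "100a": "Smith, J."}],
}
index = build_central_index(providers)
hits = search_titles(index, "castle")
print([hit["provider"] for hit in hits])
```

The `storage_bay` field has no DC equivalent in the crosswalk and is lost, which illustrates the simplification that harvesting to a common core set implies.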

A- Syndication Formats

The web has become enormous, and search engines that crawl the surface of the web are picking up only a small fraction of the available content. Further, some of the richest and most interesting content cannot be crawled and indexed by one search engine or navigated by one relevancy algorithm [xxx]. RSS and ATOM are lightweight options for syndicating metadata compared to other methods of exposing or exchanging metadata, such as OAI-PMH, or returning the results of various queries. Both RSS and ATOM can also be used for providing updates from frequently updated data services such as institutional and open access repositories to the user.

RSS (Rich Site Summary or Really Simple Syndication [xxxi]) is a format for delivering regularly changing web content. Many news-related sites, weblogs and other online publishers syndicate their content as an RSS feed to whoever wants it. RSS solves a problem for people who regularly use the web. It allows users to stay informed by retrieving the latest content from the sites they are interested in, so they save time by not needing to visit each site individually. Privacy is protected because users do not need to join each website’s email newsletter.
One issue with RSS feeds is that RSS is not, unfortunately, a single standard: several versions are in use. Other issues [xxxii] surrounding syndicated feed deposit into institutional repositories have been identified by the Jorum team. ATOM [xxxiii] is also written in XML and was created in response to perceived deficiencies of RSS 2.0 and the ‘version wars’ in the RSS community [xxxiv]. It uses unique identifiers and is validated by a schema.
Although commonly perceived and presented as information being pushed to the end-user, RSS and ATOM represent an alternative means by which repositories can expose their metadata: the RSS reader aggregates the metadata by pulling it from the repositories. The harvesting process assumes metadata will be harvested unless it is specifically withheld. RSS/ATOM readers select which feeds they wish to receive and aggregate what they are given. Exposing metadata through RSS and ATOM can be considered a more controlled way of exposing metadata for aggregation elsewhere.
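The pull model can be sketched in a few lines: a reader holds a list of feeds it has chosen, parses each one, and merges the entries. The example below parses ATOM with Python's standard library. The ATOM namespace is the real one; the feed contents, repository names and helper names are made up, the feeds are embedded as strings rather than fetched, and sorting relies on the assumption that all `updated` timestamps use the same UTC format.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}


def read_atom_entries(feed_xml: str) -> list:
    """Return id, title, updated and link for each entry in an ATOM feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.iterfind("atom:entry", ATOM):
        link = entry.find("atom:link", ATOM)
        entries.append({
            "id": entry.findtext("atom:id", namespaces=ATOM),
            "title": entry.findtext("atom:title", namespaces=ATOM),
            "updated": entry.findtext("atom:updated", namespaces=ATOM),
            "link": link.get("href") if link is not None else None,
        })
    return entries


def aggregate(subscriptions: dict) -> list:
    """Merge entries from the feeds the reader has chosen, newest first."""
    merged = []
    for source, feed_xml in subscriptions.items():
        for entry in read_atom_entries(feed_xml):
            merged.append({"source": source, **entry})
    # Same-format UTC timestamps sort correctly as plain strings.
    return sorted(merged, key=lambda e: e["updated"], reverse=True)


FEED_A = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Repository A: recent deposits</title>
  <entry>
    <id>urn:example:a:1</id>
    <title>Castle photograph</title>
    <updated>2010-08-01T10:00:00Z</updated>
    <link href="https://a.example.org/items/1"/>
  </entry>
</feed>"""

FEED_B = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Repository B: recent deposits</title>
  <entry>
    <id>urn:example:b:7</id>
    <title>Harbour survey dataset</title>
    <updated>2010-08-02T09:30:00Z</updated>
    <link href="https://b.example.org/items/7"/>
  </entry>
</feed>"""

latest = aggregate({"repo_a": FEED_A, "repo_b": FEED_B})
print([(e["source"], e["title"]) for e in latest])
```

The reader only ever sees what each repository has chosen to put in its feed, which is the sense in which this form of exposure is more controlled.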
The JISC Linking UK Repositories scoping study report [xxxv] concluded that RSS and ATOM should be investigated as additional standards to OAI-PMH for use in aggregating metadata and content as they offer the potential of targeted exposure of repository resources that may be beneficial in the development of end-user services targeted at specific communities. Chris Awre [xxxvi] recently commented that the conclusions of this report are still valid.

A- OpenSearch

Different types of content require different types of search engines. The best search engine for a particular type of content is likely to be the search engine written by the people that know the content the best. OpenSearch helps search engines and search clients communicate by introducing a common set of formats to perform search requests and syndicate search results. OpenSearch was created by A9.com, an Amazon.com company, and the OpenSearch format is now in use by hundreds of search engines and search applications around the Internet. The OpenSearch specification is made available according to the terms of a Creative Commons license.
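One of the common formats OpenSearch introduces is a URL template that tells a client how to phrase a search request. As a minimal sketch, the function below fills such a template: `{searchTerms}` and `{startPage?}` follow the OpenSearch 1.1 template syntax, where a trailing `?` marks a parameter as optional. The search host and the function name are hypothetical.

```python
import re
from urllib.parse import quote


def fill_template(template: str, params: dict) -> str:
    """Substitute OpenSearch-style {name} and {name?} placeholders."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in params:
            # Percent-encode the value so it is safe inside a query string.
            return quote(str(params[name]), safe="")
        if optional:
            return ""          # optional parameters may be left empty
        raise KeyError(f"missing required template parameter: {name}")

    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)


template = "https://search.example.org/?q={searchTerms}&pw={startPage?}&format=atom"
search_url = fill_template(template, {"searchTerms": "castle photographs"})
print(search_url)
```

Results returned from such a request are typically delivered as RSS or ATOM, so a client that can already read syndication feeds needs little extra work to consume them.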
[i] http://dublincore.org/
[ii] http://imlsdcc.grainger.uiuc.edu/resources.asp
[iii] http://framework.niso.org/node/5
[iv] http://www.ukoln.ac.uk/cd-focus/cdfocus-tutorial/schemas/
[v] http://www.ukoln.ac.uk/repositories/digirep/index/The_Images_Application_Profile
[vi] http://www.ifla.org/en/publications/functional-requirements-for-bibliographic-records
[vii] http://xml.coverpages.org/FU-Berlin-DIG35-v10-Sept00.pdf
[viii] http://www.vraweb.org/projects/vracore4/index.html
[ix] http://www.loc.gov/standards/mix/
[x] http://www.niso.org/kst/reports/standards?step=2&gid=None&project_key=b897b0cf3e2ee526252d9f830207b3cc9f3b6c2c
[xi] Metadata Encoding & Transmission Standard. The Library of Congress. http://www.loc.gov/standards/mets
[xii] IMS Content Packaging XML Binding. IMS Global Learning consortium. http://www.imsproject.org/
[xiii] Sharable Content Object Reference Model (SCORM). Advanced Distributed Learning. http://www.adlnet.org/
[xiv] http://wiki.manchester.ac.uk/tbmap/index.php/Project_Outputs
[xv] http://www.ifla.org/en/publications/functional-requirements-for-bibliographic-records
[xvi] http://lcweb2.loc.gov/mets/Schemas/AMD.xsd
[xvii] http://lcweb2.loc.gov/mets/Schemas/VMD.xsd
[xviii] http://www.pbcore.org/
[xix] http://www.paradigm.ac.uk/index.html
[xx] http://www.loc.gov/standards/mets/presentations/Carl-Wislon_METS%20Presentation.ppt#277,8,WhyUseStandards
[xxi] http://xml.coverpages.org/MPEG21-WG-11-N3971-200103.pdf
[xxii] http://www.loc.gov/standards/premis
[xxiii] http://oaister.umdl.umich.edu/o/oaister/
[xxiv] http://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=jisc-repositories
[xxv] http://jisc.ac.uk/whatwedo/programmes/fair/synthesis/reptypes.aspx
[xxvi] Copac library catalogue gives free access to the merged online catalogues of major University, Specialist, and National Libraries in the UK and Ireland. http://copac.ac.uk/
[xxvii] http://zetoc.mimas.ac.uk/
[xxviii] http://www.suncat.ac.uk/
[xxix] http://www.suncat.ac.uk/support/z-target-auth.shtml
[xxx] http://www.opensearch.org/
[xxxi] http://www.w3schools.com/rss/rss_intro.asp
[xxxii] http://community.jorum.ac.uk/file.php/25/Issues_surrounding_syndicated_feed_into_institutional_repositories_GW.pdf
[xxxiii] http://www.atomenabled.org/
[xxxiv] A full comparison of the technical differences between ATOM and RSS can be viewed at http://www.intertwingly.net/wiki/pie/Rss20AndAtom10Compared.
[xxxv] Swan, A. and Awre, C. (2006) LINKING UK REPOSITORIES
[xxxvi] Personal communication August 2010
