6.1 What can be learnt from other metadata aggregators?

The respondents worked with a variety of content and metadata aggregations covering geospatial information, bibliographic information and learning materials, as well as aggregations of image, film and sound metadata, or of metadata plus content. Academics familiar with aggregating metadata also shared thoughts and practical lessons applicable to images and time-based media.
A key finding was that there are several models for aggregating metadata, and that these models could be applied to metadata for images and time-based media. Five of them are summarised as follows:
  • Common (standardised) schema metadata aggregation model: a central aggregator defines a particular metadata schema and each of the collection owners provides metadata in the common schema format.
  • Multiple schemas metadata aggregation model: a central aggregator, but one which does not define the metadata schema; instead each of the collection owners provides metadata in their existing format together with the schema.
  • Federated search model: a central aggregator with a common schema which links in real-time to each of the source collections, which provide metadata to the requesting service (this is not strictly a model of metadata aggregation, since the metadata remains in the collection owners’ databases).
  • Linked data model: no central aggregator and no common metadata schema defined. Instead each of the collection owners provides their metadata in linked data format with a search interface from their own site.
  • Mixed model: a mixture of the multiple schemas aggregation model and the linked data model, with a central aggregator and a common metadata schema guideline, but where each of the collection owners provides their metadata in their existing format together with their schema; the aggregator creates the links between the schemas and metadata using linked data.
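The difference between owners supplying a common schema and an aggregator relating several native schemas can be illustrated with a simple field crosswalk. The following is a minimal sketch only: the owner names, field names and the `to_common` and `aggregate` helpers are hypothetical, and a real aggregator (particularly one following the mixed model) would more likely express these links as linked data than as Python dictionaries.

```python
# Hypothetical common schema; the field names are illustrative only.
COMMON_FIELDS = ("title", "creator", "date")

# Hypothetical crosswalks from each owner's native field names to the common ones.
CROSSWALKS = {
    "photo_archive": {"Caption": "title", "Photographer": "creator", "Taken": "date"},
    "film_library": {"name": "title", "director": "creator", "year": "date"},
}


def to_common(owner, record):
    """Map one native record onto the common fields; unmapped fields are dropped."""
    crosswalk = CROSSWALKS[owner]
    mapped = {crosswalk[key]: value for key, value in record.items() if key in crosswalk}
    # Fields the owner did not supply are kept as None so gaps stay visible.
    return {field: mapped.get(field) for field in COMMON_FIELDS}


def aggregate(submissions):
    """Keep each native record with its owner, alongside the mapped view."""
    return [
        {"owner": owner, "native": record, "common": to_common(owner, record)}
        for owner, record in submissions
    ]
```

In this sketch the native record is stored untouched, as in the multiple schemas model, while the mapped view gives a search service one set of field names to query. Under the common schema model the mapping work would instead fall to each collection owner before submission.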
See Appendix C – Metadata Aggregation Models for more information on each model, with diagrams and the main advantages and disadvantages of each model, as well as challenges that relate to all of the models.
Several of the aggregators use a common schema metadata aggregation model, and a significant amount of time has been spent agreeing these schemas. However, even where a schema has been agreed, in some cases on the basis of industry standards, contributed metadata still needs to be checked on ingest because contributors interpret the rules differently.
Regardless of which model is used, each aggregator reported that some human intervention is needed during the ingest process. The aggregators endeavour to minimise this effort, and noted that harvesting metadata with a protocol such as OAI-PMH required less ongoing intervention than using custom exports. However, in the experience of those responding, harvesting with OAI-PMH can still present technical challenges, including:
  • Basic errors such as wrong character-set encoding.
  • Servers not being properly OAI-PMH compliant.
  • OAI-PMH client not being able to work with a server, e.g. tokens that should have been unique in every batch were not served uniquely.
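A harvesting client can guard against some of these problems rather than failing silently. The following is a minimal sketch, not any respondent's implementation. It assumes the tokens mentioned above are OAI-PMH resumption tokens, and `fetch` is a hypothetical callable that takes the request parameters and returns the raw response bytes, so that the HTTP layer is left to the caller.

```python
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, metadata_prefix="oai_dc"):
    """Yield <record> elements from a ListRecords harvest.

    `fetch(params) -> bytes` is a hypothetical callable supplied by the caller.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    seen_tokens = set()
    while True:
        raw = fetch(params)
        try:
            # Parsing bytes lets the XML parser apply the declared encoding;
            # malformed responses surface here as a ParseError.
            root = ET.fromstring(raw)
        except ET.ParseError as exc:
            raise ValueError(f"response is not well-formed XML: {exc}") from exc

        error = root.find(OAI_NS + "error")
        if error is not None:
            raise RuntimeError(f"server returned OAI-PMH error: {error.get('code')}")

        yield from root.iter(OAI_NS + "record")

        token_element = root.find(".//" + OAI_NS + "resumptionToken")
        token = (token_element.text or "").strip() if token_element is not None else ""
        if not token:
            return  # a missing or empty token marks the final batch
        if token in seen_tokens:
            # A repeated token would otherwise make the client loop forever.
            raise RuntimeError(f"server repeated resumption token {token!r}")
        seen_tokens.add(token)
        # resumptionToken is an exclusive argument: only the verb accompanies it.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

Raising an error on a repeated token is one possible design choice. It turns a non-compliant server into a visible failure that a person can investigate, which fits the observation that some human intervention remains necessary during ingest.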
Most, if not all, of the metadata aggregators offer something back to the collection owners as contributors to the aggregation. For example, libraries contributing to SUNCAT may download records in a MARC format, and one image and time-based media aggregator offers its software to contributors to help them organise their own collections (with the advantage to the aggregator of a known platform for harvesting).
Some funding bodies mandate deposit of research outputs into a subject or institutional repository as a condition of funding. This has increased the rate of deposit into those services, which act as aggregators for both data and metadata. It is hard to envisage how this model could apply to image and time-based content, other than by JISC, as a major funder of digitisation initiatives, stipulating that metadata be provided for aggregation as part of realising the RDTF vision.
Collection owners are concerned to maintain their brands. Where the aggregator provides its own interface onto the aggregation, they do not want the aggregator’s brand to take over as the brand that users associate with their collection. Two aggregators of images and time-based media have addressed this in their search services by showing the logo or name of the collection owner in the search results presented to end users. Even if the RDTF aggregation does not have a user interface, making collection owner logos available, with some requirement that services using the metadata display them, could encourage participation in the aggregation.
