Preservation

The basic steps in preservation to counter changes are:

  • create adequate Representation Information for the Designated Community and/or
  • transform to another format if necessary or
  • if preservation cannot be carried on by the current organisation then hand over to the next organisation in the chain of preservation

The mantra is therefore “collect Representation Information, transform or hand on to the next in the chain of preservation” rather than “emulate or migrate”.
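The mantra can be read as a small decision procedure. The sketch below is only an illustration of that reading: the `Assessment` fields and the `next_actions` function are hypothetical names introduced here, not part of any APARSEN or SCIDIP-ES tool.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CREATE_REPINFO = "create Representation Information"
    TRANSFORM = "transform to another format"
    HAND_OVER = "hand over to the next organisation"


@dataclass
class Assessment:
    """Hypothetical summary of how a change affects one digital object."""
    can_continue_preserving: bool  # can the current organisation carry on?
    repinfo_adequate: bool         # is RepInfo adequate for the Designated Community?
    format_still_usable: bool      # can the object stay in its current format?


def next_actions(a: Assessment) -> list[Action]:
    """Return the preservation steps suggested by the three basic options."""
    if not a.can_continue_preserving:
        # Last resort: pass the holdings to the next in the chain of preservation.
        return [Action.HAND_OVER]
    actions = []
    if not a.repinfo_adequate:
        actions.append(Action.CREATE_REPINFO)
    if not a.format_still_usable:
        actions.append(Action.TRANSFORM)
    return actions  # empty list: nothing to do for this change
```

For example, `next_actions(Assessment(True, False, False))` suggests both creating Representation Information and transforming, which matches the "and/or" in the list of steps.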

Evidence about the authenticity of the digital objects must also be maintained, especially when the objects are transformed or handed over (see below).
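One simple kind of such evidence is a chain of digests recorded at each transformation or hand-over. The following is a minimal sketch under stated assumptions: the record layout, the function names and the agent labels are invented for illustration, and a digest chain covers only bit-level integrity, which is just one part of authenticity evidence.

```python
import hashlib
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of the object's bytes, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def record_event(log: list, event_type: str, agent: str,
                 before: bytes, after: bytes) -> dict:
    """Append one evidence record (hypothetical layout) to the log."""
    entry = {
        "event": event_type,  # e.g. "transformation" or "hand-over"
        "agent": agent,       # who performed the action
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "digest_before": sha256_hex(before),
        "digest_after": sha256_hex(after),
    }
    log.append(entry)
    return entry


def chain_is_unbroken(log: list) -> bool:
    """Each event must start from the digest the previous event ended with."""
    return all(prev["digest_after"] == cur["digest_before"]
               for prev, cur in zip(log, log[1:]))
```

A hand-over would normally record identical digests before and after, while a transformation records two different digests that later events must link to.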

Confirmation of the quality of preservation can come from an audit (with possible certification).

Asset base

Issue | WP/Project/Tools/Services | Asset | Evidence

Definition of Designated Community

APARSEN WP25

Deliverable

D25.1 Interoperability objectives and approaches

D25.1 proposes some possible solutions to fill the gaps listed below. One example, in the domain of preservation metadata, is the broad adoption of international standards in place of local solutions and ad hoc metadata schemas. The most promising standard we have identified is the PREMIS Data Dictionary for Preservation Metadata.

http://www.loc.gov/standards/premis/

Evidence of the effectiveness of implementations of the PREMIS Data Dictionary was presented at the iPRES 2013 workshop “PREMIS Implementation Fair 2013”.

http://www.loc.gov/standards/premis/premis-implementation-fair-agenda-2013.html

Apart from the aforementioned paper, a tool based on this approach (called RIMQA) has been implemented and experiments are reported in that paper.

SCIDIP-ES GIS

Software to help define the Designated Community (DC) and implications of changes to the DC.

Perform preservation actions

Preservica Preservation workflows

Cloud based preservation actions

Evaluate Preservation capability


Process for evaluating the capability of disparate systems to perform preservation actions on a wide and diverse set of digital object types.

D14.1 & Evaluation spreadsheet

Creation of RepInfo

APARSEN WP14

SCIDIP-ES RepInfo Toolkit, Preservation Strategy Toolkit, Registry, Gap Identification service

Deliverable D14.1 Report on testing environments

CASPAR evidence

SCIDIP-ES software and User feedback

Emulation

KEEP (emulation software)

ENSURE (Virtual machines)

Software (check licences)

Transformation

OPF related

SCAPE

Various e-science projects

Details of software

Handover

SCIDIP-ES Brokerage/Orchestration service

Examples of hand-over

Audit

APARSEN WP33

Spreadsheet to capture evidence about quality of preservation

SCIDIP-ES certification toolkit

Tool to perform self-evaluation

Selection of interoperability approaches and solutions which can have impact on preservation activities

APARSEN WP25

D25.1 Interoperability objectives and approaches

in particular the matrix of interoperability solutions, gap analysis and recommendations

How to curate the specificity of ontology-based metadata while the ontologies evolve (important for e-Science)

APARSEN WP14

Experience in both the theory and its applicability (including tools). A paper that describes the approach:

Y. Tzitzikas, M. Kampouraki, A. Analyti, Curating the Specificity of Ontological Descriptions under Ontology Evolution, Journal on Data Semantics (accepted for publication in 2013).

Interoperability

See section about Usability (WP25)

Gaps

Interoperability. Several interoperability gaps have been identified and classified in D25.1. In particular, the following domains have been investigated: 1) identification systems (for digital objects, authors and datasets); 2) library classification systems; 3) Library Linked Data; 4) metadata; 5) ontologies and vocabularies; 6) data provenance; 7) preservation tools; 8) exchange standards; 9) preservation frameworks; 10) semantic annotation services; 11) e-Science infrastructures.

Some of the identified gaps can have a strong impact on preservation strategies and activities. Some examples follow.

  1. Lack of cross-organization coordination in defining preservation metadata schemas to capture, maintain and share information about provenance, authenticity, preservation activity, technical environment, rights management and so on. This has led to many separate sets of metadata, each reflecting the particular needs and requirements of the community that authored it.
  2. Lack of a scalable infrastructure for the efficient planning and application of preservation strategies for large and heterogeneous data collections.
  3. Many different preservation suites and tools are in use in different communities (e.g. iRODS, LOCKSS). Their isolation from one another is an obstacle to inter-institutional preservation and interoperability.
  4. Although it is critical for e-science, the community is not aware of the loss of specificity that occurs when world models (ontologies, taxonomies, thesauri, controlled vocabularies) evolve over time.
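To make the loss of specificity concrete, the sketch below flags descriptions whose class has gained new, more specific subclasses in a later version of a taxonomy. It is a deliberately simplified illustration and not the method of the paper cited above; the taxonomy encoding (a child-to-parent dictionary), the function names and the sample classes are all assumptions made for this example.

```python
def children_of(parent_of: dict) -> dict:
    """Invert a child -> parent map into a parent -> set-of-children map."""
    kids = {}
    for child, parent in parent_of.items():
        kids.setdefault(parent, set()).add(child)
    return kids


def descriptions_to_review(descriptions: dict,
                           old_parent_of: dict,
                           new_parent_of: dict) -> dict:
    """Map each object to the subclasses newly added under its class.

    A non-empty result means the description was as specific as possible
    under the old taxonomy but may now be refined under the new one.
    """
    old_kids = children_of(old_parent_of)
    new_kids = children_of(new_parent_of)
    flagged = {}
    for obj, cls in descriptions.items():
        added = new_kids.get(cls, set()) - old_kids.get(cls, set())
        if added:
            flagged[obj] = sorted(added)
    return flagged


# Hypothetical taxonomy versions and object descriptions.
old = {"Camera": "Device"}
new = {"Camera": "Device", "DigitalCamera": "Camera", "FilmCamera": "Camera"}
described_as = {"photo-17": "Camera", "log-3": "Device"}
print(descriptions_to_review(described_as, old, new))
```

Here only `photo-17` is flagged: "Camera" was the most specific class available when the object was described, but the newer taxonomy offers finer choices that a curator would have to pick between.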