WP14: Common Testing Environments

Objectives

Collect together a set of environments to test the efficacy of tools (PreservationTools) and techniques (PreservationTechniques) for digital preservation, against changes in hardware, software, environment and knowledgebase of the Designated Communities, and design new ones if necessary.

Description of work and role of partners

It has been said that it is easy to make claims about digital preservation but very hard to provide evidence about any specific tools and techniques. It is probably reasonable to expect that each proposed preservation technique works well against certain types of digital objects or certain challenges; it is unlikely that there is a universal test. Besides preservation efficacy, one also needs to test for portability, interoperability, robustness and scalability.

We need testbed techniques which can tell us whether a proposed preservation technique works, and what its zone of effectiveness is with respect to types of objects.

The CASPAR project adopted what it called accelerated lifetime testing, simulating changes in hardware, software, environment and the knowledge base of designated communities. CASPAR techniques were applied in well-defined scenarios using many types of data from many disciplines, and it was claimed that this was solid evidence for the efficacy of the proposed CASPAR solution. Note that the CASPAR testbed is not a piece of software but rather a general approach within which other specific pieces of software can be tested.

Other projects have proposed different testbeds. For example, the Vienna testbed and the PLANETS testbed, which are closely related, are pieces of software which give prominence to significant properties. The SHAMAN Integration Subprojects (ISPs) are testbeds that set out to embed preservation features into production and reuse environments. In addition, commercial companies like Tessella have systems (in their case, SDB) that can operate at scale (e.g., migrating complex logical objects consisting of hundreds of thousands of files) and into which new tools can be plugged. This allows their customers to test tools and techniques on content held within their repositories that might not be able to be sent to an external testbed (e.g., for size or security reasons). At the level of bit preservation there are numerous digest techniques, and indeed digests of digests such as ACE [11].
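
As an illustration of the bit-preservation point, the following is a minimal sketch of a digest of digests, assuming SHA-256 and a small in-memory collection. It is not ACE's actual scheme; the function names (object_digests, summary_digest, changed_objects) and the sample object names are hypothetical and chosen only for this example.

```python
import hashlib
from typing import Dict, List


def object_digests(objects: Dict[str, bytes]) -> Dict[str, str]:
    """Return a SHA-256 hex digest for each named object."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in objects.items()}


def summary_digest(digests: Dict[str, str]) -> str:
    """Digest of digests: hash the (name, digest) pairs in sorted name order."""
    combined = hashlib.sha256()
    for name in sorted(digests):
        combined.update(name.encode("utf-8"))
        combined.update(bytes.fromhex(digests[name]))
    return combined.hexdigest()


def changed_objects(objects: Dict[str, bytes], recorded: Dict[str, str]) -> List[str]:
    """List the names whose current digest differs from the recorded one."""
    current = object_digests(objects)
    return sorted(name for name in current if current[name] != recorded.get(name))


# Hypothetical sample collection: record digests at ingest time.
collection = {"report.pdf": b"original report", "data.csv": b"a,b\n1,2\n"}
recorded = object_digests(collection)
recorded_summary = summary_digest(recorded)

# Simulate later bit-level damage to one object, then audit.
damaged = dict(collection)
damaged["data.csv"] = b"a,b\n1,3\n"
audit_summary = summary_digest(object_digests(damaged))
print(audit_summary == recorded_summary)   # a single comparison flags a problem
print(changed_objects(damaged, recorded))  # per-object digests locate it
```

The single summary value gives a cheap check over the whole collection, while the per-object digests are only consulted when that check fails.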

In this work package we will look across partners and beyond to identify candidate testing techniques. These will be classified and themselves tested against various types of data and scenarios; a number of open competitions will be organised to encourage a competitive spirit. Ultimately we aim to produce a collection of testbeds which will include testing procedures and test data, together with test software if appropriate, which can be used to provide a common measure for digital preservation techniques. We recognise of course that this collection of tests will not be perfect but we believe it will be possible to provide a benchmark.

At the very least the testing should provide evidence about effectiveness of the tools against changes over time in hardware, software, environment and changes in the knowledgebase of the Designated Communities.

Task 1410: Identification of testbed techniques and tools

This task collects together the various testbeds which are available. To help document the work in this task, the Wiki will be used to maintain our list of IdentifiedTestbeds.

Task 1420: Testbed suite

This task produces a TestbedSuite with associated TestbedProcedures. To facilitate this, partners will make their testbeds, procedures, test data and software available to other partners.