WP27: Scalability

Objectives

There are many aspects to the scalability of preservation systems. Scalability needs to address:

  • Total capacity (in TB)
  • Number of digital objects and size of each object (e.g. large video objects or small documents)
  • Distribution – how geographically dispersed the system is
  • Degree of sharing – the level at which the system supports multiple curators and multiple users, and the resulting concurrency requirements
  • Security in a multi-tenant environment which hosts data shared by different curators
  • Availability – are objects expected to be available at any time from anywhere?

The objective of this workpackage is to understand what the important scalability parameters are in preservation systems. Of particular importance is the development of preservation support services that can be shared by many data curators, which can lead to a lower-cost infrastructure.

Description of work and role of partners

Task 2710 Scalability of services

This task will review the scalability of the storage and other techniques used and needed by partners. These will be contrasted with the extremely scalable solutions that exist today in the form of Cloud Storage providers. A starting point can be Tessella’s work on measuring the scalability of the SDB solution and NARA’s ERA solutions. The scalability challenges identified in the Warwick Workshop report should also be addressed, for example dealing with hundreds of billions of objects and objects of many petabytes.

Task 2720 Recommendations about scalability

Evaluation and recommendations. The task will identify the most important scalability parameters and dimensions needed for the partners’ preservation systems. It will analyze the implications of attaining such levels of scalability for issues such as security and cost. Specific recommendations will be given, with clear guidelines for how tools and techniques can be incorporated into real environments and, in particular, the testing environments identified in workpackage 1400. Some of these tools can include external storage.