About the project

The growing multitude of digital resources forms the basis of the intellectual capital of European research. Mining further information from these resources and allowing new generations of researchers to “stand on the shoulders of giants” is the very essence of research. These digital resources must persist and remain findable, accessible, and understandable. Data re-use (by users in a different discipline, for example) may happen as soon as the data is produced or may not happen for an extended period of time. The same techniques for preservation of data assets support contemporaneous (re-)users as well as the interests of future generations.

Why do we need PARSE.Insight?

There is a very real risk that much of the scientific data and documentation that exists may be lost to future generations unless permanent access is secured. In PARSE.Insight we focus on the infrastructure needed to support persistence and understandability of these key assets over the long term. As noted in the Open Archival Information System (OAIS) Reference Model (ISO 14721), when one talks about long term preservation, long term “is long enough to be concerned with the impacts of changing technologies, including support for new media and data formats, or with a changing user community”, which could be just a few years.

The advent of e-Science has deeply modified the research process. The century-old cycle of reading and writing scientific publications as the only medium of scientific exchange has evolved into a multitude of digital resources which form the intellectual capital of European research. These new opportunities are fostering multi-disciplinarity and accelerating the life-cycle of research, enabling the fast re-use of information crucial to scientific investigation. At the same time, while we can still read articles written centuries ago (and some branches of mathematics still do), most scientific disciplines are effectively risking the entire capital of European research: no coherent or concrete efforts are being made to preserve the digital records of European science. There is a real risk that our scientific records will not be findable, accessible and understandable over the medium and long term or, in some cases, even the short term. European science therefore risks impairing its competitiveness, as there might be no proverbial (digital) “shoulders of giants” to stand on.

Aim of project

PARSE.Insight aims to highlight the longevity and vulnerability of digital research data and concentrates on the parts of the e-Science infrastructure needed to support persistence and understandability of the digital assets of EU research.

Description of work

Much work needs to be done and is being done at various levels, but there is no unified roadmap covering the entire range of these actions. We aim to produce such a roadmap, bringing together national, European and global thinking. A detailed inventory of existing and planned preservation support will be gathered. Comparing the roadmap to the inventory will allow us to identify the gaps in the research arena that the Infrastructures programme can help address. However, permanent access raises new challenges which need new ways of analysing potential impact; we will produce requirements for such an impact analysis and a tool to support it. Finally, we need to ensure that everybody knows where best practices exist, so we propose to establish benchmarks for the performance of repositories.

Work flow

The work is divided into a number of phases. Several work packages are active within each phase, but one activity is the main focus of each period.

  • The first phase is preparatory: it includes getting appropriate staff in place, producing the draft roadmap and identifying the targets for the survey and consultations.
  • Next comes the information gathering phase, which involves the main portions of the survey and case studies, as well as bringing together ideas about the impact metrics.
  • The third phase is analysis, which combines the impact analysis, the analysis leading to the refinement of the roadmap, and the initial gap analysis.
  • In the fourth phase the main focus is on completing the gap analysis. It also includes the specification of the impact analysis tool.
  • Finally, the focus is on testing the impact analysis. In this period the final versions of the other reports are produced.

March 2008: Start of PARSE.Insight
March to May 2008: Preparatory
June 2008 to February 2009: Information gathering
December 2008 to May 2009: Analysis
June to August 2009: Gap analysis
September 2009 to February 2010: Impact analysis
End of February 2010: End of PARSE.Insight, publication of the final reports