WP4 - Data Infrastructures


Objectives

  • Extend the CHAIN Knowledge Base with Data Infrastructures by collecting issues and best practices and by discovering data repositories that could be of interest to VRCs.
  • Support the study of data infrastructures for a few VRCs (e.g. Climate Change, HEP-SuperB, Genomics, etc.)
  • Promote trust building towards open scientific data infrastructures across the world regions, including organisational, operational and technical aspects with a strong liaison with WP3 and WP5 Activities.
  • Study the opportunities for data sharing across different eInfrastructures and continents, widening the scope of the existing CHAIN Knowledge Base to Data Infrastructures.
  • Provide proof-of-principle use cases for data sharing across continents.

 

Description and Tasks

 

Task 4.1 – Information gathering on Data Infrastructures and relevant Data Repositories - Leader: CIEMAT


The task will investigate the data used by existing and potential trans-continental communities, i.e. VRCs, both those previously identified in the CHAIN project and new ones that may have emerged since. This is of utmost importance because it will allow the different communities to avoid duplicated effort and to profit from other regions’ developments and data. In addition, this harmonisation will not be restricted to the regions targeted in this proposal but will also be carried out in cooperation with European initiatives, so that European advances can have a greater impact through the definition and proposal of data standards worldwide. The collected data will be structured, and the existing CHAIN Knowledge Base will be re-organised to present this new information in an interactive graphical and tabular format.

 

Task 4.2 – Analysis of existing and proposed Data Infrastructures - Leader: INFN


An analysis of the commonalities and differences of the available databases and repositories (format, range, granularity, etc.), their future challenges and the executed applications most closely related to them will be performed. The analysis will take into account the actions performed in WP3 in order to count on as many computational platforms as possible and will be based on:
  • the existing plans of relevant organisations, initiatives (e.g. identified VRCs, projects like EGI, PRACE, StratusLab) and/or committees (e.g. ESFRI), etc.;
  • the preliminary results of Task 4.1;
  • the feedback received during the thematic workshops and high-level conferences (together with WP2).

 

Task 4.3 – Use cases - Leader: CIEMAT


Based on the results gathered in Task 4.2, two use cases will be proposed. These use cases are expected to be steps towards a common format for the stored data, achieved by means of software developments that make the data compatible and ‘interoperable’; that is, data stored in different formats could become usable by a specific application or by several of them. The different actions developed in WP5 will be incorporated in order to enhance the impact of the use cases. The task will also cooperate with WP2 to disseminate these activities appropriately and, where possible, to reach out to other scientific groups and VRCs.