A recent study found that 80% of research data are lost within twenty years of the results being published in a journal (Vines et al. 2013). Well-curated datasets, however, are treasures and can lead to innovative research even decades after being recorded. To ensure that data remain relevant beyond the primary purpose of the original study, best practice recommends applying the FAIR principles (data should be Findable, Accessible, Interoperable and Reusable). The need for data curation has been recognized by many funding agencies (e.g. EU, DFG, BMBF, NSF). Grant proposals submitted to these agencies must include data management plans and provide for open access to data. GFZ published guidelines for handling research data in 2016. Although some communities at GFZ already manage their data according to the FAIR principles and these guidelines, implementing the guidelines remains a challenge for many research groups at GFZ and beyond. The Geo Data Node project, funded by the German Federal Ministry of Education and Research (BMBF), seeks to address these challenges and foster the application of the GFZ research data guidelines over the next two years. Through exchange with similar initiatives, the project will also reach out beyond GFZ.

The project builds on the synergy between communities with advanced data management practices and the Library and Information Services. GEOFON and the Geophysical Instrument Pool Potsdam (GIPP) work with highly standardized data and serve as data archives. The lessons learned by these communities are to be shared across GFZ. An update to the guidelines for communities where data management is already quite standardized and successful (e.g. Seismology, Geodesy, Rock Samples) is also planned. The project will advance in parallel along two main streams. The first work package will focus on case studies for the different thematic areas, including communities dealing with "long tail" data and communities with a high degree of standardization. The second work package will focus on developing standard data management plans for research data, coordinating among the different disciplinary areas at GFZ, and organizing outreach events to ensure coordination with the communities at national and international levels.

Library and Information Services

The Library and Information Services (LIS) hosts GFZ Data Services, a domain data repository, and offers support to scientists seeking to publish their data. Within the Geo Data Node project:

  • LIS will foster data publication for all research fields at GFZ.
  • LIS will also extend the use of other persistent identifiers, such as the International Geo Sample Number (IGSN).
  • LIS will seek to transfer the data management expertise of GEOFON and GIPP, initially within GFZ, but also externally.
  • Templates for data management plans will be evaluated and tailored to the needs of the geoscientific community.
  • LIS leads the project.

GEOFON

Specifically, the GEOFON team will contribute to the following tasks:

  • Develop data management templates for seismic datasets in the archive, covering the different use cases (permanent networks: own, GFZ and third party; temporary networks: instruments from GIPP, own instruments within GFZ and third party; data access rules: open, embargoed, restricted).
  • Develop Scientific Technical Report templates to complement the datasets in the archive.
  • Review the relations between existing persistent identifiers to improve the discovery of all resources related to a dataset.
  • Support GIPP in assigning persistent identifiers and work jointly towards an API that facilitates seismic metadata creation.
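
The use cases named in the first task can be read as a matrix of network type, data source and access rule. The following is a minimal sketch of that matrix, assuming that every combination would need its own template variant (the project description does not state this). The class, field and function names are hypothetical and not part of any GEOFON software.

```python
from dataclasses import dataclass
from itertools import product

# Categories copied from the task description; treating them as a full
# cross product is an assumption made for illustration only.
NETWORK_SOURCES = {
    "permanent": ["own", "GFZ", "third party"],
    "temporary": ["GIPP instruments", "own instruments within GFZ", "third party"],
}
ACCESS_RULES = ["open", "embargoed", "restricted"]


@dataclass(frozen=True)
class UseCase:
    """Hypothetical record describing one template variant."""
    network_type: str
    source: str
    access: str


def enumerate_use_cases() -> list[UseCase]:
    """List every (network type, source, access rule) combination."""
    cases = []
    for network_type, sources in NETWORK_SOURCES.items():
        for source, access in product(sources, ACCESS_RULES):
            cases.append(UseCase(network_type, source, access))
    return cases


use_cases = enumerate_use_cases()
print(len(use_cases))  # 2 network types x 3 sources x 3 access rules = 18
```

In practice some combinations may not occur, so the list is better seen as an upper bound on the number of template variants to consider.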

GIPP

The Geophysical Instrument Pool Potsdam (GIPP) provides seismic and electromagnetic instruments and sensors for joint projects with universities and other research facilities. As part of the instrument loan, users commit to providing a copy of any measured data to the GIPP. These raw data and the associated metadata are stored as assembled datasets in a data archive operated and managed by the GIPP. In addition, each assembled dataset is assigned a persistent identifier (DOI), which makes it citable as a data publication and accessible to others.

Within this project, the GIPP seeks to:

  • develop data management templates for the different components of the GIPP Archive
  • integrate datasets into the most uniform format possible
  • develop templates for Scientific Technical Reports to facilitate the archiving and publication of datasets acquired with GIPP instruments, as well as the generation or extraction of their metadata
  • support GEOFON, e.g. through real-time instrument calibration of the equipment used to generate the GEOFON data (via persistent identifiers for the instruments)
  • collaborate with the library to facilitate the publication of as many datasets of the GIPP Archive as possible