Proceedings of the 2021 E-Science-Days with BioDATEN contribution available

Research data management (RDM) and open data are important topics that will receive further attention in the time to come. One goal of BioDATEN is to take part in RDM-related processes and to shape RDM for the bioinformatics community. The recently published proceedings of the E-Science-Days 2021 give the RDM community an overview of current trends and include a BioDATEN contribution about science gateways.

Research data management is not just “Upload your data to this repository” or “Use CC BY for your data” but rather a broad range of topics. Events like the E-Science-Days help the RDM community to get an overview of current approaches, topics, and developments as well as to network with other community members. The E-Science-Days 2021 proceedings cover the complete range of topics, such as software to build a data repository, new training approaches, legal aspects of data sharing, infrastructure, and the NFDI consortia. In addition to BioDATEN, the Science Data Centres BERD, MoMaF, and SDC4Lit submitted contributions. The (award-winning) BioDATEN work on a science gateway was also published, as were an overview of bwSFS, whose infrastructure is also used by BioDATEN, and some papers covering the NFDI4Plants approach.


Working on organizational aspects to make InvenioRDM productive

In the early stages of the SDC BioDATEN, various software frameworks for data publication were reviewed, and a list of selection criteria was developed together with the other three SDCs in Baden-Württemberg. The review process resulted in the selection of InvenioRDM as the framework to be used for BioDATEN as well as for the NFDI DataPLANT.

InvenioRDM is still evolving and has progressed well over the last few months. The framework has been developed further, and the University of Freiburg is now, like the University of Tübingen, an official partner in the InvenioRDM community.

To make data citable and to provide long-term access to research data, persistent identifiers such as DOIs are necessary. To push and facilitate DOI allocation for the BioDATEN community, a fruitful exchange with the University Library Freiburg has been started in order to benefit from their experience in handling DOIs via DataCite (TIB/Hanover).

DOIs allow for prefixes that, instead of being just a numerical ID, convey semantic content such as the issuing institution. To make it easier to transfer DOIs from one institution or publication system to another, e.g. to ensure long-term access, we opted for bare numerical IDs. This avoids the confusion that could arise from a DOI prefix issued at Tübingen that now resolves to Freiburg.

At the moment, formal agreements are being prepared, based on the University of Freiburg's new research data policy, which also promises long-term availability of the assigned DOIs. The eScience team in Tübingen, for example, has already started thinking about how this can be ensured. One option would be a statically generated HTML page that meets the basic requirements of DataCite if the original service can no longer be continued. Sustainability has to be agreed between the institutions involved, such as the computer center and the university library. In Freiburg, this is currently happening through various discussions involving the relevant stakeholders.
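The static fallback page mentioned above could look roughly like the following sketch. It is not the team's implementation. The record fields, the placeholder DOI, and the choice of which properties to display are assumptions made for illustration; the text does not spell out DataCite's basic requirements.

```python
from html import escape

# Hypothetical record; keys and values are illustrative placeholders only.
EXAMPLE_RECORD = {
    "doi": "10.1234/567890",
    "title": "Example dataset",
    "creators": ["Doe, Jane", "Roe, Richard"],
    "publisher": "Example University",
    "publication_year": 2021,
}


def render_landing_page(record: dict) -> str:
    """Return a self-contained HTML landing page for one published record."""
    # Escape every value so that metadata cannot inject markup into the page.
    doi = escape(record["doi"])
    title = escape(record["title"])
    creators = escape("; ".join(record["creators"]))
    publisher = escape(record["publisher"])
    year = escape(str(record["publication_year"]))
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{title}</title></head>\n'
        "<body>\n"
        f"<h1>{title}</h1>\n"
        f'<p>DOI: <a href="https://doi.org/{doi}">{doi}</a></p>\n'
        f"<p>Creators: {creators}</p>\n"
        f"<p>Publisher: {publisher} ({year})</p>\n"
        "</body>\n"
        "</html>\n"
    )
```

A page generated this way has no runtime dependencies, so any plain web server could keep it available after the original service has been retired.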

Additionally, we are collecting various action items to improve usability for users, such as:

  • Communities for data publication and classification
  • Workflows for quality assurance and final approval 
  • Formal agreement for data publication

The end-user agreement has to be set up together with the community to reflect their needs. The bare minimum requirement for DOI assignment is descriptive metadata according to the DataCite specifications. In this context, using ORCIDs as persistent identifiers for individuals is seen as a fundamental concept.
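As a rough illustration of such a minimum check, the sketch below tests a record for the six mandatory properties of the DataCite Metadata Schema (identifier, creator, title, publisher, publication year, resource type) and validates an ORCID iD through its ISO 7064 MOD 11-2 check digit. The flat dictionary layout and the key names are assumptions for this example, not the DataCite or InvenioRDM data model.

```python
import re

# Hypothetical flat key names standing in for DataCite's mandatory properties.
REQUIRED_FIELDS = (
    "doi",
    "creators",
    "title",
    "publisher",
    "publication_year",
    "resource_type",
)

ORCID_PATTERN = re.compile(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]")


def missing_fields(record: dict) -> list[str]:
    """Return the mandatory fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


def orcid_is_valid(orcid: str) -> bool:
    """Check the format and the ISO 7064 MOD 11-2 check digit of an ORCID iD."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for char in digits[:-1]:  # the last character is the check digit itself
        total = (total + int(char)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[-1] == expected
```

For example, `orcid_is_valid("0000-0002-1825-0097")` accepts the sample iD used in ORCID's own documentation, while changing its last digit makes the check fail.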

Descriptive metadata mainly focus on the "Who" of research data, while scientific metadata focus on the "What and How" of research data. How these scientific metadata will be stored and how research data will be packaged, e.g. in containers such as ZIP, BagIt or OCFL, needs to be clarified in further discussions.
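To give an impression of one of the packaging options named above, the following minimal sketch builds a bag with the basic layout described in the BagIt specification (RFC 8493): a `bagit.txt` declaration, a `data/` payload directory, and a SHA-256 payload manifest. The packaging decision is still open, so this only illustrates the format; the function name and arguments are hypothetical, and optional elements such as `bag-info.txt` or tag manifests are left out.

```python
import hashlib
import shutil
from pathlib import Path


def make_bag(source_dir: Path, bag_dir: Path) -> None:
    """Copy source_dir into a new, minimal BagIt bag at bag_dir."""
    data_dir = bag_dir / "data"
    # copytree creates bag_dir and data/; it fails if data/ already exists.
    shutil.copytree(source_dir, data_dir)

    # Bag declaration with version and tag file encoding.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # Payload manifest: one "<checksum>  <path relative to the bag>" per file.
    lines = []
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        lines.append(f"{digest}  {path.relative_to(bag_dir).as_posix()}\n")
    (bag_dir / "manifest-sha256.txt").write_text("".join(lines), encoding="utf-8")
```

The resulting directory could then be zipped for transfer, and a recipient could verify the payload by recomputing the checksums listed in the manifest.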

DFG sharpens the requirements for handling research data and makes statements about it mandatory

The DFG sharpens and refines the requirements for handling research data and makes statements about it mandatory for project proposals.

The DFG states in its latest announcement that statements about the handling of research data are mandatory in project proposals and highlights the relevance of good project planning and the inclusion of specialised skills and structures. Together with the scientific communities, the DFG has sharpened and refined the requirements, which are now part of project proposals. Statements should be guided by a collection of questions. Furthermore, statements about research data management will get more attention in the review process.

The DFG offers support during the proposal writing process and is in contact with supporting structures in research organisations. The DFG also provides special funding opportunities for building and expanding research data infrastructures and methodologies. In addition, costs for data processing with the aim of re-usability can be included in project proposals.

The DFG also encourages applicants to include research output regarding re-use of research data or research data handling in their CVs. 

