bwSFS - Storage for Science and services to become available soon

The storage system bwSFS (Storage-for-Science) forms the geo-redundant, distributed technical backbone for basic storage services, research data management (RDM) and data sharing for the BinAC and NEMO scientific communities. It contributes to the storage infrastructure and the planned data federation within the state of Baden-Wuerttemberg. The central storage components of bwSFS are located at the Tübingen and Freiburg computer centers. The total capacity of the combined object storage and file system will be roughly 20 PByte in its initial configuration. To support research data management, InvenioRDM, which already includes a convenient user interface and an OAI-PMH interface, has been chosen within bwSFS. All central institutions and projects involved in the RDM process were included in this decision at an early stage. In Freiburg, a GitLab instance will be used for versioning, collaboration and sharing of data from ongoing projects. Established services of the university libraries will be used for DOI allocation in Invenio, and ORCID will be included for persistent identification of researchers. In this way, resources for RDM will be pooled to improve support for the disciplines in implementing specific RDM requirements and to ensure better advice for researchers. To enforce the FAIR and Open Access principles, data management plans (DMPs) will become the basis for standardization, with guidelines for metadata management, archiving and licensing models. To reasonably manage the intended broad user base of the system (besides BioDATEN, the local communities of the participating universities are served as well) and to achieve seamless integration into the envisioned BioDATEN services, federated management of project, user and group data is required. During the implementation phase of the software and services, which involves various bioinformatics groups, it has become apparent that the existing methods for identity management are not sufficient.
Compared to HPC services, storage services require a much deeper integration of existing infrastructures and more flexible user management, which should also include persistent identity systems like ORCID.

Organizational challenges - discussing a code of conduct for best practices in RDM

The long-term nature of research data poses new challenges, which need to be addressed at the various institutional layers. The BioDATEN project supports the debate on and creation of suitable policies within the partners' research institutions. The various stakeholders involved in research and data management frequently change workplaces throughout their careers. They might switch organizations multiple times before acquiring a permanent post or moving on. The people who created, collected, and processed data may have long since left the research institution. Projects may also have been officially finished a long time ago. Many projects also face questions about copyright, handling of personal data, and appropriate licenses for research data and related proprietary software. Responsibility for the data (and software) must therefore be exercised in a regulated manner at every point in the data life cycle by appropriate entities. During ongoing research, compliant storage and processing in the context of the respective project need to be considered. Compliance must also be ensured later in the specific repositories of the BioDATEN communities, university publication services or dark archives. The potentially indefinite storage of data after project completion comes with responsibilities whose assignment must be clarified. Part of the handling can be derived from data management plans; other parts could be defined in the respective research data policies of the scientific institution. All considerations here are based on the DFG's "Guidelines for Safeguarding Good Scientific Practice". In coordination with the Science Data Center steering group and the research data management working group, BioDATEN created guidelines for the responsible handling of research data (Leitfaden).
This document was approved by the ALWR-BW and will also be discussed in the library directors' working group.

Paper on ORCID and ROR ID recommendation published

In a strongly networked and highly collaborative scientific field such as bioinformatics, the goal is to jointly use and federate services for data management. With the goal of well-acknowledged data publication in mind, a persistent link between research objects and persons needs to be established and maintained. Of central importance for this objective are persistent identifiers of researchers. Because of the high turnover within this group of individuals, agreement among all stakeholders in the science enterprise on a uniform, internationally recognized, and cross-institutional system would be a considerable advantage, since changing workplaces between institutions would no longer require changes in the database. Such an internationally recognized identifier should be stable and unique for individuals. In a further step, BioDATEN has urged the SDC and Baden-Wuerttemberg RDM working groups to join the Memorandum of Understanding (MoU) of DINI and made a recommendation for ORCID. In this context, it is advised to identify persons in research information systems, repositories and research data management via the ORCID ID, and repositories via re3data (see the article at "Bausteine FDM"). The ORCID ID already has a high degree of visibility and acceptance in the bioinformatics community. The recommendation was favorably received by both the ALWR-BW and the Working Group of Library Directors in the state.
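For services that store ORCID IDs, a basic plausibility check can catch typing errors before an identifier is written to a database. The following is a minimal sketch, not part of the BioDATEN recommendation: it assumes the published ORCID iD format (16 characters in four hyphen-separated groups, with the last character being an ISO 7064 MOD 11-2 check character). The function names `orcid_check_char` and `is_valid_orcid` are hypothetical and chosen only for illustration. The check verifies the format and checksum only; it does not confirm that the identifier is registered or belongs to a particular person.

```python
import re

# Assumed format: four groups of four characters, last character digit or "X".
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string has the ORCID iD format and a matching checksum."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_char(digits[:15]) == digits[15]


if __name__ == "__main__":
    # 0000-0002-1825-0097 is the sample iD used in ORCID's own documentation.
    for candidate in ("0000-0002-1825-0097", "0000-0002-1825-0098"):
        print(candidate, is_valid_orcid(candidate))
```

A check like this would typically run at data entry, while authoritative confirmation of an identifier would still go through ORCID itself.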

Sponsored by:

  • Ministerium für Wissenschaft, Forschung und Kunst Baden-Württemberg
