BioDATEN at 3rd NFDI Community Workshop at LMU Munich

BioDATEN was present at the 3rd NFDI Community Workshop - Services for Research Data Management in Neuroscience - hosted at the Ludwig Maximilians University in Munich on 10th February. It brought in the perspective of a base-level service provider and its expertise in the NFDI forming process. Further service providers presented at the workshop, such as FIZ Karlsruhe and infrastructure providers in the context of the Human Brain Project. These talks were followed by lightning talks on a range of services, initiatives and data providers such as Fenix-RI, GIN, Cellular-resolution brain map, Zeiss research platform, NFDI4BMP, BRIDGE4NFDI, Helmholtz Metadata Services, RDM Services at Marburg University, LMU Munich University Library, Leibniz Rechenzentrum Munich, Gesellschaft für wissenschaftliche Datenverarbeitung Göttingen, and Max Planck Computing and Data Facility on the European Open Science Cloud.

The provider's perspective in the NFDI forming process was elaborated by the BioDATEN science data center. Providers of research IT infrastructures face significant technological changes, fostered especially by resource virtualization. Many modern services and workflows are operated in an increasingly cloud-like fashion, where data and computing move from independent, personal workstations to centralized, aggregated resources. Such shared resources allow new projects to be hosted faster. The necessary excess capacity is much easier to maintain and to justify in centralized resources because the shared overhead is typically much smaller than in independent systems. Grant providers are starting to understand the changed technological landscape and to adapt their funding schemes, allowing projects to buy into existing resources rather than establish separate ones for each new project. Users are faced with a difficult sizing challenge, as it is often impossible for them to define the “right” configuration for a required resource. These challenges are answered by the IT industry as well as by science-driven cloud and aggregated HPC offerings. Aggregating resources into larger infrastructures concentrates the increasing effort for market analysis, system selection, proper procurement and operation of (large-scale) IT infrastructures with a few experts. Furthermore, such a strategy would eliminate the conflict between typical project run times on the one hand and the (significant) delay for equipment provisioning and the usual write-down periods of that equipment on the other.

The massive changes in the IT landscape and in user expectations increase the pressure on university (scientific) compute centers to re-orient themselves. Cooperation in scientific cloud infrastructure is a chance for many compute centers to significantly widen their scope of IT services. It helps them to keep up with the demand of the scientific communities and to offer a fairly complete service portfolio. Organizationally, it allows for specialization and community focus. When defining future strategies and operation models, compute centers might find a new standing in supporting research data management by providing efficient infrastructure and consultation to the various scientific communities. Furthermore, it offers them the opportunity to participate in infrastructure calls. These developments enable researchers to offload non-domain-specific tasks and services onto infrastructure providers. Suitable governance structures need to be implemented to ensure the persistent relevance of compute centers through user participation and feedback loops. Close cooperation and consultation (as already practiced in Freiburg for the bwForCluster NEMO and for the storage infrastructure bwSFS) help all stakeholders to have suitable and up-to-date infrastructures tailored to their needs. Such structures are in their infancy for the NFDI, but future NFDI-wide coordination should advance this topic.

The financing of IT infrastructures for the various scientific communities is often grant-driven and inherently unsuitable for providing sustainable long-term services and research data management. The future should see the flow of funding change from a simple project-driven and organization-centered practice to demand-driven streams to different infrastructure and service providers. Large infrastructure initiatives like de.NBI or the NFDI need not only to resolve the question of personnel employment (permanent vs. project-based) but also to define suitable business and operation models compatible with the VAT regulations and the federal and state requirements for cash flows in mixed consortia.

nestor Access Workshop in Fulda

As a member institution of nestor, the University of Freiburg participated on behalf of BioDATEN in the nestor-internal two-day "Workshop on Access". The workshop is intended to foster the qualification of partners and to clarify questions on digital preservation infrastructures and access systems. Use cases and significant aspects of the user perspective were discussed along the following topics: architecture and conceptual design, aspects of the use of "virtual reading rooms", support and consultation needs on historical digital materials, and next-generation uses like big data and data mining scenarios. Furthermore, the OAIS offers a conceptual framework for receiving and storing objects differently from how they are presented, e.g. driven by retention periods. Standardization and legal aspects are topics to be taken into account as well.

Presentations to kick-start discussions and exchange were given by nestor partners such as the Leibniz Institute for German Language, the German Central Library for Medicine and the State Archive of North Rhine-Westphalia. Invited talks were delivered by Thomas Ledoux from the Bibliothèque nationale de France: "An experienced practitioner’s view on Access (library perspective)" and Nicola Wissbrock and Sarra Hamdi, The National Archives UK: "An experienced practitioner’s view on Access (archival perspective)". The second day covered topics like "Access Rights Information in the SLUB digital long-term archive" presented by the Saxon State and University Library Dresden (SLUB), which gave a couple of recommendations on how to deal pragmatically with copyright, usage and access statements for digital objects. The following talks covered "What is new? Changes in OAIS relating to Access" by Fernuniversität Hagen and "Chances and risks of dark archives" by the Technische Informationsbibliothek. The day was concluded by a talk on "Emulation as a Service" given by a member of the CiTAR and BioDATEN long-term access team from the University of Freiburg. A long-term access module will be included in the BioDATEN science gateway.

DataPLANT NFDI - a further step mastered and work ahead

BioDATEN and DAplus+ are jointly involved in the nationwide NFDI process with the DataPLANT consortium (in the area of Fundamental Plant Research). At the end of January, the DFG's review of the application and of the oral presentation in Bonn in early December (Link to the older message) arrived. The reviewers gave largely positive feedback, from which we deduce that DataPLANT is still in the group being considered for funding. We would like to thank our collaborators from the Galaxy team, especially Anika Erxleben, and from the Technical University in Kaiserslautern and the Jülich Research Center. The feedback addresses various areas of the proposal, such as "Maturity and relevance", "Research data management" and "Internal structure and sustainability". Besides very positive remarks, there are also some critical passages that should be addressed in the response.

Some snippets from the review: "The consortium is thematically closely focused on fundamental plant science research. This strong focus and the fact that, in addition to model plants, some cultivated plant species are also taken into account is seen as a clear strength of this consortium" ... "The planned implementation of processes for metadata validation seems to make sense and the claim to improve the quality of existing metadata instead of discarding it is commendable. The assessment of the quality of raw data is an interesting additional approach, which must be developed carefully in view of the interpretation of the results by the users. The quality of the measures already implemented and the level of those planned beyond them is very high. One of the planned measures is the establishment of 'data stewards' as flexible on-site assistants to be deployed, who are very demanding but also ... is very positive." ... "The consortium's efficient internal structure is impressive, consisting of the various stakeholder groups and the clearly defined bodies, including the 'Data stewards' and 'Data champions'. The topic 'efficiency and sustainability' is well addressed in the application."

The "Diversity" of the consortium lead, which was quite one-sided in the current application, needs to be improved in the further process. The consortium is now requested to respond to individual points of the reviewers' remarks in a three-page statement. In particular, the response will focus on points such as "How are the data stewards recruited, trained and meaningfully distributed institutionally and made available to the participants" or "Possibilities of generalising the workflow approach". The improvement of "diversity" should in particular be expressed in a broader governance structure, to be formulated jointly with the consortium. This will also incorporate initial experiences from the BioDATEN science data center.
