Open research data: Dat@UBFC
The dat@UBFC project proposes to set up a service for the management and valorisation of research data in Bourgogne-Franche-Comté. In the current international context of open research data, this service will help researchers manage their data, facilitate data sharing with the main data services via an interoperable indexing portal, increase the visibility and efficiency of local research, and preserve the regional digital scientific heritage.
New technologies and new computing equipment make it possible to produce data, whether observation data or numerical simulation results. These data are produced in ever-increasing amounts, and big data methods have been developed to collect and process such large volumes. However, for these data to reach their full potential, it is essential that they be protected, validated and shared. In recent years, the emergence of the “open data” movement has encouraged the opening of data with a view to their optimal reuse. Public data (data collected and funded by public organisations) are seen as a common good, a digital heritage whose dissemination is in the public and general interest. Scientific research data are part of this heritage. This is how the “open science” movement was born, with various international projects and portals.
For a structure like the COMUE UBFC, offering the scientific community a joint support service that promotes good management and valorisation of research data is currently a major challenge, and is one of the services to be rationalised and pooled at the regional scale.
The dat@UBFC project proposes to set up such a novel service at the regional scale, based on two existing projects: the OSU THETA dat@OSU project (cf. box below), which will serve to create a portal for research data valorisation, and the UBFC regional datacentre project, which will handle data processing and storage.
FAIR – Findability, Accessibility, Interoperability and Reusability
Since 2016, the implementation of the FAIR principles, endorsed by the international research community, has promoted the findability, accessibility, interoperability and reusability of research data by humans and machines. Obtaining FAIR data is made easier by data management plans (DMPs). These documents are written at the beginning of a research project and define what researchers will do with their data during and after the project; in particular, they specify how the data will be made available. They have been compulsory for European funding requests since 2017, and for ANR funding requests since 2019.
A support-to-research service
The objective of this support-to-research service is to guide researchers from the moment data are produced so as to:
- manage their data throughout their life cycle,
- facilitate data sharing,
- facilitate the reproduction and verification of data (scientific integrity),
- valorise research (when data and publications are cited, results are more visible),
- back up data and ensure their long-term preservation.
More specifically, it makes it possible to:
- meet the requirements of institutions and funders,
- offer better-quality research and new scientific perspectives,
- increase the visibility and efficiency of local research (economy of means, faster research progress),
- preserve the regional heritage of digital scientific data.
Launch of dat@UBFC
The Dat@BFC2 colloquium took place on 20–22 November 2019 in Dijon (Montmuzard campus).
The DataBFC2 colloquium “Open science: initiatives and projects in Bourgogne – Franche-Comté within the new European and international context of research data” aimed to provide an overview of the regional initiatives for the management, open access and valorisation of research data. The dat@UBFC project was officially launched on this occasion.
The proposed services will cover all the stages of the data life cycle, i.e.,
- planning: as soon as a research project is launched, the dat@UBFC team can help researchers write a data management plan (DMP).
- collection and analysis: these two steps can be carried out using the pooled computing resources made available by the scientific hub of the datacentre (computing servers, specific data processing tools, simulators, etc.).
- documentation: the service will help researchers describe their data. This description will be entered via the dat@UBFC portal.
- storage: the infrastructure of the regional datacentre will store the data described in the portal.
- conservation: long-term archiving of certain data will be carried out or organised by the datacentre.
- exposure: the dat@UBFC portal will allow users to search for and view research data from the whole COMUE, and will re-expose these data through other portals with which it will be interoperable (DataCite, Isidore, etc.).
- reusability: good research data management practices, applied as soon as a project starts or as data are created, will facilitate data reuse.
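As an illustration only, the life cycle listed above can be written down as an ordered sequence of stages, each paired with the part of the service that supports it. The sketch below is not the project's software: the names `Stage`, `SUPPORT` and `next_stage` are hypothetical, and the support labels simply paraphrase the list.

```python
from enum import Enum


class Stage(Enum):
    """Stages of the data life cycle, in the order given in the list above."""
    PLANNING = 1
    COLLECTION_AND_ANALYSIS = 2
    DOCUMENTATION = 3
    STORAGE = 4
    CONSERVATION = 5
    EXPOSURE = 6
    REUSE = 7


# Which part of the service supports each stage (paraphrased from the text;
# the wording of the labels is illustrative, not official).
SUPPORT = {
    Stage.PLANNING: "dat@UBFC team (help with the data management plan)",
    Stage.COLLECTION_AND_ANALYSIS: "datacentre scientific hub (pooled computing)",
    Stage.DOCUMENTATION: "dat@UBFC portal (data description)",
    Stage.STORAGE: "regional datacentre",
    Stage.CONSERVATION: "regional datacentre (long-term archiving)",
    Stage.EXPOSURE: "dat@UBFC portal (search and re-exposure to other portals)",
    Stage.REUSE: "good practices applied from the start of the project",
}


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the last one."""
    members = list(Stage)
    position = members.index(stage)
    return members[position + 1] if position + 1 < len(members) else None


if __name__ == "__main__":
    for stage in Stage:
        print(f"{stage.name.lower()}: {SUPPORT[stage]}")
```

Writing the stages as an ordered enumeration makes it explicit that every stage has a named source of support, which is the point the list makes in prose.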
The infrastructure relies on:
- a portal that will allow researchers not only to describe their data and make them available in open access, but also to store them.
- a digital environment offering the tools or services necessary for good data management, such as:
  - simulation and/or computing tools to generate or process the researchers’ data
  - services such as DOI allocation, data management plans (DMP OPIDoR) and help in choosing durable formats (FACILE from CINES)
  - long-term data storage, thanks to pooled storage services made available to researchers.
- training sessions, user guides and tutorials, to describe good data management practices and guide researchers on specific points.
The portal will be fed by all UBFC researchers, who will provide highly diverse data. To guarantee the quality, homogeneity and completeness of the data descriptions provided by researchers, it will be essential that all of them be validated by a data librarian before becoming visible on the portal. A team of data librarians specialised in scientific data processing and of computer science engineers will offer researchers support and advice through individual or group meetings, or training sessions (within doctoral schools, to raise future researchers’ awareness and encourage them to describe their data, and within continuing education). The team will also offer desk tools and legal advice, and will help researchers manage their data throughout their life cycle.
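The validation step described above (a description only becomes visible once a data librarian has validated it) can be sketched as a simple gate. This is a minimal illustration under stated assumptions, not the portal's actual data model: the list of required fields and the names `DescriptionSheet`, `validate` and `visible_sheets` are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical minimal set of fields a description must contain;
# the real portal's metadata schema is not specified in this article.
REQUIRED_FIELDS = ("title", "creators", "description", "discipline")


@dataclass
class DescriptionSheet:
    """A dataset description submitted by a researcher."""
    metadata: dict
    validated: bool = False

    def missing_fields(self):
        """Return the required fields that are absent or empty."""
        return [name for name in REQUIRED_FIELDS if not self.metadata.get(name)]


def validate(sheet):
    """Librarian step: mark the sheet as validated only if it is complete."""
    if sheet.missing_fields():
        return False
    sheet.validated = True
    return True


def visible_sheets(sheets):
    """Only validated sheets are shown on the portal."""
    return [sheet for sheet in sheets if sheet.validated]


if __name__ == "__main__":
    complete = DescriptionSheet({
        "title": "Example dataset",
        "creators": ["A. Researcher"],
        "description": "Illustrative entry",
        "discipline": "ecology",
    })
    incomplete = DescriptionSheet({"title": "Draft without creators"})
    for sheet in (complete, incomplete):
        validate(sheet)
    print(len(visible_sheets([complete, incomplete])))
```

In practice the librarian's review would also cover quality and homogeneity, which a completeness check like this one cannot capture.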
An existing base
The dat@OSU portal has been operational since April 2016. It will be reused to design the dat@UBFC portal: its IT structure (data coding and model) will be used to define the “foundations” of a portal reusable at the UBFC scale.
The portal “foundations” produced from dat@OSU will evolve to meet the needs at the UBFC scale. The resulting portal will be interoperable with other national and international portals, in the same way as the dat@OSU portal.
As for desk tools, they will be supported by the regional datacentre, which will pool and make available tools for data generation, processing, storage and archiving. Once developed and deployed, the solution will be hosted by the regional datacentre.
The dat@OSU project is led by the OSU THETA. It describes the research data from its federated research teams and laboratories, based on the standards used by the scientific community. It was launched in April 2016. Numerous descriptive sheets of datasets and databases are available, covering research in medicine, molecular spectroscopy, ecology, biodiversity, astronomy, chemistry, archeology, climatology, etc. The portal has an advanced search engine that allows users to search for and view data description sheets via a web interface open to all (researchers, students, decision-makers, funders, the general public, etc.), with results geolocated on a world map. Interoperability is operational with a number of other platforms.
Opening of the sandbox portal
This portal should enable researchers from all disciplines to test the tool and to report the needs that will guide the evolution of the dat@UBFC portal.
Researchers who wish to participate in the project can visit the portal and/or contact the project team: firstname.lastname@example.org.
The project team:
Sylvie DAMY, project leader (Chrono-environnement, UFC)
Vincent BOUDON (ICB – CNRS)
Françoise CHAMBEFORT (SCD UFC)
Bernard DEBRAY (UTINAM – CNRS)
Jocelyn LEVREY (DSI – UBFC)
Raphaël MELIOR (OSU THETA – CNRS)
Julien PERGAUD (Biogéosciences – CNRS)
Benjamin POHL (Biogéosciences – CNRS)
Francis RAOUL (Chrono-environnement – UFC)
Didier REBEIX (DNUM – UB)
Hélène TISSERAND (OSU THETA – CNRS)