Isabel has been manager of DIGITAL.CSIC, the Open Access repository of the Spanish National Research Council (CSIC), since 2010. She has a background in library science, international relations, economics and history, currently participates in the EOSC SYNERGY Project, and since March 2020 has been a member of the FAIRsFAIR European Group of FAIR Champions. We interviewed her to understand how CSIC is enabling open, reproducible science across multidisciplinary repositories.
Q1. CSIC has the mandate to organise, preserve, and provide open access to research outputs and has recently released its Open Access Mandate Monitor. How is the Council facilitating Open Science practices within the Spanish research framework?
A. When CSIC issued an institutional open access mandate on April 1, 2019, there was already a solid basis for further developing its commitment to Open Science. After signing the Berlin Declaration in 2006, CSIC kicked off two major initiatives. DIGITAL.CSIC has been collecting and organising institutional research outputs since 2008 and, for the same length of time, the CSIC Publishing Department has maintained a growing collection of diamond OA journals, the so-called “Revistas CSIC”.
CSIC’s institutional mandate falls within the category of green OA mandates. As such, researchers must provide open access to peer-reviewed publications and underlying datasets in DIGITAL.CSIC.
While the mandate does not impose a maximum embargo period for making full-text files available as open access, their metadata must be deposited as soon as the works are accepted for publication. The underlying datasets need to comply with the FAIR Principles, and CSIC researchers are required to make them open access unless legitimate grounds such as confidentiality, intellectual property, or security apply.
The mandate highly recommends assigning standard open licenses such as Creative Commons and Open Data Commons to such datasets.
Importantly, the mandate will play a role in the annual institutional evaluation process which assesses scientific productivity in all CSIC centers and institutes. For this reason, CSIC offers researchers a wide range of services to help them comply with it. These include:
- The Delegated Archiving Service – used in 90% of cases and whereby the repository’s Technical Office together with CSIC Libraries Network uploads records into DIGITAL.CSIC on researchers’ behalf.
- The Pasarela – a home-grown infrastructure that connects institutional CRIS conCIENCIA with DIGITAL.CSIC and allows for the bulk retrieval and enrichment of metadata and associated full text files and export into the institutional repository.
- Training workshops – delivered by DIGITAL.CSIC Technical Office to the institutional community.
- Advanced FAIR-compliance features – DOI assignment, support for the OAI-DataCite format, SCHOLIX standard enablement.
Preliminary analysis shows that nearly 6,000 journal articles published in 2019 and deposited in DIGITAL.CSIC are available open access, while around 1,000 journal articles are subject to a publisher embargo period and another 2,400 are still behind a paywall. In addition, some 100 research datasets from 2019 are already available in DIGITAL.CSIC, mostly open access.
We expect this positive trend to continue, not least because the CSIC Unit of Scientific Information Resources for Research, previously known as the CSIC Libraries Network Coordination Unit, manages an institutional fund to support publication in selected Gold Open Access publications and is negotiating Read and Publish agreements with some subscription publishers. More information.
Q2. DIGITAL.CSIC is developing FAIR evaluator software which checks the FAIRness of digital objects via plugins. It leverages PID, licensing, metadata, and relationships and was one of a series of digital object and data repository assessment and certification tools presented to the European Commission on 26 November. What are in your opinion the main benefits and weaknesses of such tools? Can you cite other examples and use cases?
A. FAIR Evaluator is intended not only to show how FAIR your data is, but also to guide you as to how it can be improved. Although this tool was originally designed to interact with DIGITAL.CSIC only, it can be generalised to work with any other repository.
Within the EOSC SYNERGY project we’ve tested a couple of automated tools to evaluate the FAIRness of our stored datasets. Given our requirement to integrate these tools in a pipeline of automated processes that support quality assessment services for research data and software, open source is a necessity.
These are the highlights from our findings documented in the report EOSC-SYNERGY. D3.3 Intermediate report on technical framework for FAIR principles implementation:
- Tools should serve a practical purpose that goes well beyond mere scoring/validation and offers practical feedback to creators and administrators alike.
- Assessment indicators must be clearly defined since the FAIR criteria used directly affect the results obtained.
- Not all FAIR assessment tools under development are fit for all cases. There is a broad variety of data repository types (e.g. thematic, institutional, publisher-driven, international), and software. The diversity of formats, standards, data and metadata types means it may be relatively easy to have general indicators of the FAIRness of digital objects, but harder to achieve specificity given that some indicators apply at a disciplinary level.
- With regard to valid data repositories, I would highlight COAR concerns around criteria proposed by Data Repository Selection: Criteria that Matter. Those appear to be too narrowly defined in some instances and to exclude important considerations in others. Researchers and funders should decide the best location for data deposit. Today, there is a growing network of institutional and generalist repositories that have enabled all relevant standards and functionalities to manage research data and they do play a significant role in the building of a global system of open research data infrastructures. CSIC has recently joined a related COAR initiative, Input to “Data Repository Selection: Criteria that Matter” – COAR (coar-repositories.org).
I would also mention that having a fully granular/discipline metadata schema in place does not necessarily mean that research data are richly described. A recent DataCite Metadata Working Group analysis examined the richness of the metadata in datasets found in repositories that mint DOIs through DataCite. The analysis found that most optional/recommended metadata elements are not very broadly used by the community in practice. See blog post.
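To make the idea of plugin-based checks that give feedback rather than a bare score more concrete, here is a minimal Python sketch. It is not the FAIR Evaluator's actual code: the record layout, field names, check functions, the example DOI, and the equal weighting of checks are all assumptions chosen for illustration, loosely following the four aspects mentioned above (PID, licensing, metadata, relationships).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical record shape: a flat dict of metadata fields.
Record = Dict[str, object]


@dataclass
class CheckResult:
    indicator: str
    passed: bool
    feedback: str  # advice reported to the depositor when the check fails


# In this sketch a "plugin" is simply a function from a record to a result.
Check = Callable[[Record], CheckResult]

# Assumed set of recommended (non-mandatory) fields, for illustration only.
RECOMMENDED_FIELDS = ("description", "subjects", "contributors")


def check_pid(record: Record) -> CheckResult:
    pid = str(record.get("identifier", ""))
    ok = pid.startswith(("https://doi.org/", "http://hdl.handle.net/"))
    return CheckResult("pid", ok, "Assign a resolvable persistent identifier.")


def check_license(record: Record) -> CheckResult:
    ok = bool(record.get("license"))
    return CheckResult("license", ok, "Attach a standard open license.")


def check_metadata(record: Record) -> CheckResult:
    missing = [f for f in RECOMMENDED_FIELDS if not record.get(f)]
    advice = "Add recommended metadata: " + ", ".join(missing) + "."
    return CheckResult("metadata", not missing, advice)


def check_relations(record: Record) -> CheckResult:
    ok = bool(record.get("related_identifiers"))
    return CheckResult("relations", ok, "Link the dataset to related outputs.")


DEFAULT_CHECKS: List[Check] = [
    check_pid, check_license, check_metadata, check_relations,
]


def evaluate(record: Record, checks: List[Check]) -> Tuple[float, List[str]]:
    """Return the fraction of checks passed plus advice for each failed check."""
    results = [check(record) for check in checks]
    score = sum(r.passed for r in results) / len(results)
    advice = [r.feedback for r in results if not r.passed]
    return score, advice


# Illustrative record; the DOI is a made-up placeholder.
record = {
    "identifier": "https://doi.org/10.1234/example",
    "license": "CC-BY-4.0",
    "description": "Sample dataset",
    "related_identifiers": [],
}
score, advice = evaluate(record, DEFAULT_CHECKS)
print(score)
for line in advice:
    print(line)
```

Because each check is an independent function, a repository could swap in discipline-specific indicators without touching the rest of the pipeline, which is one way to address the point that general indicators are easier to define than disciplinary ones.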
Q3. Open Access and Reproducible Science is at the core of CSIC’s current activity and your work focuses on advancing FAIR data in a multidisciplinary and increasingly interdisciplinary environment. How should the FAIRness and reproducibility of data be addressed when dealing with multidisciplinary repositories?
A. The current situation is complex and multifaceted. Research communities that are very knowledgeable about research data management within an Open Science framework coexist with others just getting to grips with the basics of curating digital objects. In DIGITAL.CSIC these disparities can be observed, as richly described collections of datasets live together with other collections that only carry mandatory properties in their metadata.
Researchers still very much request further awareness raising and hands-on training so that they can actively participate in EOSC. There are still important gaps across all researcher communities in understanding issues such as copyright management and license selection. Also, we witness a rapidly changing universe of Open Science tools that are only very partially known and used by data creators, data curators, and data consumers. As a result, integrating reproducible science practices into our workflows takes some time and is a gradual process. Greater rewards for reproducible research practices would certainly help.
Multidisciplinary repositories need to account for all of this diversity. As a result, it is essential to have at hand easily retrievable, open, up-to-date, reliable and sustainable catalogues of metadata schemes and crosswalks, controlled vocabularies and ontologies, as well as good, reproducible research practices combined with inspiring cases of implementation, in order to properly serve the specific needs of researchers in every single discipline and support their open science activities.
Isabel has been manager of DIGITAL.CSIC, the Open Access repository of the Spanish National Research Council (CSIC), since 2010. She has a background in library science, international relations, economics and history. Before joining DIGITAL.CSIC, she worked in EIFL knowledge sharing, negotiations with publishers and library consortium building programmes. She also worked in the European Commission on ICT applications in libraries, museums, and archives, and in academic affairs. Isabel’s fields of expertise include Open Science (with a particular focus on repositories, research data management, copyright and other legal issues) as well as access to knowledge in developing and transitioning countries. She currently participates in the EOSC SYNERGY Project.