Blog

Metadata and integrity: the unlikely bedfellows of scholarly research

I was invited recently to present parliamentary evidence to the House of Commons Science and Technology Select Committee on the subject of Research Integrity. For those not familiar with the arcane workings of the British Parliamentary system, a Select Committee is essentially the place where governments, and government bodies, are held to account. So it was refreshing to be invited to a hearing that wasn’t about Brexit.

The interest of the British Parliament in the integrity of scientific research confirms just how far science’s ongoing “reproducibility crisis” has reached. The fact that a large proportion of the published literature cannot be reproduced is clearly problematic, and this call to action from MPs is very welcome. And why would the government not be interested? At stake is the process of how new knowledge is created, and how reliable that purported knowledge is.

The research nexus - better research through better metadata

Researchers are adopting new tools that create consistency and shareability in their experimental methods. Increasingly, these are viewed as key components in driving reproducibility and replicability. They provide transparency in reporting key methodological and analytical information. They are also used for sharing the artifacts that make up a processing trail for the results: data, material, analytical code, and related software on which the conclusions of the paper rely. Where expert feedback is also shared, such reviews further enrich this record. We capture these ideas and build on the notion introduced in the “article nexus” blog post with a new variation: “the research nexus.”

Data citations and the eLife story so far

When we set up the eLife journal in 2012, we knew datasets were an important component of research content and decided to give them prominence in a section entitled ‘Major datasets’ (see images below). This section lists the major datasets used in the work, both previously published and newly generated. We also strongly encourage data citations in the reference list.

How do you deposit data citations?

Please visit Crossref’s official Data & Software Citations Deposit Guide for deposit details.

Very carefully, one at a time? However you wish.

Last year, we introduced linking publication metadata to associated data and software when registering publisher content with Crossref (see “Linking Publications to Data and Software”). This blog post follows the “whats” and “whys” with the all-important “how(s)” for depositing data and software citations. We have made the process fairly straightforward: publishers deposit data and software links by adding them directly into the standard metadata deposit via relation type and/or references. This is part of the existing Content Registration process and requires no new workflows.

Linking Publications to Data and Software

TL;DR Crossref and DataCite provide a service to link publications and data. The easiest way for Crossref members to participate is to cite data using DataCite DOIs and to include those citations in the references within the metadata deposit. These data citations are automatically detected. Alternatively or additionally, Crossref members can deposit data citations (regardless of identifier) as a relation type in the metadata. Data and software citations from both methods are freely propagated.
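As a rough illustration of the two routes just described, the sketch below builds the two kinds of metadata fragment with Python's standard library. The element and attribute names (`citation`, `doi`, `rel:program`, `rel:related_item`, `rel:inter_work_relation`, `relationship-type`, `identifier-type`), the relations namespace, and the `isSupplementedBy` relationship reflect our reading of the Crossref deposit schema and should be checked against the Data & Software Citations Deposit Guide before use. The helper names and the DOI `10.5555/example-dataset` are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for Crossref's relations schema; verify against the
# current deposit schema before relying on it.
REL_NS = "http://www.crossref.org/relations.xsd"
ET.register_namespace("rel", REL_NS)


def data_reference(key: str, dataset_doi: str) -> ET.Element:
    """Route 1: cite the dataset as an entry in the reference list."""
    citation = ET.Element("citation", {"key": key})
    ET.SubElement(citation, "doi").text = dataset_doi
    return citation


def data_relation(
    identifier: str,
    identifier_type: str = "doi",
    relationship: str = "isSupplementedBy",
) -> ET.Element:
    """Route 2: assert the link as a relation, whatever the identifier type."""
    program = ET.Element(f"{{{REL_NS}}}program", {"name": "relations"})
    item = ET.SubElement(program, f"{{{REL_NS}}}related_item")
    relation = ET.SubElement(
        item,
        f"{{{REL_NS}}}inter_work_relation",
        {"relationship-type": relationship, "identifier-type": identifier_type},
    )
    relation.text = identifier
    return program


if __name__ == "__main__":
    # Hypothetical dataset DOI, used only to show the shape of the output.
    doi = "10.5555/example-dataset"
    print(ET.tostring(data_reference("ref1", doi), encoding="unicode"))
    print(ET.tostring(data_relation(doi), encoding="unicode"))
```

Either fragment would sit inside the article's existing deposit record, which is why no separate workflow is needed; a publisher could emit both for the same dataset.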

The article nexus: linking publications to associated research outputs

Crossref began its service by linking publications to other publications via references. Today, this extends to relationships with associated entities. People (authors, reviewers, editors, other collaborators), funders, and research affiliations are important players in this story. Other metadata also figure prominently in it: references, licenses and access indicators, publication history (updates, revisions, corrections, retractions, publication dates), clinical trial and study information, etc. The list goes on. What is less well known (and less utilized) is that Crossref is increasingly linking publications to associated scholarly artifacts.

Crossref to Auto-Update ORCID Records

In the next few weeks, authors with an ORCID iD will be able to have Crossref automatically push information about their published work to their ORCID record. It’s something that ORCID users have been asking for and we’re pleased to be the first to develop the integration. 230 publishers already include ORCID iDs in their metadata deposits with us, and currently there are 248,000 DOIs that include ORCID iDs.