Blog

Drawing on the Research Nexus with Policy documents: Overton’s use of Crossref API

Update 2024-07-01: This post is based on an interview with Euan Adie, founder and director of Overton. What is Overton? Overton is a big database of government policy documents, also including sources like intergovernmental organizations, think tanks, big NGOs, and in general anyone who’s trying to influence a government policy maker. What we’re interested in is basically taking all the good parts of the scholarly record and applying some of that to the policy world.

2024 public data file now available, featuring new experimental formats

This year’s public data file is now available, featuring over 156 million metadata records deposited with Crossref through the end of April 2024 from over 19,000 members. A full breakdown of Crossref metadata statistics is available here. Like last year, you can download all of these records in one go via Academic Torrents or directly from Amazon S3 via the “requester pays” method. Download the file: The torrent download can be initiated here.

Subject codes, incomplete and unreliable, have got to go

Patrick Polischuk – 2024 March 13

In Metadata, APIs

Subject classifications have been available via the REST API for many years but have not been complete or reliable from the start and will soon be deprecated. The subject metadata element was born out of a Labs experiment intended to enrich the metadata returned via Crossref Metadata Search with All Science Journal Classification (ASJC) codes from Scopus. This feature was developed when the REST API was still fairly new, and we now recognize that the initial implementation worked its way into the service prematurely.
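For anyone whose code still reads subject classifications, a defensive read is one way to prepare for the deprecation. The sketch below is a minimal illustration, not Crossref's own guidance: it assumes a work record has already been parsed into a Python dictionary and that `subject`, when present, is a list of strings. The function name `get_subjects` and the sample records are hypothetical.

```python
from typing import Any


def get_subjects(work: dict[str, Any]) -> list[str]:
    """Return the subject labels of a work record, or [] if there are none.

    Assumption: `subject`, when present, is a list of strings. Anything
    else (missing key, null, unexpected type) is treated as "no subjects".
    """
    subjects = work.get("subject")
    if not isinstance(subjects, list):
        return []
    return [s for s in subjects if isinstance(s, str)]


# Hypothetical records, for illustration only.
with_subject = {"DOI": "10.5555/example-1", "subject": ["General Medicine"]}
without_subject = {"DOI": "10.5555/example-2"}

print(get_subjects(with_subject))
print(get_subjects(without_subject))
```

Treating a missing field the same as an empty one means downstream code keeps working whether or not the element is still returned.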

Metadata Retrieval

Analyse Crossref metadata to inform and understand research

Crossref is the sustainable source of community-owned scholarly metadata and is relied upon by thousands of systems across the research ecosystem and the globe.

[Figure: Some of the typical users (outer) and uses (inner) of Crossref metadata]

People using Crossref metadata need it for all sorts of reasons, including metaresearch (researchers studying research itself, such as through bibliometric analyses), publishing trends (such as finding works from an individual author or reviewer), or incorporation into specific databases (such as for discovery and search or in subject-specific repositories), and many more detailed use cases.

Increasing Crossref Data Reusability With Format Experiments

Martin Eve – 2024 January 19

In Metadata, Community, APIs

Every year, Crossref releases a full public data file of all of our metadata. This is partly a commitment to POSI and partly just what we do. We want the community to re-use our metadata and to find interesting ends to which they can be put! However, we have also recognized, for some time, that 170GB of compressed .tar.gz files, spread over 27,000 items, is not the easiest of formats with which to work.

2023 public data file now available with new and improved retrieval options

We have some exciting news for fans of big batches of metadata: this year’s public data file is now available. Like in years past, we’ve wrapped up all of our metadata records into a single download for those who want to get started using all Crossref metadata records. We’ve once again made this year’s public data file available via Academic Torrents, and in response to some feedback we’ve received from public data file users, we’ve taken a few additional steps to make accessing this 185 GB file a little easier.

2022 public data file of more than 134 million metadata records now available

In 2020 we released our first public data file, something we’ve turned into an annual affair supporting our commitment to the Principles of Open Scholarly Infrastructure (POSI). We’ve just posted the 2022 file, which can now be downloaded via torrent like in years past. We aim to publish these in the first quarter of each year, though as you may notice, we’re a little behind our intended schedule. The reason for this delay was that we wanted to make critical new metadata fields available, including resource URLs and titles with markup.

With a little help from your Crossref friends: Better metadata

We talk so much about more and better metadata that a reasonable question might be: what is Crossref doing to help? Members and their service partners do the heavy lifting to provide Crossref with metadata and we don’t change what is supplied to us. One reason we don’t is because members can and often do change their records (important note: updated records do not incur fees!). However, we do a fair amount of behind the scenes work to check and report on the metadata as well as to add context and relationships.

A ROR-some update to our API

Earlier this year, Ginny posted an exciting update on Crossref’s progress with adopting ROR, the Research Organization Registry for affiliations, announcing that we’d started the collection of ROR identifiers in our metadata input schema. 🦁 The capacity to accept ROR IDs to help reliably identify institutions is really important but the real value comes from their open availability alongside the other metadata registered with us, such as for publications like journal articles, book chapters, preprints, and for other objects such as grants.
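As a rough illustration of reading ROR IDs alongside other metadata, the sketch below collects them from a work record that has already been parsed into a Python dictionary. The layout is an assumption to check against the REST API documentation: each author may carry an `affiliation` list, and each affiliation an `id` list whose entries have `id` and `id-type` keys. The function name `ror_ids` and the sample record, including its ROR URL, are hypothetical.

```python
from typing import Any


def ror_ids(work: dict[str, Any]) -> list[str]:
    """Collect distinct ROR IDs from the author affiliations of a work.

    Assumed layout: work["author"][i]["affiliation"][j]["id"][k] is a
    dict with "id" and "id-type" keys. Other shapes are skipped.
    """
    found: list[str] = []
    for author in work.get("author") or []:
        for affiliation in author.get("affiliation") or []:
            for identifier in affiliation.get("id") or []:
                ror = identifier.get("id")
                if identifier.get("id-type") == "ROR" and ror and ror not in found:
                    found.append(ror)
    return found


# Hypothetical record; the ROR URL below is a placeholder, not a real ID.
sample_work = {
    "DOI": "10.5555/example-3",
    "author": [
        {
            "family": "Example",
            "affiliation": [
                {
                    "name": "Example University",
                    "id": [{"id": "https://ror.org/00example0", "id-type": "ROR"}],
                }
            ],
        },
        {"family": "Other", "affiliation": []},
    ],
}

print(ror_ids(sample_work))
```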

New public data file: 120+ million metadata records

Jennifer Kemp – 2021 January 19

In Metadata, Community, APIs

2020 wasn’t all bad. In April of last year, we released our first public data file. Though Crossref metadata is always openly available (and our board recently cemented this by voting to adopt the Principles of Open Scholarly Infrastructure, or POSI), we’ve decided to release an updated file. This will provide a more efficient way to get such a large volume of records. The file (JSON records, 102.6GB) is now available, with thanks once again to Academic Torrents.