Crossref Public Data File

If your goal is to obtain the entire body of Crossref’s records, the public data file is an alternative to the REST API. It gives you access to every DOI ever registered with Crossref.

2024 Public data file

Tips for working with Crossref public data files and Plus snapshots

Important considerations:

  • The records are in JSON.
  • Metadata is supplied by our members and, as such, not all records have the same completeness or quality of metadata.
  • Every year our metadata corpus grows. The first data file was 65 GB and held 112 million records; this year the file weighs in at 212 GB and contains metadata for 156 million records, or all Crossref records registered up to and including April 2024.
  • Decompressing the .tar.gz files will take you several hours.
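
Because full decompression is slow and metadata completeness varies from record to record, it can help to stream through an archive and measure field coverage before committing to a full extraction. The following is a minimal sketch, not an official Crossref tool. It assumes that each archive member is a JSON document (optionally gzip-compressed) with its records in a top-level "items" list, and that fields such as "title", "author" and "abstract" are the ones you care about. The function names are hypothetical; adjust them and the assumptions to match the files you actually download.

```python
import gzip
import json
import tarfile
from collections import Counter


def iter_records(archive_path):
    """Yield one record (a dict) at a time from a .tar.gz archive."""
    # Mode "r|gz" reads the archive as a forward-only stream, so nothing
    # is extracted to disk and only one member is held in memory at a time.
    with tarfile.open(archive_path, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue
            raw = archive.extractfile(member)
            if raw is None:
                continue
            data = raw.read()
            # Assumption: members are either plain .json or gzipped JSON.
            if member.name.endswith(".gz"):
                data = gzip.decompress(data)
            elif not member.name.endswith(".json"):
                continue
            document = json.loads(data)
            # Assumption: records sit in a top-level "items" list.
            for record in document.get("items", []):
                yield record


def field_coverage(records, fields=("title", "author", "abstract")):
    """Count how many records carry a non-empty value for each field."""
    total = 0
    present = Counter()
    for record in records:
        total += 1
        for field in fields:
            if record.get(field):
                present[field] += 1
    return total, present
```

Calling field_coverage(iter_records(path)) on a local archive returns the number of records seen and a per-field count of records with a non-empty value. Both functions work on a generator, so memory use stays flat regardless of archive size.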

Additional convenient tools:

Given the size of the public data file and the number of files it comprises, we have started experimenting with some additional tools to improve access to the data and repack it into additional formats.

These tools are experimental, so please remember that they are released without warranties or support, but we are happy to hear about your experience using them. You can read more about them in this blog post.

Previous releases

  • 2023
  • 2022
  • 2021
  • 2020

Page owner: Luis Montilla   |   Last updated 2024-March-07