The Crossref Curriculum

Transparency of Event Data

Event Data can be considered a stream of assertions about research objects. To interpret an assertion, you need to know who made it and which data they were working from. You may want to accept some events at face value but independently verify others. For this reason, we make every effort to keep Event Data open and transparent. We do this in several ways.

Open source code

All Event Data code is open source and available from Crossref’s GitLab repository; learn more in our Knowledge Database. Crossref is a community-focused membership organisation: we welcome contributions to the code, and we encourage others to use it to gather events they are interested in themselves.

Event Data uses lists of attributes called artifacts. Examples include a list of news websites and a list of publishers’ landing domains. They are used as inputs to Event Data and are, of course, completely open. They are versioned, so you can see which version of an artifact was in use when a given event was recorded. See the current list of artifacts.

Logs, logs, logs

We take an evidence-first approach to providing Event Data. If an event is created using external data, we create an evidence record, which maps the journey from finding a mention online to associating it with a DOI. A link to the evidence record is included in the metadata for each event.
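As a rough illustration of how a consumer might use that link, the sketch below pulls the provenance details out of a single event. It is a minimal example under stated assumptions, not a description of the Event Data API: the event is a hand-written dictionary, the field names `source_id` and `evidence_record` are assumed for illustration, and the URLs and identifiers are placeholders. Check them against a real event before relying on them.

```python
from typing import Any, Dict

# Hypothetical event, written by hand for this example. The field names
# ("source_id", "evidence_record", ...) are assumptions about the event
# metadata, and every value below is a placeholder.
sample_event: Dict[str, Any] = {
    "id": "00000000-0000-0000-0000-000000000000",
    "source_id": "newsfeed",
    "subj_id": "https://example.com/blog/post",
    "obj_id": "https://doi.org/10.5555/12345678",
    "relation_type_id": "discusses",
    "evidence_record": "https://example.org/evidence/abc123",
}


def provenance_summary(event: Dict[str, Any]) -> Dict[str, Any]:
    """Report who asserted an event and where its evidence record lives.

    Events that were not created from external data may carry no evidence
    record, so the link is treated as optional.
    """
    evidence_url = event.get("evidence_record")
    return {
        "asserted_by": event.get("source_id"),
        "evidence_record": evidence_url,
        "has_evidence": bool(evidence_url),
    }


summary = provenance_summary(sample_event)
print(summary["asserted_by"], summary["evidence_record"])
```

A consumer could use `has_evidence` to decide which events to accept at face value and which to verify independently by following the evidence record link.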

Agents also create evidence logs to record their activity. A log is typically a list of the pages an agent has visited and any potential events it found, including those that were ultimately not added to Event Data. Learn more about how to access the evidence logs.

The logs typically run to over a gigabyte of data each day and provide a comprehensive record of Event Data provenance.

Page owner: Laura J. Wilkinson   |   Last updated 2020-April-08