You’ve had your say, now what? Next steps for schema changes

It seems like ages ago, particularly given recent events, but we had our first public request for feedback on proposed schema updates in December and January. The feedback we received indicated two big things: we’re on the right track, and you want us to go further. It was mostly supportive, with a fair number of helpful suggestions about details. This update makes some significant and important changes to contributors, but is otherwise fairly moderate.

Feedback and changes

Many of you are excited about CRediT, and a number of members have indicated that they are ready and waiting to send us CRediT roles. To support this, as in my initial proposal, we’re adding a new role element with a role_type attribute that supports both existing Crossref-defined roles and CRediT roles, plus a required vocab attribute that specifies which vocabulary is being supplied.

<role role_type="author" vocab="crossref">author</role>
<role role_type="writing-original_draft" vocab="credit"/>

CRediT as it exists now is an informal standard coordinated by CASRAI, but a formal standard is in the works via NISO. CRediT is currently a list of well-considered and well-defined roles that are not particularly machine-readable, so I’ve created a list for implementation that eliminates spaces and ampersands. CRediT also lacks reliable PIDs or persistent URLs for the role definitions, so those have been omitted from our implementation. We’ll adopt any changes resulting from the NISO standard, but have decided to go forward with CRediT as-is, as many of our members are eager to implement it.

Beyond CRediT, we’ll also be expanding and refining our contributor support in a number of ways:

  • We’ll be expanding our affiliation metadata beyond a simple string to include organization identifiers like ROR, and allow markup of organization names and locations.
  • We’re expanding the contributor identifiers as well - in addition to ORCID iDs, members can send us Wikidata, ISNI, and other identifiers.
  • We’re adding support for multiple names to support contributors whose names can be expressed in multiple alphabets, or who have aliases or nicknames.
  • We’re changing surname to family_name and will be relaxing the requirement that all person names have a “surname” - a given name may be supplied on its own to support contributors who do not have family names.
  • The current element for corporate/group authors, organization, will be replaced by collab as the term “organization” was widely confusing (we have a lot of affiliation info registered as group authors!), and the collab section will also allow organization identifiers.
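
As a rough illustration of the contributor changes listed above, here is a small Python data model. It is a sketch under stated assumptions, not the schema itself: every class and field name other than family_name is hypothetical, the identifier values are placeholders, and the rule that a person name needs at least one of given name or family name is my reading of the relaxed surname requirement. The draft xsd is the authority on the actual element names and constraints.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Identifier:
    scheme: str  # e.g. "orcid", "wikidata", "isni", "ror"
    value: str


@dataclass
class PersonName:
    given_name: Optional[str] = None
    family_name: Optional[str] = None  # replaces "surname"; may be omitted

    def __post_init__(self) -> None:
        # Assumed rule: a name must carry at least one of the two parts.
        if not (self.given_name or self.family_name):
            raise ValueError("a person name needs a given name, a family name, or both")


@dataclass
class Affiliation:
    name: str
    location: Optional[str] = None
    identifiers: List[Identifier] = field(default_factory=list)  # e.g. a ROR ID


@dataclass
class Contributor:
    names: List[PersonName]  # more than one entry for other alphabets or aliases
    identifiers: List[Identifier] = field(default_factory=list)
    affiliations: List[Affiliation] = field(default_factory=list)


# A contributor with only a given name, one alias, and placeholder identifiers.
contributor = Contributor(
    names=[PersonName(given_name="Aroha"), PersonName(given_name="Ari")],
    identifiers=[Identifier("orcid", "ORCID-PLACEHOLDER")],
    affiliations=[
        Affiliation(
            name="Example University",
            location="Example City",
            identifiers=[Identifier("ror", "ROR-PLACEHOLDER")],
        )
    ],
)
print(contributor.names[0])
```

Keeping names and identifiers as lists mirrors the move from a single surname and a single ORCID iD toward multiple names and multiple identifier types.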

Many of these updates align with how JATS supports contributors - I hope these changes will allow our members to supply robust contributor metadata without the burden of complicated conversions.

I’m also including the proposed changes to support data citation and typing of citations. Additionally, we’ll be adding support for members who want to:

A draft 5.0 xsd file is available in a branch of our GitLab schema repository with the details of the planned updates, and more robust documentation and examples are forthcoming.

Implementation plans

My house was built in 1890 and there are always surprises whenever we need to fix or renovate anything. Our system is just as old in technology years - it’s been chugging along since the aughts. This means while we don’t think it’s powered by knob-and-tube wiring, we can’t be sure until we open up the walls. We want to implement our plans (in fact we want to do more!) but if we run into any big blockers or crucial issues, we may roll out the changes over several iterations. These updates are fairly conservative and I remain optimistic we’ll be able to implement them as-is. Our update will help us build a foundation for future updates, allowing us to continuously evolve our schema as we move forward.

Some of you are understandably worried about our implementation schedule and backwards incompatibility. We’re aware that changes are expensive and inconvenient, and making them on our schedule doesn’t always work for your schedule. That’s why we’ve sustained 12+ versions of our schema over the past 12 years. We won’t be mandating a change any time soon, and definitely won’t do so without sufficient warning and community involvement. In the future we’ll need to make a sustained effort to retire older schema versions, but now isn’t the time for that.

We intend to commence work in Q2 but won’t have a firm timeline for a few more weeks. I will be providing regular updates as we progress, and will be asking for volunteers to test the updates when we’re ready. I’ll also be sharing more documentation and information about how the changes will be represented in our metadata outputs.

Have more to say?

Our feedback period has finished and we do plan to implement the changes as described, but if you have opinions, please share them.

Page owner: Patricia Feeney   |   Last updated 2020-April-02