Funding metadata must include the name of the funding organization and the funder identifier (where the funding organization is listed in the Registry), and should include an award/grant number or grant identifier. Funder names should only be deposited without the accompanying ID if the funder is not found in the Registry. Members can deposit a funder name without an identifier, but those records are not considered valid until the funder is added to the Registry and the records are redeposited (updated) with an ID. Until then, they cannot be found using the funding filters that we support via our REST API, and they do not show up in our Open Funder Registry search.
Correct nesting of funder names and identifiers is essential as it significantly impacts how funders, funder identifiers, and award numbers are related to each other.
The purpose of funder groups is to establish relationships between funders and award numbers. A funder group assertion should only be used to associate funder names and identifiers with award numbers when multiple funders are present.
Funding data deposit with one group of funders (no “fundgroup” needed):
Funding data deposit with two fundgroups:
Incorrect: groups used to associate funder names with funder identifiers. The identifier must instead be nested within the funder name.
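As a rough sketch of the incorrect pattern described above, the fragment below contrasts a fundgroup misused to pair a name with its identifier against the nested form. It is an illustrative reconstruction rather than a copy of the original figure, and it assumes the fr prefix is already bound to the fundref namespace on an enclosing element.

```xml
<!-- Incorrect (illustrative): a fundgroup used only to pair a name with its identifier -->
<fr:assertion name="fundgroup">
  <fr:assertion name="funder_name">National Science Foundation</fr:assertion>
  <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
</fr:assertion>

<!-- Correct: the identifier is nested inside the funder_name assertion -->
<fr:assertion name="funder_name">National Science Foundation
  <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
</fr:assertion>
```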
Deposits using a funder_identifier that is not taken from the Open Funder Registry will be rejected.
Deposits with only funder_name (no funder_identifier) will not appear in funder search results in Open Funder Registry search or the REST API.
The <fr:program> element in the deposit schema section (see documentation) supports the import of the fundref.xsd schema (see documentation). The fundref namespace (xmlns:fr="http://www.crossref.org/fundref.xsd") must be included in the schema declaration.
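A minimal sketch of where the namespace declaration might sit on the root element of a deposit follows. The schema version (5.3.1) and the other root attributes are assumptions chosen for illustration; use the values that match the deposit schema version you actually submit against.

```xml
<!-- Sketch only: the schema version shown here is an assumption -->
<doi_batch xmlns="http://www.crossref.org/schema/5.3.1"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns:fr="http://www.crossref.org/fundref.xsd"
           version="5.3.1">
  <!-- head and body of the deposit go here; the fr: prefix is now usable throughout -->
</doi_batch>
```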
To accommodate integration with Crossmark, the fundref.xsd consists of a series of nested <fr:assertion> tags with enumerated name attributes. The name attributes are:
fundgroup: used to group a funder and its associated award number(s) for items with multiple funders.
funder_name: name of the funding agency as it appears in the funding Registry. Funder names that do not match those in the registry will be accepted to cover instances where the funding organization is not listed.
funder_identifier: funding agency identifier in the form of a DOI. It must be nested within the funder_name assertion. The funder_identifier must be taken from the funding Registry and cannot be created by the member. Deposits without funder_identifier do not qualify as funding records.
award_number: grant number or other fund identifier.
funder_name and funder_identifier must be present in a deposit where the funding body is listed in the Open Funder Registry. Multiple funder_name, funder_identifier, and award_number assertions may be included.
A relationship between funder_identifier and funder_name is established by nesting funder_identifier within funder_name. For example, this deposit has the funder National Science Foundation with its corresponding funder identifier in the Open Funder Registry, https://doi.org/10.13039/100000001:
<fr:assertion name="funder_name">National Science Foundation
  <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
</fr:assertion>
A relationship between a single funder_name and/or funder_identifier and an award_number is established by including the assertions within a <fr:program> element. In this example, funder National Institute on Drug Abuse with funder identifier https://doi.org/10.13039/100000026 is associated with award number JQY0937263:
<fr:program name="fundref">
  <fr:assertion name="funder_name">National Institute on Drug Abuse
    <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000026</fr:assertion>
  </fr:assertion>
  <fr:assertion name="award_number">JQY0937263</fr:assertion>
</fr:program>
If multiple funder and award combinations exist, each combination should be deposited within a fundgroup to ensure that the award number is associated with the appropriate funder(s). In this example, two funding groups exist:
<fr:program name="fundref">
  <fr:assertion name="fundgroup">
    <fr:assertion name="funder_name">National Science Foundation
      <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
    </fr:assertion>
    <fr:assertion name="award_number">CBET-106</fr:assertion>
    <fr:assertion name="award_number">CBET-7259</fr:assertion>
  </fr:assertion>
  <fr:assertion name="fundgroup">
    <fr:assertion name="funder_name">Basic Energy Sciences, Office of Science, U.S. Department of Energy
      <fr:assertion name="funder_identifier">https://doi.org/10.13039/100006151</fr:assertion>
    </fr:assertion>
    <fr:assertion name="award_number">1245-ABDS</fr:assertion>
  </fr:assertion>
</fr:program>
Items with multiple funder names but no award numbers may be deposited without a fundgroup.
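A sketch of such a deposit, with two funders and no award numbers, might look like the following. The pairing of these two funders is purely illustrative, and the fr prefix is assumed to be declared on an enclosing element.

```xml
<fr:program name="fundref">
  <!-- No award numbers are present, so no fundgroup is needed -->
  <fr:assertion name="funder_name">National Science Foundation
    <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
  </fr:assertion>
  <fr:assertion name="funder_name">National Institute on Drug Abuse
    <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000026</fr:assertion>
  </fr:assertion>
</fr:program>
```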
At a minimum, a funding data deposit must contain a funder_name and funder_identifier assertion. Deposits with just an award_number assertion are not allowed. A funder_name, funder_identifier, and award_number should be included in deposits whenever possible.
If the funder name cannot be matched in the Registry, you may submit funder_name only, and the funding body will be reviewed and considered for addition to the official Registry. Until it is added to the Registry, the deposit will not be considered a valid funding record and will not appear in funding search or the REST API.
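For illustration, a name-only deposit could be sketched as below. "Example Research Foundation" is a hypothetical funder invented for this sketch, standing in for an organization that is not yet listed in the Registry.

```xml
<fr:program name="fundref">
  <!-- Hypothetical funder that is not in the Registry: no funder_identifier is nested -->
  <fr:assertion name="funder_name">Example Research Foundation</fr:assertion>
</fr:program>
```

Once the funder has been added to the Registry, the record would need to be redeposited with the identifier nested inside the funder_name assertion.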
As demonstrated in Example 3 below, items with several award numbers associated with a single funding organization should be grouped together by enclosing the funder_name, funder_identifier, and award_number(s) within a fundgroup assertion.
Some rules will be enforced by the deposit logic, including:
Nesting of the <fr:assertion> elements: the schema allows infinite nesting of the assertion element to accommodate nesting of an element within itself. Deposit code will only allow 3 levels of nesting (with attribute values of fundgroup, funder_name, and funder_identifier).
Values of different <fr:assertion> elements: funder_name, funder_identifier, and award_number may have deposit rules imposed.
Only valid funder identifiers will be accepted: the funder_identifier value will be compared against the Open Funder Registry file. If the funder_identifier is not found, the deposit will be rejected.
If funding metadata is incorrect or out-of-date, it may be updated by redepositing the metadata. Be sure to redeposit all available metadata for an item, not just the elements being updated. A DOI may be updated without resubmitting funding metadata, as previously deposited funding metadata will remain associated with the DOI.
Funding metadata may be deleted by redepositing an item with an empty <fr:program name="fundref"/> element. Submitting an empty Crossmark tag (<crossmark/>) will delete all Crossmark data, including funding data, so use the empty <fr:program name="fundref"/> element when only the funding data should be removed.
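As a minimal sketch, the empty element would take the place of the populated funding block in the redeposited record. The fr prefix is assumed to be declared on an enclosing element, and the rest of the item's metadata should still be included in the redeposit.

```xml
<!-- Removes only the funding data previously deposited for this item -->
<fr:program name="fundref"/>
```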
Example 2: Funder information outside of Crossmark
The <fr:program> element captures funding data. It should be placed before the <doi_data> element. This deposit contains minimal funding data: one funder_name or one funder_identifier must be present, and both are recommended.
<fr:program name="fundref">
  <fr:assertion name="funder_name">National Science Foundation
    <fr:assertion name="funder_identifier">https://doi.org/10.13039/100000001</fr:assertion>
  </fr:assertion>
</fr:program>
This example contains one funder_name and one funder_identifier. Note that the funder_identifier is nested within the funder_name assertion, establishing https://doi.org/10.13039/100000001 as the funder identifier for funder name National Science Foundation. Two award numbers are present.