This blog post is from Lettie Conrad and Michelle Urberg, cross-posted from the The Scholarly Kitchen.
As sponsors of this project, we at Crossref are excited to see this work shared out.
The scholarly publishing community talks a LOT about metadata and the need for high-quality, interoperable, and machine-readable descriptors of the content we disseminate. However, as we’ve reflected on previously in the Kitchen, despite well-established information standards (e.g., persistent identifiers), our industry lacks a shared framework to measure the value and impact of the metadata we produce.
When Crossref began over 20 years ago, our members were primarily from the United States and Western Europe, but for several years our membership has been more global and diverse, growing to almost 18,000 organizations around the world, representing 148 countries.
As we continue to grow, finding ways to help organizations participate in Crossref is an important part of our mission and approach. Our goal of creating the Research Nexus—a rich and reusable open network of relationships connecting research organizations, people, things, and actions; a scholarly record that the global community can build on forever, for the benefit of society—can only be achieved by ensuring that participation in Crossref is accessible to all.
In August 2022, the United States Office of Science and Technology Policy (OSTP) issued a memo (PDF) on ensuring free, immediate, and equitable access to federally funded research (a.k.a. the “Nelson memo”). Crossref is particularly interested in and relevant for the areas of this guidance that cover metadata and persistent identifiers—and the infrastructure and services that make them useful.
Funding bodies worldwide are increasingly involved in research infrastructure for dissemination and discovery.
Preprints have become an important tool for rapidly communicating and iterating on research outputs. There is now a range of preprint servers, some subject-specific, some based on a particular geographical area, and others linked to publishers or individual journals in addition to generalist platforms. In 2016 the Crossref schema started to support preprints and since then the number of metadata records has grown to around 16,000 new preprint DOIs per month.
To work out which version you’re on, take a look at the website address that you use to access iThenticate. If you go to ithenticate.com then you are using v1. If you use a bespoke URL, https://crossref-[your member ID].turnitin.com/ then you are using v2.
Within a folder, the Documents tab shows all the submitted documents for that folder.
Each document submitted generates a Similarity Report after the document has been through the Similarity Check. If more documents are present than can be displayed at once, the pages feature will appear beneath the documents - click the page number to display, or click Next to move to the next page of documents.
zip file upload - to submit a zip file containing multiple documents, up to a maximum of 100MB or 1,000 files. Larger files may take longer to upload
cut & paste - to submit text directly into the submission box. Use this to copy and paste a submission from a file format that is not supported. This method supports plain text only (no images or non-text information)
iThenticate currently accepts the following file types for document upload:
Microsoft Word® (.doc and .docx)
Word XML
plain text (.txt)
Adobe PostScript®
Portable Document Format (.pdf)
HTML
Corel WordPerfect® (.wpd)
Rich Text Format (.rtf)
Each file may not exceed 400 pages, and each file size may not exceed 100 MB. Reduce the size of larger files by removing non-text content. You can’t upload or submit to iThenticate files that are password-protected, encrypted, hidden, system files, or read-only.
.pdf documents must contain text - if they contain only images of text, they will be rejected during the upload attempt. To check, copy and paste a section of the .pdf into a plain-text editor such as Microsoft Notepad® or Apple TextEdit®. If no text is copied over, the selection does not contain text.
To convert scanned images of a document, or an image saved as a .pdf, use Optical Character Recognition (OCR) software to convert the image to text. The conversion software can introduce errors, so manually check and correct the converted document.
Some document formats can contain multiple data types, such as text, images, embedded information from another file, and formatting. Non-text information that is not saved directly within the document will not be included in a file upload, for example, references to a Microsoft Excel® spreadsheet included within a Microsoft Office Word® document.
Use a word-processing program to save your file as one of the accepted types listed above, such as .rtf or .txt. Neither file type supports images or non-text data within the file. Plain text format does not support any formatting, and rich text format allows only limited formatting.
When converting a file to a new format, save it with a different name from the original, to avoid accidentally overwriting the original file. This is especially important when converting to plain text or rich text formats, to prevent permanent loss of the original formatting or image content of the file.
Page owner: Kathleen Luschek | Last updated 2020-May-19