Dominika Tkaczyk

Director of Data Science

Biography

Dominika joined Crossref’s R&D team in August 2018 as a Principal R&D Developer. In her first few years at Crossref, she focused primarily on the research and development of metadata matching strategies to enrich the Research Nexus network with new relationships. In 2024, Dominika became Crossref’s Director of Data Science and launched the Data Science Team. The goal of the Data Science Team is to explore new possibilities for using the data to serve the scholarly community, continue the enrichment of the scholarly record with more metadata and relationships, and develop strong collaborations with like-minded community initiatives. Before joining Crossref, Dominika was a researcher and a data scientist at the University of Warsaw, Poland, and a postdoctoral researcher at Trinity College Dublin, Ireland. She received a PhD in Computer Science from the Polish Academy of Sciences in 2016 for her research on metadata extraction from full-text documents using machine learning and natural language processing techniques.

ORCID iD

0000-0001-5055-7876

Dominika Tkaczyk's Latest Blog Posts

Metadata matching: beyond correctness

Dominika Tkaczyk, Wednesday, Jan 8, 2025

In Metadata, Linking, Metadata Matching, Data Science

https://0-doi-org.libus.csd.mu.edu/10.13003/axeer1ee

In our previous entry, we explained that thorough evaluation is key to understanding a matching strategy’s performance. While evaluation is what allows us to assess the correctness of matching, choosing the best matching strategy is, unfortunately, not as simple as selecting the one that yields the best matches. Instead, these decisions usually depend on weighing multiple factors based on your particular circumstances. This is true not only for metadata matching, but for many technical choices that require navigating trade-offs.

How good is your matching?

Dominika Tkaczyk, Wednesday, Nov 6, 2024

In Metadata, Linking, Metadata Matching, Data Science

https://0-doi-org.libus.csd.mu.edu/10.13003/ief7aibi

In our previous blog post in this series, we explained why no metadata matching strategy can return perfect results. Thankfully, however, this does not mean that it’s impossible to know anything about the quality of matching. Indeed, we can (and should!) measure how close (or far) we are from achieving perfection with our matching. Read on to learn how this can be done! How about we start with a quiz?

The myth of perfect metadata matching

Dominika Tkaczyk, Wednesday, Aug 28, 2024

In Metadata, Linking, Metadata Matching, Data Science

https://0-doi-org.libus.csd.mu.edu/10.13003/pied3tho

In our previous instalments of the blog series about matching (see part 1 and part 2), we explained what metadata matching is, why it is important and described its basic terminology. In this entry, we will discuss a few common beliefs about metadata matching that are often encountered when interacting with users, developers, integrators, and other stakeholders. Spoiler alert: we are calling them myths because these beliefs are not true!

The anatomy of metadata matching

Dominika Tkaczyk, Thursday, Jun 27, 2024

In Metadata, Linking, Metadata Matching, Data Science

https://0-doi-org.libus.csd.mu.edu/10.13003/zie7reeg

In our previous blog post about metadata matching, we discussed what it is and why we need it (tl;dr: to discover more relationships within the scholarly record). Here, we will describe some basic matching-related terminology and the components of a matching process. We will also pose some typical product questions to consider when developing or integrating matching solutions. Basic terminology: Metadata matching is a high-level concept, with many different problems falling into this category.

Metadata matching 101: what is it and why do we need it?

Dominika Tkaczyk, Thursday, May 16, 2024

In Metadata, Linking, Metadata Matching, Data Science

https://0-doi-org.libus.csd.mu.edu/10.13003/aewi1cai

At Crossref and ROR, we develop and run processes that match metadata at scale, creating relationships between millions of entities in the scholarly record. Over the last few years, we’ve spent a lot of time diving into details about metadata matching strategies, evaluation, and integration. It is quite possibly our favourite thing to talk and write about! But sometimes it is good to step back and look at the problem from a wider perspective.

Read all of Dominika Tkaczyk's posts »