Schloss Dagstuhl becomes part of the National Research Data Infrastructure for Data Science and Artificial Intelligence

On July 2, 2021, the German Joint Science Conference (Gemeinsame Wissenschaftskonferenz, GWK) decided to fund the National Research Data Infrastructure (NFDI) consortium for Data Science and Artificial Intelligence (NFDI4DataScience) with an amount in the double-digit millions over a duration of five years. In this consortium, Schloss Dagstuhl has joined forces with numerous other leading research infrastructure providers in Germany. The NFDI is a collaborative, nationwide network to systematically index, interconnect, and make openly available the valuable stock of data from science and research. Dagstuhl’s renowned research infrastructures – supporting research itself, the publication and dissemination of research results, and finding and reusing them – will be further developed, expanded, and integrated as part of NFDI. The results of this consortium Read more…

New dblp URL scheme and API updates

A big change has just been made to the dblp website … and, in case we did our job right, you may not even have noticed yet: With the latest update, we introduced major changes to the dblp URL scheme. In particular, this applies to the URLs of all author bibliographies listed on dblp, which are now served under a new and persistent URL. But don’t worry, just like the first time we made such a change about eight years ago, we will try to keep all previously existing URLs as redirects for the foreseeable future. In this post, we talk about the reasons that made us abandon our old URL scheme and why you will most likely want to update Read more…

dblp and ORCID 2020

In the past, we often discussed how helpful ORCIDs are for our work. An ORCID (Open Researcher and Contributor ID) is a unique personal identifier that scientists can attach to their work. The ORCID ensures that this work is linked to the correct scientist and not to someone else with the same or a similar name. We at dblp use ORCIDs to create clean bibliographies. A bibliography should list the work of a single researcher, and of course a unique identifier is very helpful here. In this post I will give a short overview of how we handle ORCIDs and how prevalent they are in dblp right now. If you do not have an ORCID, consider getting one (for free) at orcid.org. Please make sure that it is attached to your publications whenever possible.

We started experimenting with ORCID in 2016. A more complex integration began in 2017, when we also started to show ORCIDs in bibliographies and individual publications. At the same time, we made ORCIDs available with our data releases. We obtain most ORCIDs directly from the publishers, together with other publication metadata such as title and author names. ORCID was established in 2012, and many publishers started to attach ORCIDs to their publications only recently (or do not do so at all). But authors can claim such works on their own. This information is provided by ORCID via their annual data dump, which we also map to our data set. This means that ORCID has become a common type of data in our collection. Below you see the fraction of signatures in dblp for which an ORCID is known. A signature is a pair of author name and paper. So a paper with five authors has five signatures.

Fraction of signatures with ORCID

An ORCID is now available for 12% of all our signatures and that number is going up. At the moment, we add ORCIDs to dblp in batches. This means that a publication can appear in dblp without any ORCIDs. A few days later they are added. We are working to streamline this process for a faster integration.
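
To make the notion of a signature concrete, here is a minimal Python sketch of how the fraction of signatures with a known ORCID could be computed. It is only an illustration and not the code dblp runs: the `Signature` class, the function names, and the paper key are hypothetical, and the ORCID shown is a placeholder value.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Signature:
    """One (author name, paper) pair, optionally annotated with an ORCID."""
    author_name: str
    paper_key: str
    orcid: Optional[str] = None


def signatures_for(paper_key: str,
                   authors: Iterable[Tuple[str, Optional[str]]]) -> List[Signature]:
    """Create one signature per author of a paper.

    `authors` is a sequence of (name, orcid-or-None) pairs.
    """
    return [Signature(name, paper_key, orcid) for name, orcid in authors]


def orcid_fraction(signatures: List[Signature]) -> float:
    """Return the share of signatures for which an ORCID is known."""
    if not signatures:
        return 0.0
    with_orcid = sum(1 for s in signatures if s.orcid is not None)
    return with_orcid / len(signatures)


# Hypothetical paper with five authors, one of whom has a known ORCID.
example = signatures_for(
    "conf/example/Paper20",  # made-up key, for illustration only
    [
        ("Author A", "0000-0002-1825-0097"),  # placeholder ORCID
        ("Author B", None),
        ("Author C", None),
        ("Author D", None),
        ("Author E", None),
    ],
)
```

In this toy example the paper yields five signatures, one of which carries an ORCID, so `orcid_fraction(example)` evaluates to one fifth.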

(more…)

Dr. Michael Ley to receive the ACM Distinguished Service Award

The world’s largest computing society, the Association for Computing Machinery (ACM), has bestowed its prestigious ACM Distinguished Service Award 2019 on computer scientist Dr. Michael Ley of Schloss Dagstuhl Leibniz-Center for Informatics and of Trier University. ACM thus recognizes Dr. Ley’s achievement in the creation and unceasing editorial curation of the dblp computer science bibliography. Dr. Ley has developed dblp from a small and initially highly specialized collection of metadata about scholarly publications in the fields of “data bases (db) and logic programming (lp)” into the most comprehensive, open bibliographic information service for all disciplines of computer science. The database was created by Dr. Ley at the University of Trier in 1993. Today, it is operated by Dr. Ley and Read more…

dblp computer science bibliography surpasses 5 million publications

On March 23rd, 2020, the dblp computer science bibliography indexed its 5 millionth publication. By doing so, the world’s largest openly accessible metadata collection of computer science publications doubled in size during the course of just six years. Thus, dblp consolidates its role as an export hit from Germany, which is of world renown among the international computer science community. Modern research requires immediate and comprehensive access to current publications to meet the needs of an ever faster evolving and ever more complex research landscape. However, high-quality metadata and information about recent publications are often quite difficult to obtain. Search engines like Google allow a broad insight into the Internet but have neither guarantees of data quality, nor completeness, Read more…

Name disambiguation suffixes in dblp

As of the end of March 2020, dblp provides bibliographies for almost 2.5 million scientists. With this number, it is not surprising that we have namesakes – scientists with the exact same name. For historical reasons, all persons in dblp must have different names. We circumvent this problem by assigning numeric suffixes to names that are not unique. E.g., there are multiple authors named Thomas Müller in dblp. So we name them Thomas Müller 0001, Thomas Müller 0002 and so on. See our FAQ here for more details. Identifying authors with the same name is a very important task. For example, the bibliography of Thomas Müller 0001 should not list papers by another Thomas Müller. This is a very common problem. I recently Read more…
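
The suffix convention described above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names are hypothetical, and the assumption that a suffix is always exactly four zero-padded digits is inferred from the examples "0001" and "0002" rather than taken from a dblp specification.

```python
import re
from typing import Optional, Tuple

# Assumption: a disambiguation suffix is a space followed by four digits.
_SUFFIX_PATTERN = re.compile(r"^(?P<name>.+) (?P<number>\d{4})$")


def disambiguated_name(name: str, index: int) -> str:
    """Append a zero-padded numeric suffix, e.g. index 1 becomes '0001'."""
    return f"{name} {index:04d}"


def split_suffix(display_name: str) -> Tuple[str, Optional[int]]:
    """Split a display name into the plain name and its suffix number.

    Returns (display_name, None) when the name carries no suffix.
    """
    match = _SUFFIX_PATTERN.match(display_name)
    if match is None:
        return display_name, None
    return match.group("name"), int(match.group("number"))
```

For instance, `disambiguated_name("Thomas Müller", 1)` produces `"Thomas Müller 0001"`, and `split_suffix` recovers the plain name and the number from such a string.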

Corrections in dblp

Our primary goal is to ensure that the bibliographies (lists of publications) of authors in dblp are correct. This means that all publications of a person should be listed in the same list and that a list should contain only publications from one specific person. It can be difficult to ensure this and, despite our best efforts, we sometimes assign publications to the wrong publication list. Because of this, we frequently check our data set and correct mistakes. The following figure shows the number of corrections we made in the last twenty years. In a merge correction, two (or more) publication lists are merged. E.g., we discover that A. Jones and Adam Jones are the same person. A split fixes a defect where Read more…

Unpaywalled article links

The dblp computer science bibliography provides more than 5 million hyperlinks for research publications. Most of those links point to article landing pages within a publisher’s digital library. A growing number of publishers have adopted the open access model of publishing, thereby allowing the dissemination of research results free of cost and without any access barrier. You may have noticed that we have recently begun to mark such hyperlinks in dblp with a special orange badge signalling their availability. (Please also note that this badge is still work in progress, and that there are still plenty of openly accessible articles in dblp that go unrecognized.) However, most publishers in computer science do still demand an active subscription or a fee Read more…

License change to CC0

Starting today, all of dblp’s data will be released under the CC0 1.0 Creative Commons Public Domain License. This affects all metadata releases, in particular the daily and monthly data dumps and data retrieved from the web APIs. This change will make it easier for you to reuse our data. In a nutshell, you can use our data without asking permission, for any purpose (including commercial purposes), and even without attributing it to us. However, we would very much appreciate it if you mention dblp as the source of the data set and/or if you provide a hyperlink to https://dblp.org. Our previous license, the ODC-BY 1.0 Open Data Commons Attribution License, was selected as a fitting license back in 2011, when Read more…