Wikidata as linking hub

Joachim Neubert

The idea of linking hubs

Connect concepts via identifiers/URLs

Image by Jakob Voss

Existing hubs: VIAF, sameAs.org, …

Different linking properties

  1. exact match (datatype URL)
    generic link to a URL, in the sense of skos:exactMatch
  2. Pxxxx: more than 4000 specialized properties (datatype external identifier)

Examples for external identifiers

  • GND / VIAF identifiers
  • geographical entities
  • proteins
  • Swedish cultural heritage objects
  • African plants
  • baseball players
  • TED conference speakers

Property definitions

  • subject item for the property
  • examples
  • constraints on values, cardinality, etc.
  • formatter URL: creates a clickable link for the ID
  • start at the property page, e.g., for the ISSN: https://www.wikidata.org/wiki/Property:P236
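How a formatter URL turns an identifier into a clickable link can be sketched in a few lines of Python. The sketch assumes the Wikidata convention that the identifier replaces the placeholder `$1` in the formatter URL. The URL pattern used here is an illustrative assumption, not a value read from the property page.

```python
def expand_formatter_url(formatter_url: str, identifier: str) -> str:
    """Replace the `$1` placeholder in a formatter URL with the identifier."""
    return formatter_url.replace("$1", identifier)


# Assumed formatter URL pattern for the ISSN property (illustration only)
ISSN_FORMATTER = "https://portal.issn.org/resource/ISSN/$1"

# Example ISSN, used here only to show the substitution
print(expand_formatter_url(ISSN_FORMATTER, "0028-0836"))
```

Identifiers containing reserved characters may need URL encoding; that is left out of this sketch.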

Property Documentation

ISSN property page

Beyond sameness - mapping relations

  • Wikidata external ids imply “sameness” of linked concepts
  • Even with geographic names, other mapping relations are required in some cases
  • examples:
    • close matches, e.g., “Yugoslavia” (1918-1992) (Wikidata) ≅ “Yugoslavia (until 1990)” (STW)
    • related matches, e.g. a company and its founder

Mapping relation type (P4390)

  • introduced after a community discussion in October 2017
  • to be used as qualifier for external id entries
  • fixed value set - SKOS mapping relations (exact, close, broad, narrow, related match)
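As a rough illustration of how the qualifier could be read back, the following Python sketch builds a SPARQL query for the Wikidata Query Service. It relies on the `p:`, `ps:` and `pq:` prefixes that the query service predefines for statements, statement values and qualifiers. The function name is hypothetical, and P3911 is assumed here to be the property for the STW Thesaurus for Economics ID.

```python
def mapping_relation_query(property_id: str, limit: int = 10) -> str:
    """Build a query listing external IDs with their optional P4390 qualifier."""
    return f"""
SELECT ?item ?externalId ?relation WHERE {{
  ?item p:{property_id} ?statement .              # full statement node
  ?statement ps:{property_id} ?externalId .       # the external identifier value
  OPTIONAL {{ ?statement pq:P4390 ?relation . }}  # mapping relation type, if set
}}
LIMIT {limit}
""".strip()


# P3911 is an assumption (STW Thesaurus for Economics ID)
print(mapping_relation_query("P3911"))
```

Statements without the qualifier come back with `?relation` unbound, which matches the default reading of an external ID as an exact match.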

Example at item Assessment center

How does that relate to the Linked Data model?

Internal data model and storage (Wikibase) is transformed to RDF for:

  • RDF dumps
  • Query Service
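To make the RDF view more concrete, here is a minimal Python sketch that expands prefixed names into full URIs, using namespaces from the Wikidata RDF export. The helper name is hypothetical and the prefix table is deliberately incomplete.

```python
# A few namespaces used in the Wikidata RDF export (not a complete list)
WIKIDATA_PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",             # items and properties
    "wdt": "http://www.wikidata.org/prop/direct/",       # direct ("truthy") claims
    "p": "http://www.wikidata.org/prop/",                # item -> statement node
    "ps": "http://www.wikidata.org/prop/statement/",     # statement -> value
    "pq": "http://www.wikidata.org/prop/qualifier/",     # statement -> qualifier
}


def expand_prefixed_name(name: str) -> str:
    """Expand a prefixed name such as 'wdt:P236' into a full URI."""
    prefix, local = name.split(":", 1)
    return WIKIDATA_PREFIXES[prefix] + local


print(expand_prefixed_name("wdt:P236"))
```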

RDF linking from Wikidata

Federated SPARQL queries

Example use case: GND authority has information about the professions/occupations of people which is not known in Wikidata.

So get that information dynamically from a GND SPARQL endpoint.

Here, we are interested in economists, in particular.

From Wikidata to a remote endpoint

query to WDQS
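A federated query of this kind might look like the following sketch, which builds the query text in Python. Several parts are assumptions rather than details from the slides: the endpoint URL and the occupation URI are placeholders supplied by the caller, the `gndo:` namespace and the property `gndo:professionOrOccupation` are assumed to match the GND ontology, and the join assumes that the normalized GND ID links (`wdtn:P227`) use the same URI form as the remote endpoint.

```python
def gnd_occupation_query(gnd_endpoint: str, occupation_uri: str) -> str:
    """Build a federated query: ask the remote endpoint first, then join in Wikidata."""
    return f"""
PREFIX wdtn: <http://www.wikidata.org/prop/direct-normalized/>
PREFIX gndo: <https://d-nb.info/standards/elementset/gnd#>

SELECT ?item ?gndUri WHERE {{
  # Remote part: persons with the given occupation (assumed GND modelling)
  SERVICE <{gnd_endpoint}> {{
    ?gndUri gndo:professionOrOccupation <{occupation_uri}> .
  }}
  # Local part: Wikidata items linked to those GND URIs
  ?item wdtn:P227 ?gndUri .
}}
""".strip()


# Placeholder endpoint and occupation URI, for illustration only
print(gnd_occupation_query("https://example.org/gnd/sparql",
                           "https://example.org/gnd/economist"))
```

Placing the SERVICE clause first keeps the set of bindings passed between the two endpoints small, provided the remote pattern is selective.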

From a remote endpoint to Wikidata

query to GND endpoint (not working currently)

Several points for attention

  • Direction and sequence of statements often matter for performance
  • To reach out from Wikidata, endpoints have to be approved (full list)
  • In the other direction, access is normally not restricted
  • Some federated queries get extremely slow when large sets of bindings exist before the remote service is invoked
  • Be sure to exclude variables bound to blank nodes (‘unknown value’ in Wikidata)
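Within a query, blank nodes can be excluded with the standard SPARQL test `FILTER(!isBlank(?var))`. The same check can also be applied on the client side. The Python sketch below assumes results in the SPARQL JSON results format, where each bound variable carries a `type` of `uri`, `literal` or `bnode`. The function name and the sample data are hypothetical.

```python
def drop_blank_node_rows(bindings: list, variable: str) -> list:
    """Keep only result rows where `variable` is not bound to a blank node."""
    return [
        row for row in bindings
        if row.get(variable, {}).get("type") != "bnode"
    ]


# Hypothetical result rows in SPARQL JSON results format
sample = [
    {"gndId": {"type": "literal", "value": "123456789"}},
    {"gndId": {"type": "bnode", "value": "b0"}},  # 'unknown value'
]

print(drop_blank_node_rows(sample, "gndId"))
```

Filtering inside the query is usually preferable for federated queries, since it keeps blank nodes from being passed to the remote service at all.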

Further reading on Wikidata/RDF

Application process for a new property

example stw id

Hints for getting it approved smoothly

  • Clearly lay out the motivation and planned use for the property
  • Provide working examples (with the formatter URI you are suggesting)
  • Be responsive to comments

Wikidata as a universal linking hub

  • easy extensibility with new properties for external identifiers
  • immense stock of existing items, with the full set of SKOS mapping relations for more or less exact mappings to these
  • immediate extensibility with new items