Wikidata as linking hub
Joachim Neubert
The idea of linking hubs
Connect concepts via identifiers/URLs
Image by Jakob Voss
Existing hubs: VIAF, sameAs.org, …
Different linking properties
- exact match (datatype URL): generic link to a URL, in the meaning of skos:exactMatch
- Pxxxx: more than 4000 specialized properties (datatype external identifier)
Examples for external identifiers
- GND / VIAF identifiers
- geographical entities
- proteins
- Swedish cultural heritage objects
- African plants
- baseball players
- TED conference speakers
Property definitions
- subject item for the property
- examples
- constraints on values, cardinality, etc.
- formatter URL: creates a clickable link for the ID
- start at the property page, e.g., for the ISSN:
https://www.wikidata.org/wiki/Property:P236
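
To illustrate the formatter URL idea, here is a minimal sketch of how an external ID could be turned into a clickable link. Wikidata formatter URLs mark the position of the ID with `$1`. The function name `expand_formatter_url` and the ISSN formatter URL shown are assumptions for illustration; the authoritative value is on the property page.

```python
def expand_formatter_url(formatter_url: str, external_id: str) -> str:
    """Substitute an external identifier into a formatter URL.

    The identifier position is marked with the placeholder "$1".
    """
    if "$1" not in formatter_url:
        raise ValueError("formatter URL has no $1 placeholder")
    return formatter_url.replace("$1", external_id)


# Assumed formatter URL for the ISSN property (P236); verify on the property page.
ISSN_FORMATTER = "https://portal.issn.org/resource/ISSN/$1"

if __name__ == "__main__":
    print(expand_formatter_url(ISSN_FORMATTER, "0028-0836"))
```

This sketch does a plain string substitution and does not escape the identifier, which is enough for simple IDs such as ISSNs.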
Property Documentation
![ISSN property page]()
Beyond sameness - mapping relations
- Wikidata external ids imply “sameness” of linked concepts
- even with geographic names, other mapping relations are required in
some cases.
- examples:
  - close matches, e.g., “Yugoslavia” (1918-1992) (Wikidata) ≅ “Yugoslavia (until 1990)” (STW)
  - related matches, e.g., a company and its founder
Mapping relation type (P4390)
- introduced after a community discussion in October 2017
- to be used as qualifier for external id entries
- fixed value set - SKOS mapping relations (exact, close, broad,
narrow, related match)
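
As a rough sketch of how the fixed value set lines up with SKOS, the five qualifier values can be mapped to the corresponding SKOS mapping properties. The dictionary keys are plain English labels chosen here for illustration, not Wikidata item IDs, and `mapping_triple` is a hypothetical helper; the item and target URIs in the usage note are placeholders.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Labels of the five allowed qualifier values, mapped to SKOS property names.
# The labels are illustrative keys, not Wikidata identifiers.
MAPPING_RELATIONS = {
    "exact match": "exactMatch",
    "close match": "closeMatch",
    "broad match": "broadMatch",
    "narrow match": "narrowMatch",
    "related match": "relatedMatch",
}


def mapping_triple(item_uri: str, relation_label: str, target_uri: str) -> str:
    """Return one N-Triples line; an unknown label raises KeyError."""
    prop = SKOS + MAPPING_RELATIONS[relation_label]
    return f"<{item_uri}> <{prop}> <{target_uri}> ."
```

For example, `mapping_triple("http://www.wikidata.org/entity/Q123", "close match", "http://example.com/reference/External-ID")` would yield a `skos:closeMatch` statement for the placeholder item.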
How does that relate to the Linked Data model?
Internal data model and storage (Wikibase) is transformed to RDF for:
- RDF dumps
- Query Service
RDF linking from Wikidata
Links in the RDF dumps
Output has full URLs to external resources, however with
Wikidata-specific properties:
wd:Q123 wdt:P234 "External-ID" ;
wdtn:P234 <http://example.com/reference/External-ID> .
This creates a hurdle for generic Linked Data browsers and tools: not even exact match is translated to skos:exactMatch.
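
One way a generic tool could work around this is a small rewriting step over the dump triples. The sketch below is an assumption about what such a step might look like, not part of Wikidata's tooling: it replaces every predicate in the `wdtn:` namespace with `skos:exactMatch`. Treating all such links as exact is a simplification, since it ignores any mapping relation qualifiers.

```python
WDTN = "http://www.wikidata.org/prop/direct-normalized/"
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def generalize_links(triples):
    """Yield (s, p, o) tuples, replacing every wdtn: predicate with skos:exactMatch.

    Simplification: all normalized external-ID links are treated as exact matches.
    """
    for s, p, o in triples:
        if p.startswith(WDTN):
            yield (s, SKOS_EXACT_MATCH, o)
        else:
            yield (s, p, o)
```

Triples with other predicates pass through unchanged, so the step can be applied to a mixed stream.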
Federated SPARQL queries
Example use case: the GND authority file has information about the
professions/occupations of people that is not known in Wikidata,
so we get that information dynamically from a GND SPARQL endpoint.
Here, we are interested in economists in particular.
From Wikidata to a remote endpoint
query to WDQS
From a remote endpoint to Wikidata
query to GND endpoint <== not working currently
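
The first direction can be sketched as a query string built in Python. This is a minimal illustration under several assumptions: the endpoint URL is passed in by the caller (none is named here), `gndo:professionOrOccupation` is assumed to be the GND property for occupations, and the normalized value of P227 (GND ID) is assumed to be the GND resource URI. The helper name `build_federated_query` is hypothetical.

```python
def build_federated_query(remote_endpoint: str, qids: list[str]) -> str:
    """Build a query for WDQS that looks up occupations at a remote endpoint.

    A VALUES clause keeps the set of bindings small before SERVICE is invoked.
    """
    values = " ".join(f"wd:{qid}" for qid in qids)
    return f"""\
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdtn: <http://www.wikidata.org/prop/direct-normalized/>
PREFIX gndo: <https://d-nb.info/standards/elementset/gnd#>

SELECT ?person ?gnd ?occupation WHERE {{
  VALUES ?person {{ {values} }}
  # Assumption: wdtn:P227 yields the GND resource URI for the person.
  ?person wdtn:P227 ?gnd .
  SERVICE <{remote_endpoint}> {{
    # Assumption: the remote data uses gndo:professionOrOccupation.
    ?gnd gndo:professionOrOccupation ?occupation .
  }}
}}
"""
```

Calling `build_federated_query("https://example.org/gnd/sparql", ["Q123"])` returns the query text for a placeholder endpoint and item; the endpoint would also have to be on the approved list for the Wikidata Query Service to contact it.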
Several points for attention
- Direction and sequence of statements often matter for performance
- To reach out from Wikidata, endpoints have to be approved (full list)
- In the other direction, access is normally not restricted
- Some federated queries get extremely slow when large sets of
bindings exist before the remote service is invoked
- Be sure to exclude variables bound to blank nodes (‘unknown
value’ in Wikidata)
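
The last two points can be sketched as a helper that wraps the local part of a query in a subquery, so that only a bounded, blank-node-free set of bindings reaches the remote service. `limited_subquery` is a hypothetical helper. `isBlank` is a standard SPARQL 1.1 function; that unknown values appear as blank nodes in the Query Service is taken from the point above and may depend on the service version.

```python
def limited_subquery(pattern: str, variables: list[str], limit: int) -> str:
    """Wrap a local graph pattern in a subquery with blank-node filters and a LIMIT.

    `variables` are SPARQL variable names including the leading "?".
    """
    projection = " ".join(variables)
    filters = "\n".join(f"    FILTER(!isBlank({v}))" for v in variables)
    return (
        "{\n"
        f"  SELECT {projection} WHERE {{\n"
        f"    {pattern}\n"
        f"{filters}\n"
        "  }\n"
        f"  LIMIT {limit}\n"
        "}"
    )


if __name__ == "__main__":
    print(limited_subquery("?person wdtn:P227 ?gnd .", ["?person", "?gnd"], 50))
```

The returned block is meant to be placed before the `SERVICE` clause inside the outer `WHERE`, so the subquery is evaluated first.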
Further reading on Wikidata/RDF
Application process for a new property
Hints for getting it approved smoothly
- Clearly lay out the motivation and planned use for the property
- Provide working examples (with the formatter URI you are
suggesting)
- Be responsive to comments
Wikidata as a universal linking hub
- easy extensibility with new properties for external identifiers
- immense pool of existing items, with the full set of SKOS mapping
relations for more or less exact mappings to these
- immediate extensibility with new items