Quality control tools and procedures

  • Perception: Anybody can edit anything, so Wikidata is not a reliable source of knowledge
  • Seen as a threat to information systems based on Wikidata
    • particularly by some large Wikipedias (e.g., the English one)
  • Basic policy to address this: Statements should be referenced

QA support for editors

  • Constraint definitions for properties
    • raise warnings during data input when, e.g.,
      • a format definition (ISBN, DOI etc.) is violated
      • a supposedly unique identifier is added to more than one item
    • generated lists of constraint violations (e.g. ZDB ID format)
  • Constraints can be very helpful, but do not cover complex cases
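The two constraint types named above (format and uniqueness) can be sketched in a few lines. This is a minimal illustration, not Wikidata's actual constraint implementation: the regular expressions are simplified assumptions, the property labels and item IDs are made up, and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical, simplified format patterns (assumption): the real constraint
# definitions are maintained on the Wikidata property pages and may differ.
FORMAT_PATTERNS = {
    "DOI": re.compile(r"10\.\d{4,9}/\S+"),
    "ISBN-13": re.compile(r"97[89]-\d{1,5}-\d{1,7}-\d{1,7}-\d"),
}


def format_violations(statements):
    """Return (item, prop, value) triples whose value fails its format pattern."""
    return [
        (item, prop, value)
        for item, prop, value in statements
        if prop in FORMAT_PATTERNS and not FORMAT_PATTERNS[prop].fullmatch(value)
    ]


def uniqueness_violations(statements, unique_props):
    """Return {(prop, value): [items]} for values that occur on more than one item."""
    seen = defaultdict(set)
    for item, prop, value in statements:
        if prop in unique_props:
            seen[(prop, value)].add(item)
    return {key: sorted(items) for key, items in seen.items() if len(items) > 1}


# Made-up sample statements: (item, property, value)
statements = [
    ("Q1", "DOI", "10.1000/xyz123"),
    ("Q2", "DOI", "doi:10.1000/abc"),       # prefix violates the format
    ("Q3", "DOI", "10.1000/xyz123"),        # same DOI as Q1
    ("Q4", "ISBN-13", "978-3-16-148410-0"),
]

print(format_violations(statements))
print(uniqueness_violations(statements, {"DOI"}))
```

Checks of this kind are cheap to run on every edit, which is why they suit input-time warnings; they say nothing about whether a well-formed value is actually correct.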

More QA support for editors

Revision control and patrolling

  • Versioned edits and version control
  • Manual and tool supported vandalism prevention
  • Watchlists
  • Automated flagging of suspect edits (e.g., “new editor deleting statements”)
  • Patrolling
  • Technically very easy to revert edits
  • Semi-protection or protection of frequently vandalized items
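A rule such as “new editor deleting statements” can be expressed as a simple predicate over edit metadata. The sketch below is only an illustration under stated assumptions: the `Edit` record, its fields, and the threshold of 50 prior edits are hypothetical and do not reflect how Wikidata's own filters are configured.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    editor: str
    prior_edit_count: int    # edits the account had made before this one
    statements_added: int
    statements_removed: int


# Assumed cutoff for treating an account as "new"; chosen for illustration only.
NEW_EDITOR_THRESHOLD = 50


def is_suspect(edit: Edit) -> bool:
    """Flag edits where a new editor removes at least one statement."""
    is_new_editor = edit.prior_edit_count < NEW_EDITOR_THRESHOLD
    return is_new_editor and edit.statements_removed > 0


def flag_for_patrol(edits):
    """Return the edits a human patroller should look at first."""
    return [edit for edit in edits if is_suspect(edit)]


# Made-up sample edits
edits = [
    Edit("veteran", prior_edit_count=12000, statements_added=0, statements_removed=3),
    Edit("newcomer_a", prior_edit_count=2, statements_added=1, statements_removed=0),
    Edit("newcomer_b", prior_edit_count=5, statements_added=0, statements_removed=4),
]

for edit in flag_for_patrol(edits):
    print(f"review edit by {edit.editor}: {edit.statements_removed} statement(s) removed")
```

A flag like this only prioritizes edits for review; the decision to revert stays with the patroller.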

[Figure: revision history (Gandhi)]

Automated tools for vandalism detection

  • Fighting to keep up with the rate of human edits in Wikidata (multiple per second)
  • … requires reducing the manual workload, e.g. via
  • and other rule-based and machine-learning tools

Ongoing research