RuDiK: Rule Discovery in Knowledge Bases (demo paper)
Published in the 44th International Conference on Very Large Data Bases (VLDB), 2018
Abstract
RuDiK is a system for discovering declarative rules over knowledge bases (KBs). It discovers both positive rules, which identify relationships between entities, e.g., “if two persons have the same parent, they are siblings”, and negative rules, which identify data contradictions, e.g., “if two persons are married, one cannot be the child of the other”. Rules help domain experts curate data in large KBs: positive rules suggest new facts to mitigate incompleteness, and negative rules detect erroneous facts. Negative rules are also useful for generating negative examples for learning algorithms. RuDiK goes beyond existing solutions in two ways: it discovers rules in a more expressive rule language than previous approaches, which leads to wide coverage of the facts in the KB, and its mining is robust to existing errors and incompleteness in the KB. The system has been deployed for multiple KBs, including Yago, DBpedia, Freebase and WikiData, and identifies new facts and real errors with 85% to 97% accuracy, respectively. This demonstration shows how domain experts can interact with RuDiK. Once the audience picks a KB and a predicate, they can add new facts, remove errors, and train a machine learning system with automatically generated examples.
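To make the two kinds of rules concrete, the following minimal sketch applies the abstract's two example rules to a toy KB of triples. It is not RuDiK's mining algorithm and does not use its API: the rules are written by hand rather than discovered, and the entity names, the predicate names (`hasParent`, `spouse`, `sibling`) and the helper functions are assumptions made for illustration only.

```python
# Toy KB as a set of (subject, predicate, object) triples.
# All entities and predicate names are hypothetical.
kb = {
    ("alice", "hasParent", "carol"),
    ("bob", "hasParent", "carol"),
    ("dave", "hasParent", "erin"),
    ("erin", "spouse", "dave"),  # conflicts with the hasParent fact above
}


def parents_of(kb):
    """Map each entity to the set of its parents recorded in the KB."""
    result = {}
    for s, p, o in kb:
        if p == "hasParent":
            result.setdefault(s, set()).add(o)
    return result


def suggest_siblings(kb):
    """Positive rule: hasParent(a, p) and hasParent(b, p) and a != b => sibling(a, b).

    Returns the sibling facts implied by the rule that are missing from the KB.
    """
    parents = parents_of(kb)
    suggested = set()
    for a, parents_a in parents.items():
        for b, parents_b in parents.items():
            shares_parent = bool(parents_a & parents_b)
            if a != b and shares_parent and (a, "sibling", b) not in kb:
                suggested.add((a, "sibling", b))
    return suggested


def find_contradictions(kb):
    """Negative rule: two married persons cannot be parent and child.

    Returns the pairs of entities whose facts violate the rule.
    """
    parents = parents_of(kb)
    flagged = set()
    for s, p, o in kb:
        if p == "spouse":
            if o in parents.get(s, set()) or s in parents.get(o, set()):
                flagged.add(frozenset((s, o)))
    return flagged


new_facts = suggest_siblings(kb)      # candidate facts to add
conflicts = find_contradictions(kb)   # candidate errors to review
```

In this toy KB the positive rule proposes sibling facts between `alice` and `bob` in both directions, and the negative rule flags the pair `dave` and `erin`, whose `spouse` and `hasParent` facts cannot both be correct. A domain expert would then decide which of the two conflicting facts to remove.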