Thu 24 Oct 2024 11:00 - 11:20 at IBR West - Datalog Chair(s): John Regehr

Datalog is a widely used declarative logic programming language. Datalog engines apply many cross-rule optimizations; bugs in them can cause incorrect results. To detect such optimization bugs, we propose an automated testing approach called Incremental Rule Evaluation (IRE), which synergistically tackles the test oracle and test case generation problems. The core idea behind the test oracle is to compare the results of an optimized program and a program without cross-rule optimization; any difference indicates a bug in the Datalog engine. Our core insight is that, for an optimized, incrementally-generated Datalog program, we can evaluate all rules individually by constructing a reference program to disable the optimizations that are performed among multiple rules. Incrementally generating test cases not only allows us to apply the test oracle for every new rule generated; it also lets us ensure that every newly added rule generates a non-empty result with a given probability and avoid recomputing already-known facts. We implemented IRE as a tool named Deopt, and evaluated Deopt on four mature Datalog engines, namely Soufflé, CozoDB, μZ, and DDlog, and discovered a total of 30 bugs. Of these, 13 were logic bugs, while the remaining 17 were crash and error bugs. Deopt can detect all bugs found by queryFuzz, a state-of-the-art approach. Out of the bugs identified by Deopt, queryFuzz might be unable to detect 5. Our incremental test case generation approach is efficient; for example, for test cases containing 60 rules, our incremental approach can produce 1.17× (for DDlog) to 31.02× (for Soufflé) as many valid test cases with non-empty results as the naive random method. We believe that the simplicity and the generality of the approach will lead to its wide adoption in practice.
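The test oracle described in the abstract can be illustrated with a small Python sketch. This is not Deopt's implementation: the toy evaluator, the rule encoding, and the names `eval_rule`, `eval_program`, `incremental_check`, and `buggy_engine` are all hypothetical, and the sketch assumes that every new rule is non-recursive and defines a fresh relation. Rules are added one at a time; the engine's result for the whole program is compared with its result for a reference program that contains only the new rule, with all previously derived tuples supplied as plain facts.

```python
from itertools import product

# A rule is ((head_name, head_vars), [(body_name, body_vars), ...]).
# Facts are a dict mapping a relation name to a set of tuples.

def eval_rule(rule, facts):
    """Derive all head tuples of one rule from the given facts."""
    (_, head_vars), body = rule
    derived = set()
    # Try every combination of one fact per body atom.
    for combo in product(*(facts.get(name, set()) for name, _ in body)):
        binding, ok = {}, True
        for (_, atom_vars), tup in zip(body, combo):
            for var, const in zip(atom_vars, tup):
                # A variable must be bound to the same constant everywhere.
                if binding.setdefault(var, const) != const:
                    ok = False
        if ok:
            derived.add(tuple(binding[v] for v in head_vars))
    return derived

def eval_program(rules, facts):
    """Toy engine: naive fixpoint evaluation of all rules together."""
    db = {name: set(tuples) for name, tuples in facts.items()}
    changed = True
    while changed:
        changed = False
        for rule in rules:
            head = rule[0][0]
            new = eval_rule(rule, db) - db.get(head, set())
            if new:
                db.setdefault(head, set()).update(new)
                changed = True
    return db

def incremental_check(engine, new_rules, input_facts):
    """Add rules one by one; return the first rule whose results differ."""
    program = []
    known = {name: set(t) for name, t in input_facts.items()}
    for rule in new_rules:
        head = rule[0][0]
        program.append(rule)
        # Whole program: cross-rule optimizations may apply here.
        optimized = engine(program, input_facts).get(head, set())
        # Reference program: only the new rule, earlier results as facts.
        reference = engine([rule], known).get(head, set())
        if optimized != reference:
            return rule
        known[head] = reference
    return None

def buggy_engine(rules, facts):
    """Hypothetical engine whose multi-rule 'optimization' drops a tuple."""
    db = eval_program(rules, facts)
    if len(rules) > 1:
        head = rules[-1][0][0]
        db[head] = set(sorted(db.get(head, set()))[1:])
    return db

facts = {"edge": {(1, 2), (2, 3)}}
rules = [
    (("path2", ("x", "z")), [("edge", ("x", "y")), ("edge", ("y", "z"))]),
    (("src", ("x",)), [("path2", ("x", "z"))]),
]

print(incremental_check(eval_program, rules, facts))  # no discrepancy
print(incremental_check(buggy_engine, rules, facts))  # reports the second rule
```

Because the reference program contains a single rule, an engine has nothing to optimize across rules, so any disagreement with the full program points at the multi-rule evaluation path. A real engine would be invoked as an external process rather than as a Python function.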

Thu 24 Oct

Displayed time zone: Pacific Time (US & Canada)

10:40 - 12:20
Datalog, OOPSLA 2024, at IBR West
Chair(s): John Regehr University of Utah
10:40
20m
Talk
A Typed Multi-Level Datalog IR and its Compiler Framework
OOPSLA 2024
David Klopp JGU Mainz, Sebastian Erdweg JGU Mainz, André Pacak JGU Mainz
11:00
20m
Talk
Finding Cross-rule Optimization Bugs in Datalog Engines
OOPSLA 2024
Chi Zhang Nanjing University, Linzhang Wang Nanjing University, Manuel Rigger National University of Singapore
11:20
20m
Talk
Making Formulog Fast: An Argument for Unconventional Datalog Evaluation (OOPSLA 2024 Distinguished Artifact Award)
OOPSLA 2024
Aaron Bembenek University of Melbourne, Michael Greenberg Stevens Institute of Technology, Stephen Chong Harvard University
11:40
20m
Talk
Object-Oriented Fixpoint Programming with Datalog
OOPSLA 2024
David Klopp JGU Mainz, Sebastian Erdweg JGU Mainz, André Pacak JGU Mainz
12:00
20m
Talk
Scaling Abstraction Refinement for Program Analyses in Datalog Using Graph Neural Networks
OOPSLA 2024
Zhenyu Yan Peking University, Xin Zhang Peking University, Peng Di Ant Group