Thu 24 Oct 2024 11:20 - 11:40 at IBR West - Datalog Chair(s): John Regehr

With its combination of Datalog, SMT solving, and functional programming, the language Formulog provides an appealing mix of features for implementing SMT-based static analyses (e.g., refinement type checking, symbolic execution) in a natural, declarative way. At the same time, the performance of its custom Datalog solver can be an impediment to using Formulog beyond prototyping—a common problem for Datalog variants that aspire to solve large problem instances. In this work we speed up Formulog evaluation, with some surprising results: while 2.2× speedups can be obtained by using the conventional techniques for high-performance Datalog (e.g., compilation, specialized data structures), the big wins come by abandoning the central assumption in modern performant Datalog engines, semi-naive Datalog evaluation. In the place of semi-naive evaluation, we develop eager evaluation, a concurrent Datalog evaluation algorithm that explores the logical inference space via a depth-first traversal order. In practice, eager evaluation leads to an advantageous distribution of Formulog’s SMT workload to external SMT solvers and improved SMT solving times: our eager evaluation extensions to the Formulog interpreter and Soufflé’s code generator achieve mean 5.2× and 7.6× speedups, respectively, over the optimized code generated by off-the-shelf Soufflé on SMT-heavy Formulog benchmarks.

All in all, using compilation and eager evaluation (as appropriate), Formulog implementations of refinement type checking, bottom-up pointer analysis, and symbolic execution achieve speedups on 20 out of 23 bench-marks over previously published, hand-tuned analyses written in F♯, Java, and C++, providing strong evidence that Formulog can be the basis of a realistic platform for SMT-based static analysis. Moreover, our experience adds nuance to the conventional wisdom that traditional semi-naive evaluation is the one-size-fits-all best Datalog evaluation algorithm for static analysis workloads.

Thu 24 Oct

Displayed time zone: Pacific Time (US & Canada) change

10:40 - 12:20
DatalogOOPSLA 2024 at IBR West
Chair(s): John Regehr University of Utah
10:40
20m
Talk
A Typed Multi-Level Datalog IR and its Compiler Framework
OOPSLA 2024
David Klopp JGU Mainz, Sebastian Erdweg JGU Mainz, André Pacak JGU Mainz
DOI
11:00
20m
Talk
Finding Cross-rule Optimization Bugs in Datalog Engines
OOPSLA 2024
Chi Zhang Nanjing University, Linzhang Wang Nanjing University, Manuel Rigger National University of Singapore
DOI
11:20
20m
Talk
Making Formulog Fast: An Argument for Unconventional Datalog EvaluationOOPSLA 2024 Distinguished Artifact Award
OOPSLA 2024
Aaron Bembenek University of Melbourne, Michael Greenberg Stevens Institute of Technology, Stephen Chong Harvard University
DOI Pre-print
11:40
20m
Talk
Object-Oriented Fixpoint Programming with Datalog
OOPSLA 2024
David Klopp JGU Mainz, Sebastian Erdweg JGU Mainz, André Pacak JGU Mainz
DOI
12:00
20m
Talk
Scaling Abstraction Refinement for Program Analyses in Datalog Using Graph Neural Networks
OOPSLA 2024
Zhenyu Yan Peking University, Xin Zhang Peking University, Peng Di Ant Group
DOI