Fast and Optimal Extraction for Sparse Equality GraphsOOPSLA 2024 Distinguished Paper Award
Equality graphs (e-graphs) are used to compactly represent equivalence classes of terms in symbolic reasoning systems. Beyond their original roots in automated theorem proving, e-graphs have been used in a variety of applications. They have become particularly important as the key ingredient in the popular technique of equality saturation, which has notable applications in compiler optimization, program synthesis, program verification, and symbolic execution, among others. In a typical equality saturation workflow, an e-graph is used to store a large number of equalities that are generated by local rewrites during a saturation phase, after which an optimal term is extracted from the e-graph as the output of the technique. However, despite its crucial role in equality saturation, e-graph extraction has received relatively little attention in the literature, which we seek to start addressing in this paper. Extraction is a challenging problem and is notably known to be NP-hard in general, so current equality saturation tools rely either on slow optimal extraction algorithms based on integer linear programming (ILP) or on heuristics that may not always produce the optimal result. In fact, in this paper, we show that e-graph extraction is hard to approximate within any constant ratio. Thus, any such heuristic will produce wildly suboptimal results in the worst case. Fortunately, we show that the problem becomes tractable when the e-graph is sparse, which is the case in many practical applications. We present a novel parameterized algorithm for extracting optimal terms from e-graphs with low treewidth, a measure of how “tree-like” a graph is, and prove its correctness. We also present an efficient Rust implementation of our algorithm and evaluate it against ILP on a number of benchmarks extracted from the Cranelift benchmark suite, a real-world compiler optimization library based on equality saturation. Our algorithm optimally extracts e-graphs with treewidths of up to 10 in a fraction of the time taken by ILP. These results suggest that our algorithm can be a valuable tool for equality saturation users who need to extract optimal terms from sparse e-graphs.