Iterative-Epoch Online Cycle Elimination for Context-Free Language Reachability
Context-free language reachability (CFL-reachability) is a fundamental framework for implementing various static analyses. CFL-reachability utilizes context-free grammar (CFG) to extend the expressiveness of ordinary graph reachability from an unlabeled graph to an edge-labeled graph. Solving CFL-reachability requires a (sub)cubic time complexity with respect to the graph size, which limits its scalability in practice. Thus, an approach that can effectively reduce the graph size while maintaining the reachability result is highly desirable. Most of the existing graph simplification techniques for CFL-reachability work during the preprocessing stage, i.e., before the dynamic CFL-reachability solving process. However, in real-world CFL-reachability analyses, there is a large number of reducible nodes and edges that can only be discovered during dynamic solving, leaving significant room for on-the-fly improvements.
This paper aims to reduce the graph size of CFL-reachability dynamically via online cycle elimination. We propose a simple yet effective approach to detect collapsible cycles in the graph based on the input context-free grammar. Our key insight is that symbols with particular forms of production rules in the grammar are the essence of transitivity of reachability relations in the graph. Specifically, in the graph, a reachability relation to a node $v_i$ can be “transited” to another node $v_j$ if there is a transitive relation from $v_i$ to $v_j$, and cycles formed by transitive relations are collapsible. In this paper, we present an approach to identify the \textit{transitive symbols} in a context-free grammar, and propose an \textit{iterative-epoch} framework for online cycle elimination. From the perspective of non-parallelized CFL-reachability solving, our iterative-epoch framework is well compatible with both the standard (unordered) solver and the recent ordered solver, and can significantly improve their performance. Our experiment on context-sensitive value-flow analysis for C/C++ and field-sensitive alias analysis for Java demonstrates promising performance improvement by our iterative-epoch cycle elimination technique. By collapsing cycles online, our technique accelerates the standard solver by 17.17× and 13.94× for value-flow analysis and alias analysis, respectively, with memory reductions of 48.8% and 45.0%. Besides, our technique can also accelerates the ordered solver by 14.32× and 8.36× for value-flow analysis and alias analysis, respectively, with memory reductions of 55.2% and 57.8%.