Sound and partially-complete static analysis of data-races in GPU programs
GPUs are progressively being integrated into modern society, playing a pivotal role in Artificial Intelligence and High-Performance Computing. Programmers need a deep understanding of the GPU programming model to avoid subtle data-races in their codes. Static verification that is sound and incomplete can guarantee data-race freedom, but the alarms it raises may be spurious and need to be validated.
In this paper, we establish a True Positive Theorem for a static data-race detector for GPU programs, i.e., a result that identifies a class of programs for which our technique only raises true alarms. Our work builds on the formalism of memory access protocols, that models the concurrency operations of CUDA programs. The crux of our approach is an approximation analysis that can correctly identify true alarms, and pinpoint the conditions that make an alarm imprecise. Our approximation analysis detects when the reported locations are reachable (control independence, or CI), and when the reported locations are precise (data independence, or DI), as well identify inexact values in an alarm (Theorem 4.3). In addition to a True Positive result for programs that are CI and DI (Theorem 4.5), we establish the root causes of spurious alarms depending on whether CI or DI are present (Corollary 4.4).
We apply our theory to introduce FaialAA, the first sound and partially complete data-race detector. We evaluate FaialAA in three experiments. First, in a comparative study with the state-of-the-art tools, we show that FaialAA confirms more DRF programs than others while emitting 1.9× fewer potential alarms. Importantly, the approximation analysis of FaialAA detects 10 undocumented data-races. Second, in an experiment studying 6 commits of data-race fixes in open source projects OpenMM and Nvidia’s MegaTron, FaialAA confirmed the buggy and fixed versions of 5 commits, while others were only able to confirm 2. Third, we show that 59.5% of 2,770 programs are CI and DI, quantifying when the approximation analysis of FaialAA is complete.
This paper is accompanied by the mechanized proofs of the theoretical results presented therein and a tool (FaialAA) implementing of our theory.
Wed 23 OctDisplayed time zone: Pacific Time (US & Canada) change
10:40 - 12:20 | Static Analysis and Program Verification 1OOPSLA 2024 at IBR East Chair(s): Di Wang Peking University | ||
10:40 20mTalk | Cocoon: Static Information Flow Control in Rust OOPSLA 2024 Ada Lamba Ohio State University, Max Taylor Ohio State University, Vincent Beardsley Ohio State University, Jacob Bambeck Ohio State University, Michael D. Bond Ohio State University, Zhiqiang Lin The Ohio State University DOI | ||
11:00 20mTalk | Computing Precise Control Interface Specifications OOPSLA 2024 Eric Campbell Cornell University, Hossein Hojjat Tehran Institute for Advanced Studies (TeIAS), Nate Foster Cornell University and Jane Street DOI | ||
11:20 20mTalk | FlowCert: Translation Validation for Asynchronous Dataflow Programs via Dynamic Fractional Permissions OOPSLA 2024 Zhengyao Lin Carnegie Mellon University, Joshua Gancher Northeastern University, Bryan Parno Carnegie Mellon University DOI | ||
11:40 20mTalk | ParDiff: Practical Static Differential Analysis of Network Protocol ParsersOOPSLA 2024 Distinguished Paper Award OOPSLA 2024 Mingwei Zheng Purdue University, Qingkai Shi Nanjing University, Xuwei Liu Purdue University, USA, Xiangzhe Xu Purdue University, Le Yu , Congyu Liu Purdue University, Guannan Wei Inria/ENS; Tufts University, Xiangyu Zhang Purdue University DOI | ||
12:00 20mTalk | Sound and partially-complete static analysis of data-races in GPU programs OOPSLA 2024 Dennis Liew University of Massachusetts Boston, Tiago Cogumbreiro University of Massachusetts Boston, Julien Lange Royal Holloway, University of London DOI |