Thu 24 Oct 2024, 14:00 - 14:20, at IBR East - Machine Learning and Programming Languages. Chair(s): Loris D'Antoni

While deep neural networks provide state-of-the-art solutions to a wide range of programming language tasks, their effectiveness in dealing with foundational program analysis tasks remains an open question. In this paper, we present an empirical study that evaluates prominent models of code (i.e., CuBERT, CodeBERT, GGNN, and Graph Sandwiches) on two such foundational tasks: (1) alias prediction, in which models predict whether two pointers must alias, may alias, or must not alias; and (2) equivalence prediction, in which models predict whether or not two programs are semantically equivalent. At the core of this study is CodeSem, a dataset built from the source code of real-world flagship software (e.g., Linux Kernel, GCC, MySQL) and manually validated for the two prediction tasks. Results show that all models are accurate in both prediction tasks, especially CuBERT, which achieves 89% accuracy in alias prediction and 84% in equivalence prediction. We also conduct a comprehensive, in-depth analysis of the results of all models on both tasks, concluding that deep learning models are generally capable of performing foundational program analysis tasks, even though their weaknesses are evident in specific cases.

Our code and evaluation data are publicly available at https://github.com/CodeSemDataset/CodeSem.
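To make the two prediction tasks concrete, here is a minimal, hypothetical C sketch (not drawn from CodeSem; the function names alias_examples, sum_up, and sum_down are illustrative only) showing the three alias classes and a pair of programs a model should label as semantically equivalent:

```c
#include <stdlib.h>

/* Alias prediction: classify the relationship between two pointers. */
void alias_examples(int *a, size_t n) {
    int *p = a;                  /* must alias: p always points where a points */
    int *q = a + (n % 2);        /* may alias: q == a only when n is even */
    int *r = malloc(sizeof *r);  /* must not alias: fresh heap allocation */
    if (r) { *r = 1; free(r); }
    (void)p; (void)q;
}

/* Equivalence prediction: these two functions compute the same sum
 * for every input, so the pair should be labeled "equivalent". */
int sum_up(int n) {
    int s = 0;
    for (int i = 1; i <= n; i++) s += i;
    return s;
}

int sum_down(int n) {
    int s = 0;
    for (int i = n; i >= 1; i--) s += i;
    return s;
}
```

A static analysis derives such labels conservatively from the program text; the study asks whether learned models can predict them directly from source code.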

Thu 24 Oct

Displayed time zone: Pacific Time (US & Canada)

13:40 - 15:20: Machine Learning and Programming Languages (OOPSLA 2024) at IBR East
Chair(s): Loris D'Antoni (UCSD)
13:40 (20m) Talk
CYCLE: Learning to Self-Refine the Code Generation (OOPSLA 2024)
Yangruibo Ding (Columbia University), Marcus J. Min (Columbia University), Gail Kaiser (Columbia University), Baishakhi Ray (Columbia University, New York; AWS AI Lab)
14:00 (20m) Talk
Evaluating the Effectiveness of Deep Learning Models for Foundational Program Analysis Tasks (OOPSLA 2024)
Qian Chen (Nanjing University), Chenyang Yu (Department of Computer Science and Technology, Nanjing University), Ruyan Liu (Department of Computer Science and Technology, Nanjing University), Chi Zhang (Nanjing University), Yu Wang (Nanjing University), Ke Wang, Ting Su (East China Normal University), Linzhang Wang (Nanjing University)
14:20 (20m) Talk
Knowledge Transfer from High-Resource to Low-Resource Programming Languages for Code LLMs (OOPSLA 2024)
Federico Cassano (Northeastern University), John Gouwar (Northeastern University), Francesca Lucchetti (Northeastern University), Claire Schlesinger (Northeastern University), Anders Freeman (Wellesley College), Carolyn Jane Anderson (Wellesley College), Molly Q Feldman (Oberlin College), Michael Greenberg (Stevens Institute of Technology), Abhinav Jangda (Microsoft Research), Arjun Guha (Northeastern University; Roblox)
14:40 (20m) Talk
Statically Contextualizing Large Language Models with Typed Holes (OOPSLA 2024)
Andrew Blinn (University of Michigan), Xiang Li (University of Michigan, Ann Arbor), June Hyung Kim (University of Michigan), Cyrus Omar (University of Michigan)
15:00 (20m) Talk
WhiteFox: White-box Compiler Fuzzing Empowered by Large Language Models (OOPSLA 2024)
Chenyuan Yang (University of Illinois at Urbana-Champaign), Yinlin Deng (University of Illinois at Urbana-Champaign), Runyu Lu (Huazhong University of Science and Technology), Jiayi Yao (The Chinese University of Hong Kong, Shenzhen), Jiawei Liu (University of Illinois at Urbana-Champaign), Reyhaneh Jabbarvand (University of Illinois at Urbana-Champaign), Lingming Zhang (University of Illinois at Urbana-Champaign)