Reward Augmentation in Reinforcement Learning for Testing Distributed Systems
Bugs in widely used distributed protocol implementations have been the source of many downtimes in popular internet services. We describe a randomized testing approach for distributed protocol implementations based on reinforcement learning. Since the natural reward structure is very sparse, the key to successful exploration in reinforcement learning is reward augmentation. We show two different techniques that build on one another. First, we provide a decaying exploration bonus based on the discovery of new states: the reward decays as the same state is visited multiple times. The exploration bonus captures the intuition from coverage-guided fuzzing of prioritizing new coverage points; in contrast to other schemes, we show that taking the maximum of the bonus and the Q-value leads to more effective exploration. Second, we provide waypoints to the algorithm as a sequence of predicates that capture interesting semantic scenarios. Waypoints exploit designer insight about the protocol and guide the exploration to “interesting” parts of the state space. Our reward structure ensures that new episodes can reliably get to deep interesting states even without execution caching. We have implemented our algorithm in Go. Our evaluation on three large benchmarks (RedisRaft, Etcd, and RSL) shows that our algorithm can significantly outperform baseline approaches in terms of coverage and bug finding.
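The first technique in the abstract, a decaying exploration bonus combined with the Q-value by taking a maximum, can be illustrated with a minimal Go sketch. This is one plausible reading of the abstract, not the authors' implementation: the `1/(n+1)` decay schedule, the tabular update rule, the learning rate `Alpha`, the discount `Gamma`, and every type and function name below are assumptions made for illustration.

```go
package main

// State and Action are hypothetical identifiers, e.g. a hash of an
// abstracted protocol state and the name of a scheduling choice.
type State string
type Action string

// Agent holds tabular Q-values and per-state visit counts.
type Agent struct {
	Alpha  float64 // learning rate (assumed parameter)
	Gamma  float64 // discount factor (assumed parameter)
	visits map[State]int
	q      map[State]map[Action]float64
}

func NewAgent(alpha, gamma float64) *Agent {
	return &Agent{
		Alpha:  alpha,
		Gamma:  gamma,
		visits: make(map[State]int),
		q:      make(map[State]map[Action]float64),
	}
}

// Bonus returns a reward that decays as s is revisited.
// The 1/(n+1) schedule is an assumption, not taken from the source.
func (a *Agent) Bonus(s State) float64 {
	return 1.0 / float64(a.visits[s]+1)
}

// MaxQ returns the largest Q-value recorded for s, or 0 if s has none.
func (a *Agent) MaxQ(s State) float64 {
	best, seen := 0.0, false
	for _, v := range a.q[s] {
		if !seen || v > best {
			best, seen = v, true
		}
	}
	return best
}

// Q returns the current estimate for (s, act); unseen pairs are 0.
func (a *Agent) Q(s State, act Action) float64 {
	return a.q[s][act]
}

// Update records the transition s --act--> next. The learning target is
// the maximum of the exploration bonus of next and the discounted best
// Q-value at next, rather than their sum (assumed form of the update).
func (a *Agent) Update(s State, act Action, next State) {
	target := a.Bonus(next)
	a.visits[next]++
	if future := a.Gamma * a.MaxQ(next); future > target {
		target = future
	}
	if a.q[s] == nil {
		a.q[s] = make(map[Action]float64)
	}
	a.q[s][act] = (1-a.Alpha)*a.q[s][act] + a.Alpha*target
}
```

In this sketch, a state that has never been seen yields the full bonus, so the edge leading to it is reinforced immediately. Once the bonus of a frequently visited state falls below its discounted Q-value, the Q-value takes over as the target, so value learned from deeper states still propagates backward along the path.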
Fri 25 Oct. Displayed time zone: Pacific Time (US & Canada)
16:00 - 17:40 | Testing Everything, Everywhere, All At Once | OOPSLA 2024 at IBR East | Chair(s): Alex Potanin (Australian National University)
16:00 | 20m | Talk | Crabtree: Rust API Test Synthesis Guided by Coverage and Type | OOPSLA 2024 | Yoshiki Takashima (Carnegie Mellon University), Chanhee Cho (Carnegie Mellon University), Ruben Martins (Carnegie Mellon University), Limin Jia, Corina S. Păsăreanu (Carnegie Mellon University; NASA Ames)
16:20 | 20m | Talk | Drowzee: Metamorphic Testing for Fact-conflicting Hallucination Detection in Large Language Models | OOPSLA 2024 | Ningke Li (Huazhong University of Science and Technology), Yuekang Li (UNSW), Yi Liu (Nanyang Technological University), Ling Shi (Nanyang Technological University), Kailong Wang (Huazhong University of Science and Technology), Haoyu Wang (Huazhong University of Science and Technology)
16:40 | 20m | Talk | Reward Augmentation in Reinforcement Learning for Testing Distributed Systems | OOPSLA 2024 | Andrea Borgarelli (Max Planck Institute for Software Systems), Constantin Enea (LIX, CNRS, Ecole Polytechnique), Rupak Majumdar (MPI-SWS), Srinidhi Nagendra (CNRS, Université Paris Cité, IRIF, Chennai Mathematical Institute)
17:00 | 20m | Talk | Rustlantis: Randomized Differential Testing of the Rust Compiler | OOPSLA 2024
17:20 | 20m | Talk | Statistical Testing of Quantum Programs via Fixed-Point Amplitude Amplification | OOPSLA 2024