ICSE 2025
Sat 26 April - Sun 4 May 2025 Ottawa, Ontario, Canada
Thu 1 May 2025 12:00 - 12:15 at 214 - AI for Testing and QA 3 Chair(s): Mike Papadakis

Many popular programming languages, including C#, Java, and Python, support exceptions. Exceptions are thrown during program execution if an unwanted event happens, e.g., a method is invoked with an illegal argument value. Software developers write exceptional behavior tests (EBTs) to check that their code detects unwanted events and throws appropriate exceptions. Prior research studies have shown the importance of EBTs, but they also highlighted that developers focus most of their effort on “happy paths”, i.e., paths without unwanted events. To help developers fill this gap, we present the first framework, dubbed exLong, that automatically generates EBTs. exLong is a large language model instruction-tuned from CodeLlama that embeds reasoning about traces that lead to throw statements, conditional expressions that guard throw statements, and non-exceptional behavior tests that execute similar traces. We compare exLong with a state-of-the-art model for test generation (CAT-LM) and one of the strongest foundation models (GPT-3.5), as well as with analysis-based test generation tools (Randoop and EvoSuite). Our results show that exLong outperforms existing models and tools. Furthermore, we contributed several pull requests to open-source projects, and 23 EBTs generated by exLong have already been accepted.
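For readers unfamiliar with the term, the sketch below illustrates what an exceptional behavior test looks like in Java with JUnit 5. The BoundedStack class and both tests are hypothetical examples written for this summary; they are not taken from the paper or from exLong's output, but they show the ingredients exLong reasons about: a guarded throw statement, the conditional expression that guards it, and the trace a test must execute to reach it.

import static org.junit.jupiter.api.Assertions.assertThrows;

import org.junit.jupiter.api.Test;

// Hypothetical class under test: a fixed-capacity stack with two guarded throw statements.
class BoundedStack {
    private final int[] items;
    private int size;

    BoundedStack(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        items = new int[capacity];
    }

    void push(int value) {
        if (size == items.length) {
            throw new IllegalStateException("stack is full");
        }
        items[size++] = value;
    }
}

class BoundedStackTest {
    // EBT: the constructor's throw statement fires when the method is
    // invoked with an illegal argument value.
    @Test
    void constructorRejectsNonPositiveCapacity() {
        assertThrows(IllegalArgumentException.class, () -> new BoundedStack(0));
    }

    // EBT: the throw statement in push() is only reachable after a trace that
    // fills the stack, so the test first executes that non-exceptional prefix.
    @Test
    void pushOnFullStackThrows() {
        BoundedStack stack = new BoundedStack(1);
        stack.push(42);
        assertThrows(IllegalStateException.class, () -> stack.push(7));
    }
}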

Thu 1 May

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
AI for Testing and QA 3 (Research Track) at 214
Chair(s): Mike Papadakis University of Luxembourg
11:00
15m
Talk
A Multi-Agent Approach for REST API Testing with Semantic Graphs and LLM-Driven Inputs
Research Track
Myeongsoo Kim Georgia Institute of Technology, Tyler Stennett Georgia Institute of Technology, Saurabh Sinha IBM Research, Alessandro Orso Georgia Institute of Technology
11:15
15m
Talk
ClozeMaster: Fuzzing Rust Compiler by Harnessing LLMs for Infilling Masked Real Programs [Artifact-Available]
Research Track
Hongyan Gao State Key Laboratory for Novel Software Technology, Nanjing University, Yibiao Yang Nanjing University, Maolin Sun Nanjing University, Jiangchang Wu State Key Laboratory for Novel Software Technology, Nanjing University, Yuming Zhou Nanjing University, Baowen Xu State Key Laboratory for Novel Software Technology, Nanjing University
11:30
15m
Talk
LLM Based Input Space Partitioning Testing for Library APIs [Artifact-Functional] [Artifact-Available]
Research Track
Jiageng Li Fudan University, Zhen Dong Fudan University, Chong Wang Nanyang Technological University, Haozhen You Fudan University, Cen Zhang Georgia Institute of Technology, Yang Liu Nanyang Technological University, Xin Peng Fudan University
11:45
15m
Talk
Leveraging Large Language Models for Enhancing the Understandability of Generated Unit Tests [Artifact-Functional] [Artifact-Available]
Research Track
Amirhossein Deljouyi Delft University of Technology, Roham Koohestani Delft University of Technology, Maliheh Izadi Delft University of Technology, Andy Zaidman TU Delft
12:00
15m
Talk
exLong: Generating Exceptional Behavior Tests with Large Language Models [Artifact-Available]
Research Track
Jiyang Zhang University of Texas at Austin, Yu Liu Meta, Pengyu Nie University of Waterloo, Junyi Jessy Li University of Texas at Austin, Milos Gligoric University of Texas at Austin
12:15
15m
Talk
TOGLL: Correct and Strong Test Oracle Generation with LLMs [Artifact-Available]
Research Track
Soneya Binta Hossain University of Virginia, Matthew B Dwyer University of Virginia