Event

MuLLSA: Mutation with LLM and Static Analysis

Master's thesis defense by Arthur Barbieux

Category: master's thesis
Date: 29/08/2025 11:00 - 29/08/2025 12:30
Location: Salle Académique
Speaker(s): Arthur Barbieux
Organizer(s): Isabelle Daelman

Looking for bugs and vulnerabilities is one of the most important tasks in computer science, particularly in the context of web applications. Many techniques exist to detect and prevent such issues, and one widely used method is mutation testing: a process where small changes (mutants) are introduced into code to evaluate the effectiveness of testing tools or security mechanisms.
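As a generic illustration of the idea (not the tooling evaluated in this work), the following Python sketch shows an artificial mutant obtained by flipping a comparison operator, together with a toy test suite that "kills" it; all names are hypothetical:

```python
# Illustrative only: a generic mutation-testing example, not the MuLLSA tooling.

def is_admin(role: str) -> bool:
    """Original: only the exact role "admin" grants access."""
    return role == "admin"

def is_admin_mutant(role: str) -> bool:
    """Mutant: the comparison operator is flipped (== becomes !=),
    simulating a broken access-control check."""
    return role != "admin"

def test_suite(check) -> bool:
    """A test suite "kills" a mutant if at least one assertion fails on it."""
    cases = [("admin", True), ("guest", False)]
    return all(check(role) == expected for role, expected in cases)

if __name__ == "__main__":
    assert test_suite(is_admin)              # the original passes all tests
    assert not test_suite(is_admin_mutant)   # the mutant is killed: tests are effective
    print("mutant killed")
```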
 

The automatic creation of mutants has been studied and used for years in the field of software testing, especially to assess the robustness of analysis tools. Traditionally, however, creating meaningful mutants that simulate real-world vulnerabilities remains a challenge, especially when done manually: it is both time-consuming and error-prone. To address this, we propose a new approach that combines static analysis with a Large Language Model (LLM) to automatically generate mutants. In this study, we compare the performance of an LLM in producing mutants based on three different static analysis tools: KAVe, WAP, and the LLM itself.
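The abstract does not describe the interfaces involved, so the following Python sketch is purely hypothetical: it assumes a simplified record for a static-analysis finding and a generic llm_call wrapper, and shows how such a finding might be turned into a mutant-generation prompt:

```python
# Hypothetical sketch of the pipeline described above; the Finding fields,
# build_prompt, and generate_mutant are illustrative assumptions, not the
# actual MuLLSA interfaces.
from dataclasses import dataclass

@dataclass
class Finding:
    """A simplified static-analysis report entry (e.g. from KAVe or WAP)."""
    file: str
    line: int
    vulnerability: str  # e.g. "SQL injection", "XSS"
    snippet: str        # the flagged source fragment

def build_prompt(finding: Finding) -> str:
    """Turn a finding into an LLM prompt asking for a mutant that
    introduces the reported vulnerability while keeping the code valid."""
    return (
        f"The following code at {finding.file}:{finding.line} was flagged "
        f"for {finding.vulnerability}:\n\n{finding.snippet}\n\n"
        "Produce a minimal mutant of this code that exhibits the "
        "vulnerability. Return only the mutated code, keeping it "
        "syntactically valid."
    )

def generate_mutant(finding: Finding, llm_call) -> str:
    """llm_call is any callable wrapping an LLM API (stubbed below)."""
    return llm_call(build_prompt(finding))

if __name__ == "__main__":
    f = Finding("login.php", 42, "SQL injection",
                '$q = "SELECT * FROM users WHERE id = " . intval($_GET["id"]);')
    # A real run would pass an actual LLM client; here a stub shows the flow.
    print(generate_mutant(f, llm_call=lambda prompt: "<LLM output here>"))
```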

Our results show significant variability between tools. The quality of mutants generated from traditional static analysis varies heavily depending on the type of vulnerability, and tends to improve when tools are combined.

With the LLM, mutant quality is more consistent across different vulnerabilities, and the overall code coverage is significantly higher than with traditional approaches.
 

In our evaluation, we found that many of the generated mutants were not executable, mainly due to syntax or semantic errors, especially among those created by traditional static analysis tools. The LLM-based approach produced significantly more executable mutants, showing better preservation of code structure.

However, this comes at the cost of a higher number of equivalent mutants, which behave in the same way as the original code and offer little value for mutation testing.
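To make the notion concrete, here is a hypothetical Python example of an equivalent mutant: the source text changes, but no input can distinguish the mutant from the original, so no test can kill it:

```python
# Illustrative example of an equivalent mutant (names are hypothetical):
# the mutation changes the source text but not the observable behaviour.

def clamp(x: int) -> int:
    """Original: negative values are mapped to zero."""
    if x < 0:
        return 0
    return x

def clamp_mutant(x: int) -> int:
    """Mutant: "< 0" becomes "<= 0". For x == 0 both branches return 0,
    so the two functions are behaviourally identical."""
    if x <= 0:
        return 0
    return x

# No input separates the two, so the mutant cannot be "killed" by any test.
assert all(clamp(x) == clamp_mutant(x) for x in range(-10, 11))
```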
 

These findings suggest that LLMs are a promising addition to automated vulnerability-testing workflows, especially when used in conjunction with static analysis tools. However, further refinement is needed to reduce the generation of incorrect or equivalent mutants and to better align with real-world usability.

Contact: Isabelle Daelman - isabelle.daelman@unamur.be