Fault localization is a critical and time-consuming step in software debugging, traditionally requiring extensive test suites or program instrumentation. This project proposes a novel approach in which two large language models (LLMs) form a self-improving pair, inspired by generative adversarial setups.
- Fault Injector LLM: acts as a bug generator, producing C++ programs with synthetic faults.
- Debugger LLM: acts as a fault localizer, identifying the buggy code regions in the generated programs.
The interaction between these two models enables continuous data generation and iterative improvement without reliance on static, human-curated datasets.
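The interaction described above can be sketched as a simple loop. The two functions below are hypothetical, deterministic stubs standing in for the Fault Injector and Debugger LLMs (names and logic are illustrative assumptions, not the project's implementation); they exist only to make the control flow of the generate-then-localize round concrete.

```python
# Sketch of one injector/debugger round. inject_fault and localize_fault
# are stand-ins for the two LLMs: the stub injector flips a "<=" to "<"
# (a classic off-by-one bug), and the stub debugger diffs against the
# original. In the real system both would be model calls.

def inject_fault(program: str) -> tuple[str, int]:
    """Stub 'Fault Injector LLM': mutates one comparison and returns
    (buggy_program, index_of_faulty_line), or -1 if nothing was mutated."""
    lines = program.splitlines()
    for i, line in enumerate(lines):
        if "<=" in line:
            lines[i] = line.replace("<=", "<", 1)
            return "\n".join(lines), i
    return program, -1

def localize_fault(buggy: str, original: str) -> int:
    """Stub 'Debugger LLM': reports the first line that differs."""
    for i, (a, b) in enumerate(zip(buggy.splitlines(), original.splitlines())):
        if a != b:
            return i
    return -1

def training_round(corpus: list[str]) -> float:
    """One generation/localization round over a corpus; the resulting
    accuracy is what would drive fine-tuning of the debugger."""
    hits = 0
    for prog in corpus:
        buggy, true_line = inject_fault(prog)
        if true_line >= 0 and localize_fault(buggy, prog) == true_line:
            hits += 1
    return hits / len(corpus)
```

Because the injector always knows where it planted the fault, every generated program comes with free ground-truth labels, which is what removes the dependence on human-curated datasets.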
Existing fault localization datasets are limited in size and diversity. By using an LLM as a fault injector, the system can dynamically generate an unlimited supply of diverse and realistic bugs. This allows the debugger LLM to continuously fine-tune on new data, potentially improving localization accuracy over time.
- Bug Generation: implemented using the Gemma-3-12B model to generate 5,000 C++ programs with synthetic bugs.
- Debugger Training: a similar 12B-parameter model was fine-tuned on the synthetic corpus using parameter-efficient fine-tuning (PEFT with QLoRA).
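A PEFT/QLoRA setup of the kind mentioned above might look like the following sketch, using the Hugging Face `peft` and `transformers` libraries. The hyperparameters (rank, alpha, target modules) are illustrative assumptions, not the project's actual settings.

```python
# Hedged sketch of a QLoRA fine-tuning configuration: the frozen base
# model is loaded in 4-bit NF4 quantization, and small low-rank adapters
# are trained on top of it. Hyperparameter values are assumptions.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization of the base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Low-rank adapters on the attention projections; only these
# adapter weights are updated during fine-tuning.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

Training only the adapters over a 4-bit base is what makes fine-tuning a 12B-parameter model feasible on the limited hardware noted later in this report.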
Preliminary results indicate that the fine-tuned debugger shows modest improvements compared to its base model:
- Training loss decreased during fine-tuning.
- Fault localization accuracy improved over the base model.
These results suggest the viability of the LLM-vs-LLM approach, even with limited computational resources.
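One common way to score the fault localization accuracy reported above is top-k accuracy: a prediction counts as correct if the ground-truth buggy line appears among the debugger's k highest-ranked suspects. The function below sketches that metric; it is an assumption about a reasonable evaluation protocol, not necessarily the project's exact one.

```python
# Top-k fault localization accuracy: fraction of samples whose true
# faulty line appears in the debugger's top-k ranked suspects.

def top_k_accuracy(predictions: list[list[int]],
                   truths: list[int],
                   k: int = 1) -> float:
    """predictions[i] is the debugger's ranked list of suspect line
    numbers for sample i; truths[i] is the injected faulty line."""
    hits = sum(truth in ranked[:k]
               for ranked, truth in zip(predictions, truths))
    return hits / len(truths)
```

For example, with ranked suspects `[[3, 7], [5, 2]]` and ground truth `[3, 2]`, top-1 accuracy is 0.5 and top-2 accuracy is 1.0.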
Several limitations prevented full realization of the iterative training loop:
- Limited computing resources.
- Long training times for large models.
- Difficulty in ensuring the realism and diversity of injected faults.
The approach is situated within, and compared against, existing literature on fault localization, automated debugging, and LLM-based code analysis.
Planned extensions include:
- Incorporation of real-world bug benchmarks such as Defects4J and CodeNet.
- Pipeline and training optimizations to reduce computational overhead.
- A reinforced feedback loop between the fault injector and debugger to enable true self-improvement.
These findings suggest that, with adequate resources, an LLM-vs-LLM framework has the potential to significantly advance automated fault localization.