Fault localization is a critical and time-consuming step in software debugging, traditionally requiring extensive test suites or program instrumentation. This project proposes a novel approach in which two large language models (LLMs) form a self-improving pair, inspired by generative adversarial setups.
- Fault Injector LLM: acts as a bug generator, producing C++ programs with synthetic faults.
- Debugger LLM: acts as a fault localizer, identifying the buggy code regions in the generated programs.
The interaction between these two models enables continuous data generation and iterative improvement without reliance on static, human-curated datasets.
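The interaction described above can be sketched as a simple loop. The two functions below are hypothetical, deterministic stubs standing in for the Fault Injector and Debugger LLMs (names and logic are illustrative assumptions, not the project's implementation); they exist only to make the control flow of the generate-then-localize round concrete.

```python
# Sketch of one injector/debugger round. inject_fault and localize_fault
# are stand-ins for the two LLMs: the stub injector flips a "<=" to "<"
# (a classic off-by-one bug), and the stub debugger diffs against the
# original. In the real system both would be model calls.

def inject_fault(program: str) -> tuple[str, int]:
    """Stub 'Fault Injector LLM': mutates one comparison and returns
    (buggy_program, index_of_faulty_line), or -1 if nothing was mutated."""
    lines = program.splitlines()
    for i, line in enumerate(lines):
        if "<=" in line:
            lines[i] = line.replace("<=", "<", 1)
            return "\n".join(lines), i
    return program, -1

def localize_fault(buggy: str, original: str) -> int:
    """Stub 'Debugger LLM': reports the first line that differs."""
    for i, (a, b) in enumerate(zip(buggy.splitlines(), original.splitlines())):
        if a != b:
            return i
    return -1

def training_round(corpus: list[str]) -> float:
    """One generation/localization round over a corpus; the resulting
    accuracy is what would drive fine-tuning of the debugger."""
    hits = 0
    for prog in corpus:
        buggy, true_line = inject_fault(prog)
        if true_line >= 0 and localize_fault(buggy, prog) == true_line:
            hits += 1
    return hits / len(corpus)
```

Because the injector always knows where it planted the fault, every generated program comes with free ground-truth labels, which is what removes the dependence on human-curated datasets.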
Existing fault localization datasets are limited in size and diversity. By using an LLM as a fault injector, the system can dynamically generate an unlimited supply of diverse and realistic bugs. This allows the debugger LLM to continuously fine-tune on new data, potentially improving localization accuracy over time.
- Bug Generation: implemented using the Gemma-3-12B model to generate 5,000 C++ programs with synthetic bugs.
- Debugger Training: a similar 12B-parameter model was fine-tuned on the synthetic corpus using parameter-efficient fine-tuning (PEFT with QLoRA).
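A PEFT/QLoRA setup of the kind mentioned above might look like the following sketch, using the Hugging Face `peft` and `transformers` libraries. The hyperparameters (rank, alpha, target modules) are illustrative assumptions, not the project's actual settings.

```python
# Hedged sketch of a QLoRA fine-tuning configuration: the frozen base
# model is loaded in 4-bit NF4 quantization, and small low-rank adapters
# are trained on top of it. Hyperparameter values are assumptions.
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit quantization of the base model (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Low-rank adapters on the attention projections; only these
# adapter weights are updated during fine-tuning.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

Training only the adapters over a 4-bit base is what makes fine-tuning a 12B-parameter model feasible on the limited hardware noted later in this report.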
Preliminary results indicate that the fine-tuned debugger shows modest improvements compared to its base model:
- Training loss decreased during fine-tuning.
- Fault localization accuracy improved over the base model.
These results suggest the viability of the LLM-vs-LLM approach, even with limited computational resources.
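One common way to score the fault localization accuracy reported above is top-k accuracy: a prediction counts as correct if the ground-truth buggy line appears among the debugger's k highest-ranked suspects. The function below sketches that metric; it is an assumption about a reasonable evaluation protocol, not necessarily the project's exact one.

```python
# Top-k fault localization accuracy: fraction of samples whose true
# faulty line appears in the debugger's top-k ranked suspects.

def top_k_accuracy(predictions: list[list[int]],
                   truths: list[int],
                   k: int = 1) -> float:
    """predictions[i] is the debugger's ranked list of suspect line
    numbers for sample i; truths[i] is the injected faulty line."""
    hits = sum(truth in ranked[:k]
               for ranked, truth in zip(predictions, truths))
    return hits / len(truths)
```

For example, with ranked suspects `[[3, 7], [5, 2]]` and ground truth `[3, 2]`, top-1 accuracy is 0.5 and top-2 accuracy is 1.0.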
Several limitations prevented full realization of the iterative training loop:
- Limited computing resources.
- Long training times for large models.
- Difficulty in ensuring the realism and diversity of injected faults.
The approach is situated within, and compared against, existing literature on fault localization, automated debugging, and LLM-based code analysis.
Planned extensions include:
- Incorporation of real-world bug benchmarks such as Defects4J and CodeNet.
- Pipeline and training optimizations to reduce computational overhead.
- A reinforced feedback loop between the fault injector and debugger to enable true self-improvement.
These findings suggest that, with adequate resources, an LLM-vs-LLM framework has the potential to significantly advance automated fault localization.