Issue-to-Files Fine-Tune: Training LLMs to Map Bug Reports to Code Changes

The Problem

When a bug report arrives, engineers spend significant time just figuring out which files are relevant before they can write a single line of fix.

NEO built Issue-to-Files Fine-Tune to train a code model that reads a GitHub issue and outputs a ranked list of files most likely to require changes.

Training Data Extraction

Issue-to-Files Fine-Tune builds its dataset by mining the closed issue and pull request history of any GitHub repository. For each closed issue that has a linked PR, the pipeline extracts the issue body as the input and the set of files touched in the PR diff as the ground-truth label.

The extractor uses the GitHub REST API to paginate through merged PRs, resolves linked issues via closing keywords in the PR description (Fixes #, Closes #, Resolves #), and fetches the file-level diff summary from the GET /repos/{owner}/{repo}/pulls/{pull_number}/files endpoint. Each training example is stored as a structured record:

{
  "issue_body": "NullPointerException in UserService when email is null...",
  "repo": "org/repo",
  "changed_files": [
    "src/services/UserService.java",
    "src/models/User.java"
  ]
}
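
The linking and record-building steps might look roughly like the sketch below. It is an illustration, not the project's actual extractor: the function names are hypothetical, the regex covers only the three keywords listed above, and `pr_files` is assumed to be the decoded JSON list returned by the pull request files endpoint, where each entry has a "filename" key.

```python
import re

# Only the three closing keywords named in the text are matched here.
CLOSING_RE = re.compile(r"\b(?:fixes|closes|resolves)\s+#(\d+)", re.IGNORECASE)


def linked_issue_numbers(pr_body):
    """Return issue numbers referenced via closing keywords, in order, without repeats."""
    numbers = []
    for match in CLOSING_RE.finditer(pr_body or ""):
        number = int(match.group(1))
        if number not in numbers:
            numbers.append(number)
    return numbers


def build_record(repo, issue_body, pr_files):
    """Assemble one training record (hypothetical helper).

    `pr_files` is assumed to be the list of file objects from the
    pull request files endpoint; only the "filename" key is used.
    """
    return {
        "issue_body": issue_body,
        "repo": repo,
        "changed_files": [f["filename"] for f in pr_files],
    }
```

A PR description such as "Fixes #12 and closes #34" would link the PR to issues 12 and 34, and each linked issue would then get its own record.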

Repositories with fewer than 50 matched issue-PR pairs are skipped. The pipeline deduplicates by issue ID and filters out issues whose linked PR touches more than 30 files, since these are usually refactors or mass renames that are not representative of targeted bug fixes.
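
As a rough sketch of these filters, assuming each record also carries a hypothetical `issue_id` field (not part of the record shown earlier) and that the 50-pair threshold is checked after deduplication and the file-count filter (the text does not say which order the pipeline uses):

```python
def filter_examples(records, max_files=30, min_pairs=50):
    """Deduplicate by issue ID, drop wide PRs, then skip small repositories.

    Each record is assumed to have "repo", "issue_id", and "changed_files".
    """
    seen = set()
    kept = []
    for rec in records:
        key = (rec["repo"], rec["issue_id"])
        if key in seen:
            continue  # duplicate issue: keep only the first occurrence
        seen.add(key)
        if len(rec["changed_files"]) > max_files:
            continue  # likely a refactor or mass rename
        kept.append(rec)

    # Count surviving pairs per repository and drop repos below the threshold.
    counts = {}
    for rec in kept:
        counts[rec["repo"]] = counts.get(rec["repo"], 0) + 1
    return [rec for rec in kept if counts[rec["repo"]] >= min_pairs]
```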

Fine-Tuning CodeLlama and Qwen-Coder

The model is fine-tuned on a sequence-to-sequence-style task: the input is the issue body (truncated to 512 tokens) and the target output is a newline-delimited ranked list of file paths. Both CodeLlama-7B-Instruct and Qwen2.5-Coder-7B-Instruct are supported as base models.

Training uses QLoRA with 4-bit quantization to fit on a single A100-40GB:

python train.py \
  --model qwen2.5-coder-7b-instruct \
  --data ./data/training_pairs.jsonl \
  --lora_r 16 \
  --lora_alpha 32 \
  --epochs 3 \
  --batch_size 8

The prompt template wraps the issue body in a structured instruction:

You are a code navigation assistant. Given the following GitHub issue, list the files most likely to need modification, one per line, ranked by relevance.

Issue:
{issue_body}

Files:
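
One way to turn the template above into training pairs, and to read a completion back into a ranked list, is sketched here. The helper names are hypothetical, and whitespace splitting stands in for the model tokenizer, so the 512-token cut is only approximate in this version.

```python
PROMPT_TEMPLATE = (
    "You are a code navigation assistant. Given the following GitHub issue, "
    "list the files most likely to need modification, one per line, "
    "ranked by relevance.\n\n"
    "Issue:\n{issue_body}\n\n"
    "Files:\n"
)


def build_prompt(issue_body, max_tokens=512):
    """Fill the template with a truncated issue body.

    Assumption: tokens are approximated by whitespace-separated words;
    a real pipeline would truncate with the base model's tokenizer.
    """
    words = issue_body.split()
    return PROMPT_TEMPLATE.format(issue_body=" ".join(words[:max_tokens]))


def build_target(changed_files):
    """Format the label as a newline-delimited list of file paths."""
    return "\n".join(changed_files)


def parse_ranked_files(completion):
    """Split a completion into a ranked list, dropping blanks and repeats."""
    ranked = []
    for line in completion.splitlines():
        path = line.strip()
        if path and path not in ranked:
            ranked.append(path)
    return ranked
```

Keeping the parser tolerant of blank lines and repeated paths means a slightly malformed completion still yields a usable ranking.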

Evaluation with File-Recall@K

The evaluation harness holds out 20% of issue-PR pairs and measures file-recall@k: the fraction of ground-truth changed files that appear in the model's top-k predictions. Results are computed at k=1, k=3, and k=5.

| Model | Recall@1 | Recall@3 | Recall@5 |
|---|---|---|---|
| CodeLlama-7B (baseline) | 0.31 | 0.54 | 0.67 |
| Qwen2.5-Coder-7B (baseline) | 0.38 | 0.61 | 0.73 |
| Qwen2.5-Coder-7B (fine-tuned) | 0.57 | 0.79 | 0.88 |

Fine-tuning on repo-specific history improves recall@3 by 18 percentage points over the zero-shot Qwen2.5-Coder-7B baseline (0.61 to 0.79), giving coding agents a much tighter initial search space when scoping a fix.
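
For reference, file-recall@k as defined in this section can be computed along these lines. This is a minimal sketch with hypothetical function names; it averages per example, which is an assumption, since the text does not say whether the harness averages per example or pools files across the held-out set.

```python
def file_recall_at_k(predicted, ground_truth, k):
    """Fraction of ground-truth files that appear in the top-k predictions."""
    truth = set(ground_truth)
    if not truth:
        return 0.0
    hits = truth.intersection(predicted[:k])
    return len(hits) / len(truth)


def mean_recall_at_k(examples, ks=(1, 3, 5)):
    """Average recall@k over (predicted, ground_truth) pairs for each k.

    Assumption: an unweighted per-example mean.
    """
    return {
        k: sum(file_recall_at_k(pred, truth, k) for pred, truth in examples) / len(examples)
        for k in ks
    }
```

Note that an issue with two ground-truth files can score at most 0.5 at k=1 under this definition, which is one reason recall@1 sits well below recall@5.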

How to Build This with NEO

Open NEO in VS Code or Cursor and describe what you want to build. A good starting prompt for this project:

"Build a fine-tuning pipeline that extracts GitHub issue and pull request pairs from a repo, formats them as training data where the input is the issue body and the output is a ranked list of changed files, then fine-tunes a Qwen2.5-Coder model using QLoRA and evaluates with file-recall@k metrics."

NEO generates the project structure and core implementation. From there you iterate: ask it to add support for additional base models, improve the prompt template for specific languages, or build out a REST API so your coding agent can query predictions at runtime. Each request builds on what's already there.

To run the finished project:

git clone https://github.com/dakshjain-1616/issue-to-files-finetune
cd issue-to-files-finetune
pip install -r requirements.txt
python extract_data.py --repo org/repo --token $GITHUB_TOKEN
python train.py --model qwen2.5-coder-7b-instruct --data ./data/training_pairs.jsonl
python evaluate.py --model ./checkpoints/final

Point the trained model at any new issue and get a ranked file list back in under a second.

NEO built a GitHub-powered fine-tuning pipeline that teaches a code model to scope bug fixes by predicting affected files from issue descriptions. See what else NEO ships at heyneo.com.
