“Who made this bug?” shouldn’t be a blame game. It should be an automated answer.

When a regression pops up during testing, the usual process is tedious: dig through Git history, read commit messages, manually link tickets. We often skip it because it’s painful — but that’s valuable data left on the table.

So I automated it.

The Problem

At Paycor, when QA finds a bug, the natural question is: which feature introduced this regression? The answer usually lives somewhere in the Git history, but finding it means:

Identifying which files are affected
Running `git blame` on those files
Correlating commits to user stories
Manually linking the bug to the story

This takes anywhere from 15 minutes to an hour depending on complexity. Multiply that by dozens of bugs per sprint, and you’re looking at serious lost time.

The Solution

I built a pipeline that connects Azure DevOps with OpenAI to do the heavy lifting:

Step 1: Detect the Fix

The system watches for closed bugs and identifies the commit that fixed them. This is straightforward — most teams already link commits to work items.

Step 2: Analyze the Context

Here’s where it gets interesting. The system:

Looks at the files changed in the fix
Pulls the commit history for those specific lines using `git blame`
Extracts the recent commit history for context

```bash

The core insight: git blame tells you who changed what

git blame -L 45,60 src/components/PaymentForm.tsx ```

Step 3: AI as the Judge

The extracted context goes to GPT-4o with a focused prompt:

“Given this bug fix that changed lines X-Y in these files, and the commit history showing who previously modified these lines, which commit is the most likely root cause?”

The model doesn’t guess randomly. It has:

The exact lines that were fixed
The history of changes to those lines
Commit messages providing context

Step 4: Auto-Link

Once GPT-4o identifies the likely root cause commit, the system:

Finds the User Story associated with that commit
Automatically links it to the Bug in Azure DevOps
Adds a comment explaining the connection

The Results

Before: Manual triage taking 15-60 minutes per bug, often skipped entirely.

After: Complete traceability in seconds, zero manual effort.

We now know exactly which features are causing regressions. Sprint retrospectives have actual data instead of hunches. The team can identify patterns — maybe a particular area of the codebase needs refactoring, or a specific type of change tends to break things.

Why This Works

The key insight is that this process mirrors what an engineer would do manually. We’re not asking the AI to magically know the answer — we’re giving it the same information a human would use:

What was broken (the bug)
What fixed it (the commit)
What changed before (git blame history)

The AI just does the correlation faster and more consistently.

Limitations

This approach works best when:

Your team consistently links commits to work items
Bug fixes are relatively isolated (not refactors touching 50 files)
You have a reasonably clean Git history

It’s less effective for:

Bugs caused by the interaction of multiple changes
Issues in dependencies or infrastructure
Very old bugs where the history is murky

What’s Next

I’m exploring ways to extend this:

Predictive alerts: Flag high-risk PRs based on historical bug patterns
Team routing: Auto-assign bugs to the developer most familiar with the affected code
Pattern detection: Identify which types of changes correlate with bugs

At this rate, I’ll automate myself out of a job in 18 months. But that’s the goal, right? Spend less time on toil, more time on interesting problems.

The code isn’t public (it’s tightly coupled to our internal systems), but the pattern is reusable. If you’re dealing with similar pain points, the combination of `git blame` + LLM reasoning is surprisingly effective.