Why Engineering Metrics Often Backfire (and What to Do Instead)

By Stephen Ledwith, July 15, 2025

We’ve all seen it.

The sprint burndown chart looks great. Every story is marked “done.” The team hit 40 story points for the third week in a row. And yet… nothing meaningful shipped. The product owner is frustrated. The engineers are burned out. Leadership is asking for “more velocity.” And nobody really knows what that means anymore.

This is the paradox of engineering metrics. In theory, metrics bring clarity. In practice, they often distort behavior. But that doesn’t mean we should stop measuring—just that we need to measure better.

Measurement Drives Behavior
“Tell me how you measure me, and I will tell you how I behave.”
— Eliyahu Goldratt

Let’s start with what goes wrong.

Velocity, Story Points & Commit Counts: The Usual Suspects

Velocity was never meant to be a KPI. It was a planning tool: an internal team estimate of throughput over time. But somewhere along the way, it got turned into a performance metric. And the moment that happened, teams started gaming it. Stories got smaller. Estimates got inflated. Spikes got counted. And ironically, real throughput dropped as trust eroded, even when the number on the chart held steady.
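To make that gaming concrete, here is a minimal Python sketch. Everything in it is hypothetical: the Story type, the story names, and the point values are made up for illustration, not taken from any real team or tool. It compares one sprint estimated honestly with the same work re-estimated with padding, plus a spike counted for points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    name: str
    points: int    # the team's estimate (hypothetical values)
    shipped: bool  # did this work actually reach users?


def velocity(stories):
    """Sum of story points marked done: the number on the dashboard."""
    return sum(s.points for s in stories)


def shipped_count(stories):
    """How many stories actually reached users."""
    return sum(1 for s in stories if s.shipped)


# One sprint, estimated honestly.
honest = [
    Story("checkout redesign", 8, True),
    Story("search filters", 5, True),
    Story("login bugfix", 3, True),
]

# The same work with padded estimates, plus a spike counted for points.
gamed = [
    Story("checkout redesign", 13, True),
    Story("search filters", 8, True),
    Story("login bugfix", 5, True),
    Story("research spike", 5, False),
]

print("honest:", velocity(honest), "points,", shipped_count(honest), "shipped")
print("gamed: ", velocity(gamed), "points,", shipped_count(gamed), "shipped")
```

In this made-up sprint the reported velocity nearly doubles, from 16 to 31, while the number of stories that reach users stays at 3. The chart improves and the customer sees no difference.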

It’s not just velocity. I’ve seen leadership teams obsess over commit counts, PR volume, or lines of code. These might reflect activity, but rarely impact. An engineer who deletes 3,000 lines of code is often doing far more valuable work than one who adds 30,000. But the metrics don’t show that nuance.

If you want to measure output, commit counts won’t help you. If you want to measure productivity, story points won’t save you. And if you want to understand delivery health, you need to go beyond dashboards.

🎧 In Episode 7 of The Architect and The Executive, the hosts touch on this with a chuckle:

“You can hit your velocity every sprint and still ship garbage. Or worse, nothing.”

The Real Problem: Misaligned Incentives

Most bad metrics are symptoms of misaligned goals. When leadership wants predictability, product wants flexibility, and engineers want stability, the result is a tug-of-war over process.

Instead of pushing teams to “do more,” start asking: What’s blocking us from delivering better?

This is where frameworks like SPACE come in. Created by researchers at GitHub, Microsoft, and the University of Victoria, SPACE stands for:

Satisfaction and well-being
Performance
Activity
Collaboration and communication
Efficiency and flow

You don’t need to measure all of them—but balancing metrics across multiple categories gives you a clearer picture of team health than any single dashboard ever will.
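One way to sanity-check that balance is to tag each metric you track with a SPACE dimension and count how many dimensions are covered. The Python sketch below is an assumption-laden illustration, not part of the SPACE framework itself: the mapping of metrics to dimensions is my own, the survey and review-turnaround metrics are hypothetical, and the threshold of three dimensions is a parameter you should tune for your team.

```python
from collections import Counter
from enum import Enum


class Dimension(Enum):
    SATISFACTION = "Satisfaction and well-being"
    PERFORMANCE = "Performance"
    ACTIVITY = "Activity"
    COLLABORATION = "Collaboration and communication"
    EFFICIENCY = "Efficiency and flow"


def coverage(metrics):
    """Count how many tracked metrics fall into each SPACE dimension."""
    return Counter(metrics.values())


def is_balanced(metrics, min_dimensions=3):
    """True if the metrics span at least min_dimensions distinct dimensions.

    min_dimensions=3 is an assumed default for this sketch.
    """
    return len(coverage(metrics)) >= min_dimensions


# A dashboard built only from activity counts (metrics named in this article).
activity_only = {
    "commit count": Dimension.ACTIVITY,
    "PR volume": Dimension.ACTIVITY,
    "lines of code": Dimension.ACTIVITY,
}

# A hypothetical mix; the dimension assigned to each metric is illustrative.
mixed = {
    "PR volume": Dimension.ACTIVITY,
    "developer survey score": Dimension.SATISFACTION,
    "code review turnaround": Dimension.COLLABORATION,
}

for name, metrics in [("activity_only", activity_only), ("mixed", mixed)]:
    covered = sorted(d.value for d in coverage(metrics))
    print(name, "balanced:", is_balanced(metrics), "covers:", covered)
```

The activity-only dashboard covers a single dimension however many charts it has, while the mixed set covers three with the same number of metrics. A check like this will not tell you whether the metrics are good, only whether they all look at the team from the same angle.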

📖 Read the full SPACE paper (ACM Queue, 2021): https://queue.acm.org/detail.cfm?id=3454124

🐛 Bonus Bounties: The Great Bug-Fix Incentive Debacle

Here’s one of my favorite cautionary tales from the trenches.

A well-meaning manager once tried to boost quality by offering cash bonuses for every bug fixed. Seems logical, right? Fewer bugs, happier customers.

But the plan quickly backfired.

Suddenly, the bug tracker was full of suspiciously low-effort tickets. Minor formatting issues, duplicate entries, non-reproducible edge cases—engineers were “fixing” bugs faster than QA could triage them. Rumors even circulated that some were planting tiny bugs just to fix them later.

Quality didn’t improve. Trust cratered. The bonus program was shut down within a few months.

The moral? Incentivize behavior, not manipulation. If your metrics can be gamed, they will be.

Measure Twice - Incentivize Once

So What Should You Measure?

Measure what matters—to your team. Some teams need to focus on flow. Others on quality. Others on collaboration. Don’t adopt someone else’s KPIs because they look impressive in a slide deck.

Here’s a simple rule I use: If the metric can be gamed without helping the customer, it’s probably not useful.

That doesn’t mean you throw out all data. It means you treat data as one part of the puzzle. Teams are complex systems. Measuring them like they’re factories misses the point.

As leaders, our job isn’t just to collect metrics. It’s to interpret them in context. It’s to ask questions like: “Why did this sprint go off the rails?” or “What’s making this team consistently overperform?” and then actually listen to the answers.

🎧 The Architect and The Executive hit this again in Episode 5:

“Good metrics don’t control people. They empower them to make better decisions.”

Couldn’t agree more.

Final Thought

Engineering metrics aren’t evil. But they’re not magic, either. They’re tools. And like all tools, they can be misused.

So next time you’re staring at a burndown chart wondering what it really means, take a breath. Ask better questions. Focus on outcomes. And remember that behind every number is a human—and usually, a much better story.

“The best metric is trust. Everything else is just calibration.”
— Stephen Ledwith

Citations

The Architect and The Executive, Episode 5, “What Metrics Matter?”
The Architect and The Executive, Episode 7, “You Delivered What?”
Nicole Forsgren et al., “The SPACE of Developer Productivity,” ACM Queue, March 2021 – https://queue.acm.org/detail.cfm?id=3454124
Nicole Forsgren, Jez Humble, and Gene Kim, Accelerate: The Science of Lean Software and DevOps, IT Revolution Press, 2018