By Arjun Mehta
Story points are a cargo cult metric.
Teams adopt story points because agile methodology says to. They spend hours debating whether a feature is 5 points or 8 points. They calculate velocity. They forecast sprints.
Then reality hits. Estimates are wrong. Velocity varies wildly. Forecasts are fiction.
The problem isn't implementation. Story points are fundamentally flawed.
Why Story Points Don't Work
They estimate effort, not time. A 5-point task might take one engineer 3 days and another 5 days, depending on experience. It might take 2 days if the engineer is fully focused, or a week if there are interruptions. Points are effort, but effort is relative, contextual, and unstable.
Estimates are guesses masquerading as numbers. You estimate a feature at 8 points because it feels like 8, not because you've analyzed it rigorously. You ask five engineers and they say 8, 5, 13, 8, 8. You average them and round to 8. Now it looks official. But it's still a guess.
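To make the spread behind that single number visible, here is a minimal sketch using the five estimates quoted above. The function name `summarize_estimates` is made up for illustration; only the standard library is used.

```python
from statistics import mean, median

# The five estimates quoted in the paragraph above.
estimates = [8, 5, 13, 8, 8]

def summarize_estimates(points):
    """Return the mean, median, and low/high spread of a set of estimates."""
    return {
        "mean": mean(points),
        "median": median(points),
        "low": min(points),
        "high": max(points),
    }

summary = summarize_estimates(estimates)
print(summary)
```

The mean is 8.4 and the median is 8, but the individual guesses run from 5 to 13. Reporting "8" hides that the highest guess is more than double the lowest.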
Estimating encourages fake precision. If you estimated 8 points and the work took 9 points' worth of time, people feel like something went wrong. You're off by 12.5%. In reality, a 12.5% miss is excellent. But because the number is precise, any variance feels wrong.
Velocity is a lagging metric. It measures what you completed last sprint. It doesn't tell you why. It doesn't tell you what's blocking you now. It doesn't help you predict next sprint (because estimates are guesses).
Story points become political. Teams game the system. Engineers know management wants higher velocity, so estimates get generous. Now "5 points" means different things to different teams. The system becomes meaningless for planning.
The Real Problems Story Points Try to Solve
1. How much work can we commit to? You want to know: how many features can we ship in a sprint? Story points try to answer this. But the answer depends on: how hard are the features? How many interruptions will we have? How experienced is the team?
Story points don't answer these questions. They just create the illusion of an answer.
2. Are we getting faster or slower? You want to track: is our team's productivity increasing or decreasing? Story points try to answer this with velocity. But velocity is measuring the wrong thing. Maybe velocity went up because estimates got generous. Maybe it went down because we took on more interruptions. Maybe it stayed flat because one senior engineer left and one joined.
3. How long will Project X take? You want to know: we have 100 features, each is 5 points, and we have a velocity of 40 points per sprint. That's 500 points, so 12.5 sprints. Story points try to answer this. But the answer is fiction if estimates are wrong and velocity varies.
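Working the arithmetic through shows how fragile the forecast is. The sketch below uses the numbers from the paragraph above (100 features, 5 points each, 40 points per sprint). The swing of 10 points per sprint in either direction is an assumption made up for illustration, not a measured figure.

```python
def sprints_needed(feature_count, points_per_feature, velocity):
    """Total estimated points divided by points completed per sprint."""
    return feature_count * points_per_feature / velocity

# Numbers from the paragraph above: 500 points at 40 points per sprint.
nominal = sprints_needed(100, 5, 40)

# Hypothetical velocity swing of 10 points per sprint either way (an assumption).
optimistic = sprints_needed(100, 5, 50)
pessimistic = sprints_needed(100, 5, 30)

print(nominal, optimistic, round(pessimistic, 1))
```

With those assumed swings the forecast ranges from 10 sprints to nearly 17, and that is before any individual estimate turns out to be wrong.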
What Actually Measures Productivity
Shipped value. How much value did you deliver? This is hard to measure. But it's the right thing to measure. Did you ship a feature customers want? Did you fix a critical bug? Did you improve reliability? That's what matters.
Story points measure estimated effort. Value is the only metric that matters for business.
Cycle time. How long does it take from "work starts" to "work ships"? If your cycle time is increasing, something's wrong. Probably technical debt or knowledge silos. If it's decreasing, you're improving.
Cycle time is a real metric. It measures actual time, not estimated effort.
Lead time. How long from "customer requests feature" to "feature ships"? This combines planning, prioritization, and execution. Shorter is better.
Lead time is useful for PMs. It shows: how fast can we respond to customer needs?
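Both numbers fall out of three timestamps per work item. This is a minimal sketch: the `WorkItem` class, its field names, and the sample dates are all hypothetical, so map them to whatever your tracker records.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class WorkItem:
    # Hypothetical field names; map them to your tracker's own fields.
    name: str
    requested: date  # customer asked for the feature
    started: date    # work began
    shipped: date    # feature released

def cycle_time_days(item):
    """Calendar days from work starting to work shipping."""
    return (item.shipped - item.started).days

def lead_time_days(item):
    """Calendar days from customer request to shipping."""
    return (item.shipped - item.requested).days

# Made-up sample item for illustration.
item = WorkItem("export-to-csv", date(2024, 3, 1), date(2024, 3, 11), date(2024, 3, 15))
print(cycle_time_days(item), lead_time_days(item))
```

For this sample item the cycle time is 4 days and the lead time is 14. The 10-day gap is time spent waiting in planning and prioritization, not in engineering.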
Defect rates. How many bugs ship per feature? If defect rates are increasing, quality is suffering. If they're decreasing, you're doing something right.
Defect rates are a real quality metric.
What To Do Instead of Story Points
1. Estimate in Time, Not Points
Instead of "8 points," estimate "4-5 days." It's more honest. Everyone understands days. Points are a proxy that adds confusion.
Estimating in days doesn't make estimates more accurate. But it makes them more realistic and easier to understand.
2. Track Cycle Time
When a task ships, measure: how long from start to ship? Track the average. If it's increasing, investigate why. Increasing cycle time is a signal something's wrong.
Treat cycle time as an early-warning signal. Velocity built on story points is a trailing indicator that tells you nothing.
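Tracking the average can be as simple as grouping shipped tasks by week and checking whether the average keeps rising. A rough sketch follows. The weekly data is invented, and "every week higher than the last" is just one possible definition of "increasing".

```python
from statistics import mean

# Hypothetical cycle times in days, grouped by the week each task shipped.
cycle_times_by_week = {
    "week 1": [3, 4, 5],
    "week 2": [4, 5, 6],
    "week 3": [6, 7, 8],
}

def weekly_averages(by_week):
    """Average cycle time per week, in the order the weeks were recorded."""
    return [mean(days) for days in by_week.values()]

def is_increasing(averages):
    """True when each weekly average is higher than the one before it."""
    return all(later > earlier for earlier, later in zip(averages, averages[1:]))

averages = weekly_averages(cycle_times_by_week)
print(averages, is_increasing(averages))
```

Here the averages climb from 4 to 5 to 7 days, which is the kind of trend worth investigating.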
3. Have a Capacity-Based Plan
Instead of estimating points and checking velocity:
- Estimate tasks in days
- Calculate available capacity (team size * 5 days * availability)
- Plan for that capacity
If you have 10 engineers, 80% available (20% interruptions/meetings), you have 40 engineer-days per week.
If your tasks average 3 days, you can commit to ~13 tasks per week.
That's real planning. Not points-based forecasting.
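The capacity arithmetic above fits in a few lines. This is a sketch using the same numbers (10 engineers, 80% availability, 3-day average tasks). The function names are made up for illustration, and the task count is rounded down so the plan never exceeds capacity.

```python
def weekly_capacity_days(engineers, availability, days_per_week=5):
    """Engineer-days available per week: team size * days per week * availability."""
    return engineers * days_per_week * availability

def tasks_per_week(capacity_days, average_task_days):
    """Whole tasks that fit in the available capacity, rounded down."""
    return int(capacity_days // average_task_days)

capacity = weekly_capacity_days(10, 0.8)
print(capacity, tasks_per_week(capacity, 3))
```

That gives 40 engineer-days and 13 tasks per week. Changing the availability figure is the quickest way to see how much interruptions cost you.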
4. Measure What Matters
Instead of velocity:
- How many features shipped?
- What's the cycle time?
- What's the defect rate?
- How many bugs in production?
These metrics tell you if your team is healthy. Velocity tells you if your estimates were consistent.
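Pulling those questions into one report might look like the sketch below. The feature list is invented for illustration, and "bugs per feature" is only one way to define a defect rate.

```python
from statistics import mean

# Hypothetical features shipped in one sprint:
# (name, cycle time in days, bugs found in production).
shipped = [
    ("search filters", 4, 0),
    ("bulk upload", 7, 1),
    ("audit log", 6, 0),
    ("sso login", 7, 1),
]

def health_report(features):
    """Summarize shipped count, average cycle time, and production bugs."""
    cycle_times = [days for _, days, _ in features]
    bugs = sum(bug_count for _, _, bug_count in features)
    return {
        "features_shipped": len(features),
        "avg_cycle_time_days": mean(cycle_times),
        "production_bugs": bugs,
        "bugs_per_feature": bugs / len(features),
    }

report = health_report(shipped)
print(report)
```

For the sample data this reports 4 features shipped, an average cycle time of 6 days, and 2 production bugs (0.5 per feature).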
Why This Matters
When you use story points, you're measuring the wrong thing. You optimize for point velocity instead of shipping value. You game estimates instead of being realistic. You forecast sprints with false confidence.
When you use real metrics (cycle time, value, defect rate), you optimize for what matters. You ship faster. You improve quality. You're honest about what you can do.
The change itself is small. Just stop using story points.
With Glue
Instead of guessing how long features will take, with Glue you can understand your technical debt, knowledge distribution, and architectural constraints. Better understanding means better time estimates. Not because story points are better. But because you understand what you're actually building.
Frequently Asked Questions
Won't removing story points break our planning? Your planning is already broken if it's based on story points. You're just not seeing it. Replace story points with time estimates and capacity planning. Your planning will be more honest and more accurate.
What if leadership demands velocity metrics? Then show them something real. "Shipped 8 features, average cycle time 6 days, defect rate 2%" is more meaningful than "velocity of 42 points." Leadership cares about output, not estimates.
Isn't time estimation even more guesswork than points? Yes. But at least it's honest guesswork. Days are real. Everyone understands them. Points are abstract and create false precision.