By Glue Team
Story point estimation is the practice of sizing software tasks using relative complexity scores rather than time-based estimates. One story point is an arbitrary unit (e.g., "the complexity of fixing a typo"); a 5-point story is roughly five times as complex; a 13-point story sits at the top of the scale and is usually a candidate for splitting.
Humans are bad at absolute time estimation but good at relative comparison. Ask "Will this take 5 hours or 10?" and engineers guess poorly. Ask "Is this twice as complex as last week's feature?" and they're often right.
Story points decouple effort from time. This matters because points measure complexity, while historical velocity converts points into a predictable schedule.
Define a reference task: Pick a recent, simple task everyone worked on or reviewed. "That API endpoint fix is 3 points. Medium." Now all future estimates compare to this.
Use a Fibonacci-style scale: 1, 2, 3, 5, 8, 13. (Skip 4, 6, 7, etc., to force crisp decisions.) The gaps widen as the numbers grow because large estimates are less precise: nobody can reliably tell a 20-point task from a 21-point one, so the scale stops at 13.
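Estimate relative to reference: "This feature is about 2x as complex as the 3-point reference task, so 5 points" or "4x as complex, so 13 points. Actually, that is the top of the scale, so we should look at splitting it." Raw multiples such as 6 or 12 are rounded to the nearest value on the scale.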
Anything over 13 points must be split into smaller tasks. If you can't break it down, complexity is unclear and risk is high.
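The reference-task comparison and the split rule can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the scale is assumed to stop at 13, and rounding a raw multiple to the nearest scale value (ties going to the larger value) is an assumption rather than a rule stated above.

```python
# Assumed scale: Fibonacci-style values up to the 13-point cap.
SCALE = (1, 2, 3, 5, 8, 13)
MAX_POINTS = 13


def snap_to_scale(raw: float) -> int:
    """Return the scale value closest to raw; ties go to the larger value."""
    return min(SCALE, key=lambda p: (abs(p - raw), -p))


def estimate(reference_points: int, multiplier: float):
    """Size a task as a multiple of the reference task.

    Returns (points, needs_split). needs_split is True when the raw
    size exceeds the cap, meaning the task should be broken down.
    """
    raw = reference_points * multiplier
    return snap_to_scale(raw), raw > MAX_POINTS


if __name__ == "__main__":
    print(estimate(3, 2))  # twice the 3-point reference
    print(estimate(3, 4))  # four times the reference
    print(estimate(3, 6))  # six times the reference: over the cap
```

Sending ties to the larger value is a deliberately cautious choice: when a task falls between two sizes, the sketch assumes the team would rather over-size than under-size it.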
Team velocity: Track total points completed per sprint. If a team averages 35 points/sprint, that's their velocity. A 35-point commitment for the next sprint is then based on evidence, not hope.
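As a rough sketch of the arithmetic, velocity is an average over completed sprints, and dividing a backlog by it gives a sprint count. The function names, the three-sprint window, and the sample numbers are illustrative assumptions, not part of any particular tool.

```python
import math


def team_velocity(completed_points, window=3):
    """Average points completed over the most recent `window` sprints."""
    recent = completed_points[-window:]
    if not recent:
        raise ValueError("need at least one completed sprint")
    return sum(recent) / len(recent)


def sprints_needed(backlog_points, velocity):
    """Number of whole sprints needed to finish the backlog at this velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(backlog_points / velocity)


if __name__ == "__main__":
    history = [32, 38, 35]  # hypothetical points completed per sprint
    velocity = team_velocity(history)
    print(velocity)                       # 35.0
    print(sprints_needed(120, velocity))  # 4
```

A short rolling window is one possible choice. It reacts faster to team changes than an all-time average, at the cost of more noise from one unusual sprint.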
| Aspect | Hours | Story Points |
|---|---|---|
| Accuracy | ±50% typical | ±30% typical |
| Who estimates | Individual engineers | Team discussion |
| What it measures | Time consumed | Complexity |
| Affected by | Interruptions, sleep, mood | Relative difficulty |
| Precision implied | High (misleading) | Honest uncertainty |
| Conversion | Direct (often wrong) | Via velocity (statistically sound) |
Sprint planning becomes a matter of committing to about as many points as the team's recent velocity. Velocity-based planning is more reliable because it accounts for the team's actual capacity, including meetings, support work, and interruptions, rather than theoretical hours.
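One way to picture velocity-based planning is a simple pass over a prioritized backlog that commits stories until the velocity budget is used up. The sketch below assumes the backlog is already ordered by priority and that a story too large for the remaining budget is skipped in favor of a smaller one further down. Both are illustrative choices, and the story names are made up.

```python
def plan_sprint(backlog, velocity):
    """Pick stories in priority order without exceeding the velocity budget.

    backlog: list of (story_name, points) tuples, highest priority first.
    Returns (committed_story_names, total_points_committed).
    """
    committed = []
    remaining = velocity
    for name, points in backlog:
        if points <= remaining:
            committed.append(name)
            remaining -= points
    return committed, velocity - remaining


if __name__ == "__main__":
    backlog = [
        ("login", 8),
        ("search", 13),
        ("export", 8),
        ("audit log", 5),
        ("dark mode", 3),
    ]
    stories, total = plan_sprint(backlog, velocity=35)
    print(stories, total)  # four stories totalling 34 points
```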
Q: If a task is 13 points, how do we split it? A: Break it by feature boundaries, not time. Instead of "5 points of this feature, 8 points of that," identify "Phase 1: core functionality (8 pts), Phase 2: integrations (5 pts)."
Q: What if estimators disagree sharply? A: Disagreement is data. "One engineer says 3 points, another says 13" means assumptions differ. Discuss until you converge or acknowledge the risk: "Our estimates range from 3 to 13 points, so let's cap the work at 5 points and revisit mid-task."
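A team that collects votes in a spreadsheet or a planning tool could flag sharp disagreement automatically. The sketch below treats a spread of more than 2x between the lowest and highest vote as "sharp". That threshold and the function name are assumptions chosen for illustration.

```python
def needs_discussion(votes, max_ratio=2.0):
    """Return True when the highest vote exceeds the lowest by more than max_ratio.

    votes: story point estimates from each engineer (all >= 1).
    """
    if not votes:
        raise ValueError("need at least one vote")
    return max(votes) / min(votes) > max_ratio


if __name__ == "__main__":
    print(needs_discussion([3, 13]))    # True: assumptions clearly differ
    print(needs_discussion([5, 8, 5]))  # False: close enough to converge
```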
Q: Should we track velocity per engineer or per team? A: Per team. Individual velocity varies based on task type, domain knowledge, and life circumstances. Team velocity smooths these variations and predicts capacity better.