Technical Debt: The Complete Guide
Ward Cunningham, who coined the term "technical debt," was talking about something specific: intentional shortcuts taken to learn faster, with the explicit promise to pay back later. You ship a feature with temporary code because you want user feedback. You get feedback. You refactor into the right design. Debt paid.
Since then, "technical debt" has become a catch-all for "code that's messy." That's wrong. Messy code is a problem, but calling it debt misses something important: debt has a borrower and a lender. For any piece of debt, ask who benefits from the shortcut and who pays the interest.
Types of Technical Debt
Martin Fowler's Technical Debt Quadrant is a useful framework. It classifies debt along two axes:
The first axis is deliberate or inadvertent:
- Deliberate: "We're shipping this with a hack because we need to learn what users want."
- Inadvertent: "We didn't realize this design was going to create coupling."
The second axis is reckless or prudent:
- Reckless: "We don't have time to think about this."
- Prudent: "We've thought it through and we're making a conscious choice."
This gives you a 2x2 matrix:
Deliberate and prudent debt: "We're using a simple solution because we don't know the requirements yet. Once we understand the problem, we'll refactor." This is good debt. You intentionally take it. You explicitly plan to pay it back.
Deliberate and reckless debt: "We're shipping this badly written code because we're on a deadline." This is bad. You're knowingly shipping quality issues with no plan to fix them.
Inadvertent and prudent debt: "We didn't realize this would couple the systems, but now that we see it, here's our plan to decouple." You discover the debt, acknowledge it, and plan to pay it. This is fine.
Inadvertent and reckless debt: "We didn't realize this was a problem, and we don't care." This is the worst. Debt you don't even acknowledge. This is where your codebase rots.
Most codebases are full of the fourth kind: inadvertent and reckless debt.
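One way to put the quadrants to work is to tag each entry in a debt register with the two axes. The sketch below is only an illustration: the `DebtEntry` fields, the sample entries, and the "surface reckless items first" rule are assumptions, not part of Fowler's framework.

```python
from dataclasses import dataclass


@dataclass
class DebtEntry:
    description: str
    deliberate: bool  # was the shortcut taken knowingly?
    prudent: bool     # was it thought through, with a plan to pay it back?

    @property
    def quadrant(self) -> str:
        first = "deliberate" if self.deliberate else "inadvertent"
        second = "prudent" if self.prudent else "reckless"
        return f"{first} and {second}"


# Illustrative entries only.
entries = [
    DebtEntry("hard-coded pricing rules until requirements settle", True, True),
    DebtEntry("untracked coupling between two services", False, False),
]

# Hypothetical rule: anything reckless gets surfaced for review first.
needs_attention = [e.description for e in entries if not e.prudent]
```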
Measuring Technical Debt
You can't manage what you don't measure. The problem is that technical debt is hard to quantify. You can't point to code and say "this debt costs us $10k per year." But you can measure the consequences.
Complexity metrics: Cyclomatic complexity counts the linearly independent paths through a piece of code. High complexity tends to mean more bugs and slower development. Tools like SonarQube measure this. It's a useful signal but not the whole picture: code that's complex but well-tested is different from code that's complex and untested.
Change failure rate: What percentage of your deploys cause an incident? If 30% of your deploys break something, you almost certainly have debt. If it's 3%, debt probably isn't what's hurting you. This is a lagging indicator - debt shows up as higher failure rates only after it has accumulated.
Time-to-change velocity: How long does it take to implement a comparable feature in different parts of the codebase? If similar work takes one sprint in one service and three sprints in another, the slower service likely has debt slowing you down.
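To make the metric concrete, here is a rough approximation for Python source using the standard `ast` module: start at 1 and add one for each decision point. Which node types count as decision points is a simplifying assumption here; real tools such as SonarQube apply their own, more detailed rules, so numbers will not match exactly.

```python
import ast

# Node types treated as decision points (a simplification; tools differ).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                 ast.IfExp, ast.comprehension)


def approx_cyclomatic_complexity(source: str) -> int:
    """Return 1 plus the number of decision points found in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands and adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


sample = '''
def ship(order):
    if order.paid and order.in_stock:
        for item in order.items:
            pack(item)
    elif order.paid:
        backorder(order)
    else:
        cancel(order)
'''

score = approx_cyclomatic_complexity(sample)  # if, and, for, elif -> 1 + 4
```

The sample is only parsed, never executed, so the undefined helpers inside it (`pack`, `backorder`, `cancel`) do not matter.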
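Computing the rate is simple once deploys are linked to incidents, which is usually the hard part. A minimal sketch, assuming a hypothetical `Deploy` record with a `caused_incident` flag that your deploy or incident tooling would have to supply:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Deploy:
    service: str
    caused_incident: bool


def change_failure_rate(deploys: List[Deploy]) -> float:
    """Fraction of deploys that caused an incident (0.0 when there are none)."""
    if not deploys:
        return 0.0
    failures = sum(1 for d in deploys if d.caused_incident)
    return failures / len(deploys)


# Illustrative data: 10 deploys, 3 of which caused an incident.
history = [Deploy("billing", i < 3) for i in range(10)]
rate = change_failure_rate(history)
```

Tracking the rate per service rather than globally makes it easier to see where the debt is concentrated.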
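One way to compare parts of the codebase is the median delivery time per service. The sketch below assumes you can export `(service, days_to_deliver)` pairs from your issue tracker; the record shape and the sample numbers are illustrative.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, List, Tuple


def median_days_by_service(changes: List[Tuple[str, float]]) -> Dict[str, float]:
    """Group (service, days_to_deliver) records and take the median per service."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for service, days in changes:
        grouped[service].append(days)
    return {service: median(days) for service, days in grouped.items()}


# Illustrative records for comparable features.
changes = [
    ("billing", 10), ("billing", 14), ("billing", 12),
    ("search", 30), ("search", 42),
]
medians = median_days_by_service(changes)
```

The median is used instead of the mean so that one unusually large feature does not dominate the comparison.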
Business impact metrics: Did debt cause you to miss a roadmap item? Did debt cause an outage that lost customers? That's the cost of debt. This is the most important metric but the hardest to measure.
The best approach: measure both technical signals (complexity, coverage) and business signals (change velocity, incident rate, missed roadmap items). When business signals are bad, audit the code to find the technical causes.
How to Prioritize Debt Paydown
Not all debt is created equal. Paydown should follow this priority:
- Debt that's blocking the roadmap. If a debt blocks a feature you promised, pay it down. This has clear business value.
- Debt in high-churn areas. Code that changes frequently is read frequently, so reducing complexity here pays off on every subsequent change.
- Debt that causes incidents. If a piece of code keeps causing outages, fix it.
- Debt that makes hiring hard. If your codebase is so complex that new engineers take six months to become productive, that's hurting hiring and retention.
Don't prioritize:
- Debt in stable code nobody touches. If a module is complex but hasn't changed in two years, leave it alone. Refactoring code you don't touch is waste.
- Debt you're replacing soon. If you're planning to rewrite a service in the next quarter, don't refactor the old one.
- Debt that doesn't matter. Code that's imperfect but works is fine. Perfectionism is more expensive than pragmatism.
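The two lists above can be read as a triage rule. The sketch below encodes it under some assumptions: the `DebtItem` flags are hypothetical, each item gets the rank of the first rule it matches, and a planned replacement overrides everything else (a judgment call you may want to relax for debt that blocks the roadmap).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DebtItem:
    name: str
    blocks_roadmap: bool = False
    high_churn: bool = False
    causes_incidents: bool = False
    slows_onboarding: bool = False
    replacement_planned: bool = False


def priority(item: DebtItem) -> Optional[int]:
    """Return 1 (most urgent) to 4, or None when the item should be left alone."""
    if item.replacement_planned:
        return None
    if item.blocks_roadmap:
        return 1
    if item.high_churn:
        return 2
    if item.causes_incidents:
        return 3
    if item.slows_onboarding:
        return 4
    return None  # stable or merely imperfect code: not worth scheduling


# Illustrative backlog.
backlog = [
    DebtItem("flaky payment retries", causes_incidents=True),
    DebtItem("tangled checkout module", blocks_roadmap=True, high_churn=True),
    DebtItem("legacy report generator"),
    DebtItem("old search service", high_churn=True, replacement_planned=True),
]
plan = sorted((i for i in backlog if priority(i) is not None), key=priority)
```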
When NOT to Pay Down Debt
This is important. Paying down all debt is not optimal.
Stable code. If a piece of code works, nobody touches it, and it doesn't cause incidents, refactoring it is risky for low return. "If it ain't broke" might be stupid in other contexts, but it's wise for stable code.
Code with high replacement probability. If you're evaluating replacing a service, don't refactor the old one. Refactoring is investment in the old design. If the old design is going away, invest elsewhere.
Code that's being used as a learning ground. Early in a project, rough code is fine. You're learning what the problem actually is. Refactor when requirements are stable.
Code where the cost of breaking it is higher than the benefit of improving it. Critical transaction path that's complex but working? Probably leave it alone unless you're actively developing it.
The Debt Discussion with PMs and Leadership
This is the hard part. Debt paydown competes with features. Leadership wants features. You want to pay down debt. Here's how to make the case:
"We're going to miss this roadmap item if we don't address debt in service X. It's slowing us down. We estimate one sprint of debt paydown lets us deliver this feature two weeks faster. That's positive ROI."
That works. It ties debt to business impact.
"The codebase is getting messier and I feel like we're moving slower." That doesn't work. It's subjective.
Get numbers. Measure velocity before and after debt paydown if you can (it's hard - there are too many variables). At minimum, estimate: if we pay down this debt, what does the feature velocity look like?
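The estimate can be written down as simple arithmetic: time saved minus time invested, divided by time invested. Whether the pitch above comes out positive depends on inputs the pitch leaves open, such as sprint length and how many later features benefit. The function and parameter names below are illustrative.

```python
def paydown_roi(investment_weeks: float, weeks_saved_per_feature: float,
                features_affected: int) -> float:
    """Return (time saved - time invested) / time invested."""
    if investment_weeks <= 0:
        raise ValueError("investment_weeks must be positive")
    saved = weeks_saved_per_feature * features_affected
    return (saved - investment_weeks) / investment_weeks


# One-week sprint, one feature delivered two weeks faster: clearly positive.
one_week_sprint = paydown_roi(1, 2, 1)       # 1.0
# Two-week sprint, one feature: break-even on that feature alone.
two_week_sprint = paydown_roi(2, 2, 1)       # 0.0
# Two-week sprint, three later features each two weeks faster.
several_features = paydown_roi(2, 2, 3)      # 2.0
```

If the first feature alone only breaks even, the case rests on the later features that touch the same code, so include them in the estimate.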
Connecting to Codebase Intelligence
Here's where Glue helps. Debt visibility is the key to managing it. Most teams have no visibility into where debt lives. They have opinions ("that service is a mess") but no data.
Glue lets you ask: "Show me the most complex services by cyclomatic complexity. Show me the services with the most recent changes. Show me the overlap - which services are both complex and changing frequently?" Those are your candidates for debt paydown.
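Independent of any particular tool, the overlap question can be sketched in a few lines once you have per-service complexity scores and recent change counts. This is not Glue's API: the function, the input dictionaries, and the sample numbers are all hypothetical.

```python
from typing import Dict, List, Set


def hotspots(complexity: Dict[str, float], recent_changes: Dict[str, int],
             top_n: int = 3) -> List[str]:
    """Services that rank in the top_n for both complexity and recent changes."""
    def top(scores: Dict[str, float]) -> Set[str]:
        return set(sorted(scores, key=scores.get, reverse=True)[:top_n])

    overlap = top(complexity) & top(recent_changes)
    # Order the overlap by complexity * change count, highest first.
    return sorted(overlap, key=lambda s: complexity[s] * recent_changes[s],
                  reverse=True)


# Illustrative inputs.
complexity = {"billing": 48, "search": 35, "auth": 12, "email": 40}
recent_changes = {"billing": 60, "search": 5, "auth": 45, "email": 30}
candidates = hotspots(complexity, recent_changes, top_n=3)
```

With these inputs, `search` is complex but rarely changed and `auth` changes often but is simple, so neither is a candidate.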
Or: "Show me the services where change velocity has decreased over time. Show me the commits." You're looking for patterns: did velocity decrease gradually or suddenly? Was there a large refactor? Understanding the patterns helps you decide what to do next.
Or: "Show me services with the fewest tests and the highest recent incident rate." Correlation isn't causation, but it's a signal that debt might be causing incidents.
The difference is continuous debt visibility versus periodic "the codebase is a mess" retrospectives. With visibility, you can be strategic about debt paydown instead of reactive.
Technical Debt in 60 Seconds TL;DR
Debt is intentional shortcuts with a plan to pay back. Deliberate and prudent debt is good. Inadvertent and reckless debt is bad. Measure debt by: complexity metrics, change failure rate, time-to-change velocity, business impact. Pay down debt that blocks roadmap items, lives in high-churn code, or causes incidents. Don't pay down stable code nobody touches. Debt paydown is an ROI calculation: how much faster can we ship after?
Frequently Asked Questions
Q: How much of our development time should be debt paydown?
A: As much as needed to keep velocity stable. If your velocity is decreasing, debt is a likely cause and paydown is usually the answer. If velocity is stable or increasing, you're fine with whatever fraction of time you're spending. Measure velocity before and after paydown sprints to see the impact.
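A minimal sketch of the before/after comparison, assuming velocity is recorded as one number per sprint (story points here, purely as an illustration). It ignores the many other variables that affect velocity, so treat the result as a rough signal.

```python
from statistics import mean
from typing import List


def velocity_change(before: List[float], after: List[float]) -> float:
    """Relative change in mean velocity, e.g. 0.25 means 25% faster."""
    if not before or not after:
        raise ValueError("need at least one sprint on each side")
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline velocity is zero")
    return (mean(after) - baseline) / baseline


# Illustrative numbers: three sprints before the paydown, three after.
change = velocity_change([20, 22, 18], [24, 26, 25])  # 0.25
```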
Q: Should we have a debt budget like infrastructure teams have operational budgets?
A: Yes. A good heuristic: 20% of engineering capacity goes to debt paydown or similar non-feature work (testing, observability, documentation). This keeps debt from accumulating uncontrollably while still prioritizing features.
Q: How do we prevent accumulating debt in the first place?
A: Code review, testing, and being honest about shipping fast vs. shipping right. The worst debt comes from shipping fast without a plan to pay back. Ship fast when you need to learn (deliberate debt). But then actually pay back. If debt goes unpaid for years, it's not debt anymore - it's just bad code.