By Arjun Mehta
Your payment system works. Only Alice understands it. Alice is good. She'll probably stay forever.
Probably isn't good enough.
Bus factor is the number of people who have to leave before your project stalls. If your payment system has bus factor 1, Alice leaving is a crisis.
Most teams have dangerously low bus factors: one person per critical system. That person either ends up overworked (because everything goes through them) or leaves. Either way, you're at risk.
Why This Matters
When bus factor is low, you're not really stable. You're dependent on luck.
If Alice stays, great. If Alice gets a better offer, leaves for personal reasons, or gets sick, you're stuck. Someone new has to learn the system. That's 3-6 months of reduced productivity. In the meantime, bugs in that system take longer to fix. Changes are risky. You're moving slowly.
The cost: $100K+ per key-person departure.
The solution: deliberately distribute knowledge so no critical system is understood by only one person.
How to Identify Low Bus Factor
Ask these questions for each critical system:
- How many people could troubleshoot this in production at 2 AM?
- If the main owner left tomorrow, could someone else maintain this?
- Could someone new learn this in less than 2 weeks?
If the answer to the first question is "only one person," or the answer to either of the others is "no," treat the bus factor as 1.
Critical systems typically include payment processing, authentication, the database, the API, and core business logic.
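The questions above can be turned into a quick screening script. This is only a minimal sketch: the `SystemCheck` fields, the `is_bus_factor_one` helper, and the sample answers are hypothetical, made up here to illustrate the check rather than taken from any real team.

```python
from dataclasses import dataclass


@dataclass
class SystemCheck:
    """Answers to the three screening questions for one system (hypothetical shape)."""
    name: str
    people_who_can_troubleshoot: int    # who could debug it in production at 2 AM
    maintainable_if_owner_leaves: bool  # could someone else maintain it tomorrow?
    learnable_in_two_weeks: bool        # could someone new learn it in under 2 weeks?


def is_bus_factor_one(check: SystemCheck) -> bool:
    """Flag the system if any single answer points to one-person dependence."""
    return (
        check.people_who_can_troubleshoot <= 1
        or not check.maintainable_if_owner_leaves
        or not check.learnable_in_two_weeks
    )


# Illustrative answers only; replace with your own team's.
checks = [
    SystemCheck("payment processing", 1, False, False),
    SystemCheck("authentication", 3, True, True),
]

at_risk = [c.name for c in checks if is_bus_factor_one(c)]
print(at_risk)
```

Running the same three questions across every critical system gives you a short, honest list of where to start.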
How to Raise Bus Factor
Strategy 1: Pair programming with intention. When Alice works on the payment system, pair her with Bob. Bob learns not just the code, but the mental model behind it. As a rough guide: after 40 hours of pairing, Bob is about 60% as effective as Alice; after 80 hours, about 80%; after 3 months, he is as effective as she is.
Strategy 2: Documentation with narrative. Don't just document "how." Document "why." "We chose this approach because X. We considered Y. This is why X wins." Why matters. With the why, new people can make decisions.
Strategy 3: Incidents as learning. When something breaks, don't have Alice fix it alone. Have Bob debug it with Alice. Bob learns the system through debugging.
Strategy 4: Planned rotation. Alice maintains the system for 6 months, then Bob joins as co-owner. After 12 months, rotate again so the next person starts learning. Knowledge spreads.
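One way to make a rotation concrete is to write the schedule down ahead of time. The sketch below is one reading of the rotation idea, not a prescribed process: it assumes a step every 6 months, with the newest person co-owning alongside the previous one. The `rotation_schedule` helper and the third name, Carol, are hypothetical.

```python
from typing import List, Tuple


def rotation_schedule(
    people: List[str], steps: int, months_per_step: int = 6
) -> List[Tuple[int, List[str]]]:
    """Return (start_month, owners) pairs for a rolling co-ownership rotation."""
    if len(people) < 2:
        raise ValueError("a rotation needs at least two people")
    schedule = []
    for step in range(steps):
        newcomer = people[step % len(people)]
        if step == 0:
            # The first owner starts out alone.
            owners = [newcomer]
        else:
            # The previous newcomer stays on as the experienced co-owner.
            experienced = people[(step - 1) % len(people)]
            owners = [experienced, newcomer]
        schedule.append((step * months_per_step, owners))
    return schedule


for start_month, owners in rotation_schedule(["Alice", "Bob", "Carol"], steps=4):
    print(f"month {start_month:>2}: {' + '.join(owners)}")
```

Putting the dates on a calendar matters more than the exact cadence; a rotation that is only an intention tends not to happen.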
Strategy 5: Simplify the code. Complex code is hard to learn. Simple code is easy to learn. Break large functions into small ones. Remove unnecessary abstraction. When the code is simple, bus factor goes up naturally.
What Good Bus Factor Looks Like
For critical systems: bus factor 3+. That means at least three people understand the system well enough that you can lose one, or even two, without stalling.
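To check a team against that target, it helps to write down who actually understands each system. The following is a minimal sketch that treats bus factor as the count of people who understand a system well; the system names, the people, and the helper functions are all hypothetical.

```python
from typing import Dict, List, Set

TARGET_BUS_FACTOR = 3  # the 3+ goal for critical systems


def bus_factor(knowledgeable: Set[str]) -> int:
    """People who would all have to leave before nobody understands the system."""
    return len(knowledgeable)


def below_target(
    systems: Dict[str, Set[str]], target: int = TARGET_BUS_FACTOR
) -> List[str]:
    """Systems whose bus factor is under the target, in alphabetical order."""
    return sorted(name for name, people in systems.items() if bus_factor(people) < target)


def stalled_after(systems: Dict[str, Set[str]], departed: Set[str]) -> List[str]:
    """Systems left with nobody who understands them once `departed` are gone."""
    return sorted(name for name, people in systems.items() if not (people - departed))


# Illustrative data only.
systems = {
    "payments": {"Alice"},
    "auth": {"Bob", "Carol", "Dan"},
    "api": {"Alice", "Bob"},
}

print(below_target(systems))
print(stalled_after(systems, {"Alice"}))
```

The second call is the "what if Alice leaves" question asked directly: any system it returns is one departure away from stalling.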
When bus factor is high:
- Code changes don't wait for one person
- Incidents get resolved by whoever is on call
- Onboarding is faster (more people can teach)
- People can take vacations
- Refactoring happens naturally
The Measurement Tool
Glue can surface bus factor by analyzing your codebase:
- Which systems have the fewest contributors?
- Which systems are most complex (harder to learn)?
- Who owns what?
Now you know where to focus knowledge distribution effort.
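For a rough first pass without any tooling, commit history gives a crude proxy for the "fewest contributors" question. The sketch below is not how Glue works; it simply counts distinct commit authors per top-level directory, assuming each top-level directory is one system and that the log text comes from `git log --format=@%an --name-only`. The `contributors_by_system` helper and the sample log are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Optional, Set


def contributors_by_system(log_text: str) -> Dict[str, Set[str]]:
    """Map each top-level directory to the set of authors who touched it.

    Expects lines starting with "@" to be author names and every other
    non-blank line to be a changed file path.
    """
    authors: Dict[str, Set[str]] = defaultdict(set)
    current_author: Optional[str] = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith("@"):
            current_author = line[1:]
        elif current_author is not None:
            # Files in the repository root count as their own "system" here.
            system = line.split("/", 1)[0]
            authors[system].add(current_author)
    return dict(authors)


# Hand-written sample in the assumed log format.
sample_log = """@Alice

payments/charge.py
payments/refund.py
@Bob

auth/login.py
@Carol

auth/session.py
"""

result = contributors_by_system(sample_log)
for system, people in sorted(result.items(), key=lambda item: len(item[1])):
    print(system, len(people), sorted(people))
```

On a real repository, the log text could come from `subprocess.run(["git", "log", "--format=@%an", "--name-only"], capture_output=True, text=True, check=True).stdout`. Keep in mind that having committed to a directory is not the same as understanding it, so treat the counts as a starting point for conversations rather than a score.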
Frequently Asked Questions
Q: Isn't specialization good?
Yes. Specialization is good. Bus factor of 1 is not. You can be a specialist and still have backup. Alice is deep on payments, and Bob is too, at roughly 80% of her depth. That's specialization with redundancy.
Q: How do we handle the cost of spreading knowledge?
Pair programming is slower than individual work in the short term. In the long term, you're much faster because you can parallelize work and you don't stall when someone leaves.
Q: What if someone doesn't want to share knowledge?
Address it directly. Make it clear: distributed knowledge isn't optional. It's for the team's resilience.