By Arjun Mehta
DORA Metrics: The Complete Guide for Engineering Leaders
DORA metrics are the gold standard for measuring engineering team performance. They were identified by Google's DevOps Research and Assessment program and validated across thousands of companies.
Unlike vanity metrics or activity metrics, DORA metrics measure outcomes that directly correlate with business success: how quickly you ship, how reliable your systems are, and how fast you recover from failures.
Yet most teams either ignore DORA metrics or measure them incorrectly.
This guide explains what the four DORA metrics actually measure, why they matter, how to measure them correctly, and how to improve them.
The Four DORA Metrics
1. Deployment Frequency
What it measures: How often you deploy code to production.
Why it matters: High deployment frequency means:
- You're shipping value constantly
- You have the infrastructure to deploy safely
- You can respond to problems quickly
- Your team is not blocked by long release cycles
- Elite teams: Deploy multiple times per day
- High performers: Deploy 1-7 times per week
- Medium performers: Deploy 1-4 times per month
- Low performers: Deploy every few months or less frequently
How to measure: Count production deployments per day/week/month.
Note: This is tricky to define. Does a canary deployment to 5% of users count? Does a feature flag change count? There's no universal answer: decide what counts as shipping value to users, and apply that definition consistently.
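The counting step can be sketched in a few lines of Python. The log format and the policy of excluding sub-100% canary rollouts are illustrative assumptions, not part of the DORA definition; substitute whatever "counts" for your team.

```python
from datetime import date

# Hypothetical deployment log: (date, environment, percent_of_traffic).
# Policy choice for this sketch: count only full production rollouts,
# excluding canaries below 100% traffic.
deploys = [
    (date(2024, 3, 4), "production", 100),
    (date(2024, 3, 4), "production", 5),    # canary: excluded by our policy
    (date(2024, 3, 5), "staging", 100),     # not production: excluded
    (date(2024, 3, 6), "production", 100),
    (date(2024, 3, 8), "production", 100),
]

def deployment_frequency(deploys, days_in_window):
    """Full production deployments per day over the window."""
    counted = [d for d in deploys if d[1] == "production" and d[2] == 100]
    return len(counted) / days_in_window

print(deployment_frequency(deploys, days_in_window=5))  # 3 deploys / 5 days = 0.6
```

The point of encoding the policy in code is that it gets applied the same way every week, instead of being re-argued in every retro.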
2. Lead Time for Changes
What it measures: How long from commit to production.
Why it matters: Lead time indicates:
- How quickly you can respond to customer needs
- How much friction is in your deployment process
- Whether you're blocked waiting for reviews, tests, or manual approvals
- Elite teams: < 1 hour commit to production
- High performers: 1-24 hours
- Medium performers: 1-7 days
- Low performers: > 1 week
How to measure: Track the time from commit merge to production deployment.
Gotcha: Be deliberate about whether waiting for reviews and manual approvals counts. If those steps are a required part of your delivery process, the time spent waiting for them is part of the system and belongs in your lead time; don't exclude it just because it's uncomfortable to see.
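A minimal calculation, assuming you can pair each merged change with the production deployment that shipped it (the record fields here are illustrative):

```python
from datetime import datetime
from statistics import median

# Hypothetical records pairing a merge timestamp with the production
# deployment that shipped it.
changes = [
    {"merged_at": datetime(2024, 3, 4, 9, 0),  "deployed_at": datetime(2024, 3, 4, 10, 30)},
    {"merged_at": datetime(2024, 3, 4, 11, 0), "deployed_at": datetime(2024, 3, 4, 11, 45)},
    {"merged_at": datetime(2024, 3, 5, 16, 0), "deployed_at": datetime(2024, 3, 6, 9, 0)},
]

def lead_times_hours(changes):
    return [(c["deployed_at"] - c["merged_at"]).total_seconds() / 3600
            for c in changes]

# Median rather than mean: one change that sat in a queue overnight
# shouldn't dominate the picture.
print(median(lead_times_hours(changes)))  # median of [1.5, 0.75, 17.0] hours = 1.5
```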
3. Mean Time to Recovery (MTTR)
What it measures: How long to fix a production incident.
Why it matters: Incidents happen. What matters is how fast you recover:
- Can you detect issues quickly?
- Can you identify the root cause?
- Can you deploy a fix?
- Elite teams: < 1 hour
- High performers: 1-24 hours
- Medium performers: 1-7 days
- Low performers: > 1 week
How to measure: Track time from incident detection to resolution in production.
Gotcha: This is hard to measure consistently. What counts as "resolution"? Code deployed? Monitoring shows recovery? Customer-facing impact gone?
Be consistent about your definition.
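Whatever definition you pick, encoding it once keeps it consistent. A sketch, assuming a simple incident log with detection and resolution timestamps (field names are illustrative):

```python
from datetime import datetime

# Hypothetical incident log. "resolved_at" follows whatever definition of
# resolution your team chose; the point is applying it the same way every time.
incidents = [
    {"detected_at": datetime(2024, 3, 1, 14, 0), "resolved_at": datetime(2024, 3, 1, 14, 40)},
    {"detected_at": datetime(2024, 3, 7, 2, 15), "resolved_at": datetime(2024, 3, 7, 5, 15)},
]

def mttr_hours(incidents):
    """Mean time from detection to resolution, in hours."""
    total_seconds = sum(
        (i["resolved_at"] - i["detected_at"]).total_seconds() for i in incidents
    )
    return total_seconds / len(incidents) / 3600

print(round(mttr_hours(incidents), 2))  # (40 min + 180 min) / 2 = 110 min, about 1.83 h
```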
4. Change Failure Rate
What it measures: What percentage of deployments cause problems requiring rollback or hotfix.
Why it matters: This measures the quality and safety of your deployments:
- Do you test thoroughly?
- Do you catch issues before production?
- Can you safely deploy frequently?
- Elite teams: 0-15% change failure rate
- High performers: 16-30%
- Medium performers: 31-45%
- Low performers: > 45%
How to measure: Count failed deployments / total deployments.
Gotcha: What counts as "failed"? A P2 bug that required a hotfix? A feature flag disable? A rollback? Define it clearly.
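One way to keep the definition stable is to encode it as data. This sketch assumes a deployment outcome log and an illustrative failure policy (rollbacks and hotfixes count, flag disables don't); adjust the set to match your own definition.

```python
# Policy choice for this sketch: which deployment outcomes count as failures.
FAILURE_OUTCOMES = {"rollback", "hotfix"}

# Hypothetical outcome log, one entry per production deployment.
outcomes = ["ok", "ok", "rollback", "ok", "hotfix", "ok", "ok", "ok", "ok", "ok"]

def change_failure_rate(outcomes):
    failed = sum(1 for o in outcomes if o in FAILURE_OUTCOMES)
    return failed / len(outcomes)

print(change_failure_rate(outcomes))  # 2 failures / 10 deployments = 0.2
```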
Why These Metrics?
These four metrics were chosen because they:
- Correlate with business outcomes: Companies with elite DORA metrics have higher customer satisfaction, faster time-to-market, and better financial performance.
- Don't reward bad behavior: You can't game these metrics by working overtime or cutting corners. Gaming one metric usually hurts the others.
- Are actionable: They point to specific areas for improvement (deployment frequency, test reliability, monitoring, incident response).
- Are culture-agnostic: They work for startups and enterprises, monoliths and microservices, waterfall and agile teams.
Anti-Patterns and Pitfalls
Anti-Pattern 1: Chasing Metrics Instead of Outcomes
If you optimize for deployment frequency at the expense of reliability, you'll ship broken code faster. That's not progress.
Right approach: Optimize all four metrics together. They should move in the same direction.
Anti-Pattern 2: Measuring Individual Contribution
"Our deployment frequency is low because developers aren't shipping enough code."
Wrong. Deployment frequency is a team and system metric. It's limited by:
- How fast tests run
- How complex code review is
- How much manual approval is required
- Whether infrastructure is reliable
Individual developers can't improve deployment frequency alone.
Anti-Pattern 3: Ignoring Context
A monolithic legacy system with a thousand services' worth of dependencies will have different DORA metrics than a greenfield microservices architecture. That's okay. Compare yourself to your own baseline, not to Google.
Anti-Pattern 4: Missing Data
Measuring DORA metrics correctly requires solid tooling and data:
- Version control integrations
- CI/CD system logs
- Incident tracking
- Production monitoring
If you're estimating these metrics by hand, your data is wrong.
How to Improve DORA Metrics
To Improve Deployment Frequency
- Reduce batch size: Deploy smaller changes more frequently
- Reduce lead time: Fix bottlenecks in your deployment process
- Increase automation: Remove manual approval steps
- Use feature flags: Deploy incomplete features hidden behind flags
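The feature-flag point deserves a concrete picture: code for an unfinished feature ships to production but stays dark until the flag flips. This is a minimal sketch with an in-memory dict standing in for flag storage; real systems use a flag service or config store.

```python
# In-memory flag store, for illustration only.
FLAGS = {"new_checkout": False}

def checkout(cart):
    # The incomplete feature is deployed but hidden behind the flag.
    if FLAGS.get("new_checkout", False):
        return "new checkout flow"
    return "legacy checkout flow"

print(checkout(cart=[]))       # legacy path while the flag is off
FLAGS["new_checkout"] = True   # flipping the flag needs no redeploy
print(checkout(cart=[]))       # new path
```

This decouples "deploy" from "release": you can deploy many times a day without exposing half-finished work.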
To Improve Lead Time
- Make tests fast: Slow tests create bottlenecks
- Parallelize testing: Run unit, integration, and E2E tests in parallel
- Reduce review time: Don't wait weeks for code review
- Automate gates: Let CI/CD check for quality, not humans
To Improve MTTR
- Improve monitoring: Detect issues faster
- Invest in incident response: Document runbooks, practice incident response
- Make rollback easy: Zero-downtime deployments, reversible migrations
- Reduce complexity: Simpler systems are easier to debug
To Improve Change Failure Rate
- Improve testing: Better test coverage, faster feedback
- Use canary deployments: Detect issues before they hit all users
- Automate code quality checks: Lint, type checks, security scans
- Reduce blast radius: Deploy to small subset first
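"Deploy to a small subset first" usually means routing a percentage of users to the new version. A common sketch is hashing each user ID to a stable bucket, so the same user consistently sees the same version as the rollout percentage ramps up (the function and its parameters are illustrative):

```python
import hashlib

def in_canary(user_id: str, rollout_percent: int) -> bool:
    """Stable assignment: the same user always lands in the same bucket."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_percent

print(in_canary("user-42", 0))    # False: nobody is in the canary at 0%
print(in_canary("user-42", 100))  # True: everyone is in at 100%
```

Ramp `rollout_percent` from 1 to 100 while watching your error rates; any regression hits a small blast radius first.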
The Counter-Intuitive Truth About DORA Metrics
Elite teams deploy more frequently AND have lower change failure rates. They're not trading off quality for speed.
Why?
- Smaller deployments are safer: A 10-line commit is less risky than a 1000-line commit
- Better infrastructure: Elite teams invest in testing, monitoring, and CI/CD
- Better practices: Code review, feature flags, canary deployments
- Better culture: Blameless incident response, continuous learning
If you're trading off quality for frequency, you're doing it wrong.
Measuring DORA Metrics in Practice
Setup
- Version control: Ensure all deployments are tracked in your VCS
- CI/CD logging: Capture timestamps for builds and production deployments
- Incident tracking: Log incidents with timestamps
- Monitoring: Track when issues are detected
Calculation
- Deployment Frequency: Count production deployments per week/month
- Lead Time: Measure time from commit to deployment for a random sample of deployments
- MTTR: Measure time from incident detection to resolution for a sample of incidents
- Change Failure Rate: Count failed deployments / total deployments
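Once the raw numbers exist, mapping them to the performance bands quoted earlier in this guide is mechanical. A sketch for lead time, using those bands; the other three metrics follow the same pattern with their own thresholds:

```python
def lead_time_tier(hours: float) -> str:
    """Classify lead time against the bands quoted earlier in this guide."""
    if hours < 1:
        return "elite"       # < 1 hour commit to production
    if hours <= 24:
        return "high"        # 1-24 hours
    if hours <= 24 * 7:
        return "medium"      # 1-7 days
    return "low"             # > 1 week

print(lead_time_tier(0.5))  # elite
print(lead_time_tier(36))   # medium
```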
Tools
- GitLab CI: Built-in DORA metrics dashboards
- GitHub + custom integration: Use GitHub API + monitoring data
- Jira + custom integration: Track deployments and incidents
- Datadog, New Relic: Some monitoring platforms calculate DORA metrics
The Bigger Picture
DORA metrics are one lens on team performance. They measure engineering velocity and reliability.
They don't measure:
- Product decisions (are you building the right thing?)
- Code quality (is the code maintainable?)
- Technical debt (are you accumulating debt?)
- Developer satisfaction (are people happy?)
Use DORA metrics alongside other metrics:
- Code quality metrics: Test coverage, code review comments, refactoring rate
- Product metrics: Customer satisfaction, feature adoption, time-to-value
- Team metrics: Developer satisfaction, onboarding time, retention
DORA metrics tell you how fast. You need other metrics to know how well, and toward what.
Getting Started
- Establish baselines: Measure where you are now
- Pick the biggest bottleneck: Which metric is worst?
- Identify the root cause: Why is it bad?
- Make one improvement: Fix one thing at a time
- Remeasure: Did it improve?
- Repeat: Continuous improvement
DORA metrics aren't a destination. They're a framework for continuous improvement toward engineering excellence.
Frequently Asked Questions
Q: Should we compare our metrics to other companies?
A: Only for context. Your 2 deployments per day might be elite for your domain and low for another. Compare yourself to your own baseline and your previous quarter.

Q: Our change failure rate is high. Does that mean we should deploy less?
A: No. Deploying less won't fix quality. Fix the root cause: unreliable tests, missing monitoring, or lack of testing discipline. Then deploy more frequently with confidence.

Q: What if we can't measure some metrics accurately?
A: Start with what you can measure. Even rough measurements are better than none. Improve measurement over time as you invest in tooling.