Standard: Change Failure Rate (CFR)
Description
Change Failure Rate (CFR) measures the percentage of production changes that result in a failure, such as an incident, degraded service, or an urgent fix. It is one of the four key DORA metrics and reflects the quality and reliability of your delivery process.
A low CFR indicates strong testing, effective review, and stable deployment practices. CFR helps teams balance speed with safety and build confidence in frequent releases.
How to Use
What to Measure
- Track the total number of production changes (deployments) and how many of them caused a failure.
- Failures include P1/P2 incidents, customer-visible errors, rollbacks, or hotfixes.
CFR = (Failed Changes / Total Changes) × 100
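The formula can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function name `change_failure_rate` is hypothetical, and returning 0% for a period with no deployments is an assumption that a team may prefer to handle differently (for example by reporting "no data").

```python
def change_failure_rate(failed_changes: int, total_changes: int) -> float:
    """Return CFR as a percentage for one reporting period."""
    if failed_changes < 0 or total_changes < 0:
        raise ValueError("counts must be non-negative")
    if failed_changes > total_changes:
        raise ValueError("failed_changes cannot exceed total_changes")
    if total_changes == 0:
        # Assumption: report 0% for an empty period instead of dividing by zero.
        return 0.0
    return 100 * failed_changes / total_changes


# Worked example: 6 failed deployments out of 48 in the period.
print(change_failure_rate(6, 48))  # 12.5
```

Both counts should cover the same period and the same scope (for example one service or one team), otherwise the ratio is not comparable between reports.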
Instrumentation Tips
- Use incident tracking systems to flag failed deploys.
- Define a consistent taxonomy for “failure” (e.g. severity levels, SLA breaches).
- Correlate deployment events with incident timelines or rollback activity.
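One possible way to correlate deployments with incidents is a time-window heuristic: attribute each qualifying incident to the most recent deployment of the same service within a fixed window. The sketch below assumes hypothetical `Deployment` and `Incident` records, a failure taxonomy of P1/P2, and a 24-hour attribution window. All of these are illustrative choices, not part of this standard, and real attribution often needs human review.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Deployment:
    deploy_id: str
    service: str
    deployed_at: datetime


@dataclass(frozen=True)
class Incident:
    incident_id: str
    service: str
    severity: str  # e.g. "P1", "P2", "P3"
    started_at: datetime


FAILURE_SEVERITIES = {"P1", "P2"}         # assumption: the team's failure taxonomy
ATTRIBUTION_WINDOW = timedelta(hours=24)  # assumption: tune per team


def failed_deploy_ids(deployments, incidents):
    """Attribute each qualifying incident to the latest prior deploy of the same service."""
    failed = set()
    for inc in incidents:
        if inc.severity not in FAILURE_SEVERITIES:
            continue
        candidates = [
            d for d in deployments
            if d.service == inc.service
            and timedelta(0) <= inc.started_at - d.deployed_at <= ATTRIBUTION_WINDOW
        ]
        if candidates:
            failed.add(max(candidates, key=lambda d: d.deployed_at).deploy_id)
    return failed


# Illustrative data only.
deployments = [
    Deployment("d1", "api", datetime(2024, 3, 1, 10, 0)),
    Deployment("d2", "api", datetime(2024, 3, 2, 9, 0)),
    Deployment("d3", "web", datetime(2024, 3, 2, 12, 0)),
    Deployment("d4", "api", datetime(2024, 3, 5, 12, 0)),
]
incidents = [
    Incident("i1", "api", "P1", datetime(2024, 3, 2, 9, 30)),   # attributed to d2
    Incident("i2", "web", "P3", datetime(2024, 3, 2, 12, 15)),  # below failure threshold
    Incident("i3", "api", "P2", datetime(2024, 3, 4, 15, 0)),   # outside the window
]

failed = failed_deploy_ids(deployments, incidents)
print(sorted(failed))                         # ['d2']
print(100 * len(failed) / len(deployments))   # 25.0
```

A window that is too short misses slow-burning failures, while one that is too long blames unrelated deployments, so the window is worth revisiting as part of the failure taxonomy.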
DORA Benchmarks
| Performance Level | Change Failure Rate |
| --- | --- |
| Elite | 0–15% |
| High | 0–15% |
| Medium | 16–30% |
| Low | 46–60% |
Why It Matters
- Quality signal: Reflects deployment health and reliability.
- Customer trust: High CFR leads to more disruptions and lower satisfaction.
- Team confidence: Frequent failures erode confidence in the release process.
- Learning: CFR insights help guide investment in testing, observability, and resilience.
Best Practices
- Invest in robust automated testing before merge and deploy.
- Use progressive delivery (e.g. canary, blue-green, feature flags).
- Instrument for observability and alert on post-deploy anomalies.
- Standardise rollback processes for failed changes.
- Review failures in blameless postmortems and track recurring patterns.
Common Pitfalls
- Using inconsistent or vague definitions of “failure.”
- Tracking only rolled-back changes (some failures persist unnoticed).
- Ignoring near-misses or degraded performance incidents.
- Overfocusing on CFR at the expense of deployment frequency.
Signals of Success
- Fewer incidents caused by changes, even with frequent deployments.
- Failure causes are understood and addressed at the root.
- Post-deploy issues trend downward over time.
- Testing and monitoring practices evolve based on CFR insights.
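To check whether post-deploy issues are trending downward, CFR can be grouped by time period. The following is a rough sketch that buckets changes by ISO week. The function name `weekly_cfr` and the `(deploy_date, failed)` input shape are assumptions for illustration, and a real pipeline would read these records from deployment and incident tooling.

```python
from collections import defaultdict
from datetime import date


def weekly_cfr(changes):
    """changes: iterable of (deploy_date, failed) pairs.

    Returns {(iso_year, iso_week): CFR percentage}, ordered by week.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for deploy_date, failed in changes:
        iso = deploy_date.isocalendar()
        key = (iso[0], iso[1])  # (ISO year, ISO week number)
        totals[key] += 1
        failures[key] += int(failed)
    return {k: 100 * failures[k] / totals[k] for k in sorted(totals)}


# Illustrative data only: 4 changes in ISO week 10, 5 changes in ISO week 11.
changes = [
    (date(2024, 3, 4), False), (date(2024, 3, 5), True),
    (date(2024, 3, 7), False), (date(2024, 3, 8), False),
    (date(2024, 3, 11), False), (date(2024, 3, 12), False),
    (date(2024, 3, 13), True), (date(2024, 3, 14), False),
    (date(2024, 3, 15), False),
]

for week, cfr in weekly_cfr(changes).items():
    print(week, cfr)
# (2024, 10) 25.0
# (2024, 11) 20.0
```

Weeks with very few deployments produce noisy percentages, so it may be worth reading the trend alongside deployment counts or over a longer window.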
Related Measures
- [[CoE/Measures/Delivery Performance/Deployment Frequency]]
- [[Lead Time for Change]]
- [[Mean Time to Recovery (MTTR)]]
- [[Postmortem Completion Rate]]
Aligned Industry Research
- State of DevOps / DORA (2021–2023): Elite performers keep CFR at or below 15% while deploying multiple times per day.
- Accelerate (Forsgren et al.): CFR is one of the key measures of software delivery performance, which correlates with organisational outcomes such as profitability, productivity, and customer satisfaction.