Standard: AI models are deployed via automated, repeatable pipelines
Purpose and Strategic Importance
This standard requires that AI models be deployed through automated, version-controlled pipelines that are consistent across environments and repeatable without manual intervention. It supports the policy of reducing time from data to deployed intelligence by eliminating the manual, error-prone handoffs that make model deployment slow and risky. When deployment is a push-button, auditable process, teams can ship improvements frequently and safely rather than treating each release as a high-stakes, all-hands exercise.
Strategic Impact
- Reduces mean deployment time from days or weeks to hours or minutes, enabling faster iteration on model improvements
- Eliminates manual deployment steps that are sources of error, inconsistency, and knowledge concentration risk
- Creates an audit trail of every deployment action, supporting governance and incident investigation
- Enables confident rollback to known-good model versions within minutes when issues are detected in production
- Builds the infrastructure foundation for continuous delivery of AI improvements without proportionally increasing engineering overhead
Risks of Not Having This Standard
- Manual deployment processes become bottlenecks that limit how frequently AI improvements can reach production
- Deployment errors caused by manual steps damage production systems and erode stakeholder confidence
- Knowledge of how to deploy a specific model concentrates in individuals, creating single points of failure
- Inconsistencies between environments caused by undocumented manual steps produce silent failures in production
- Rollback is slow and uncertain when deployment steps were not automated and documented, extending incident resolution time
CMMI Maturity Model
Level 1 – Initial
| Category | Description |
| --- | --- |
| People & Culture | Model deployment is performed manually by a small number of individuals; steps are undocumented and tribal |
| Process & Governance | No deployment standard; each model release is a bespoke exercise requiring coordination across multiple teams |
| Technology & Tools | Models are deployed by copying files and updating configurations manually; there is no pipeline infrastructure |
| Measurement & Metrics | Deployment frequency and lead time are not tracked; the team has no visibility into deployment throughput |
Level 2 – Managed
| Category | Description |
| --- | --- |
| People & Culture | Deployment steps are documented in a runbook; multiple team members are capable of executing a deployment |
| Process & Governance | A deployment checklist is in use; deployments require a review sign-off before execution |
| Technology & Tools | Basic scripting automates the most error-prone manual steps; models are deployed from a central artefact store |
| Measurement & Metrics | Deployment lead time and error rate are tracked manually; the team reviews deployment metrics in retrospectives |
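A Level 2 deployment script can be very small and still remove the worst manual failure modes. The sketch below, using only the Python standard library, fetches a versioned model artefact from a central store, verifies its checksum against a manifest, and stages it for serving; the store layout, file names, and manifest schema are illustrative assumptions, not a prescribed convention.

```python
"""Minimal Level 2 deployment script sketch: pull a versioned artefact from a
central store, verify its integrity, and stage it for serving. All paths and
the manifest layout are assumptions for illustration."""
import hashlib
import json
import shutil
from pathlib import Path

ARTEFACT_STORE = Path("/srv/model-store")   # hypothetical central artefact store
SERVING_DIR = Path("/srv/serving/current")  # hypothetical serving location


def sha256(path: Path) -> str:
    """Hash the artefact so a truncated or corrupted copy cannot reach production."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()


def deploy(model_name: str, version: str,
           store: Path = ARTEFACT_STORE, serving: Path = SERVING_DIR) -> None:
    version_dir = store / model_name / version
    artefact = version_dir / "model.bin"
    manifest = json.loads((version_dir / "manifest.json").read_text())
    # Checksum verification replaces the most error-prone manual copy step.
    if sha256(artefact) != manifest["sha256"]:
        raise RuntimeError(f"checksum mismatch for {model_name} {version}")
    serving.mkdir(parents=True, exist_ok=True)
    shutil.copy2(artefact, serving / "model.bin")
    # Record exactly what went live, for the retrospective metrics review.
    (serving / "DEPLOYED_VERSION").write_text(version)
```

Even this much automation makes every deployment identical and leaves a version record behind, which is the raw material for the Level 2 metrics reviews described above.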
Level 3 – Defined
| Category | Description |
| --- | --- |
| People & Culture | Automated deployment is the team norm; manual deployments are treated as exceptions requiring justification |
| Process & Governance | A CI/CD pipeline for model deployment is defined and enforced; all deployments must pass automated quality gates |
| Technology & Tools | A full MLOps pipeline covers model packaging, environment provisioning, canary or blue-green deployment, and automated smoke testing |
| Measurement & Metrics | Deployment frequency, lead time, and change failure rate are measured and reported continuously |
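The "automated quality gates" at Level 3 can be modelled as an ordered sequence of checks that every candidate must pass before promotion. The sketch below is a minimal gate runner, not any specific CI system's API; the gate names and thresholds (accuracy floor, latency budget, smoke test) are illustrative assumptions.

```python
"""Sketch of a Level 3 quality-gate runner: every deployment candidate passes
the same ordered gates, and a single failure blocks promotion. Gate names and
thresholds are illustrative assumptions."""
from dataclasses import dataclass
from typing import Callable


@dataclass
class GateResult:
    name: str
    passed: bool


def run_pipeline(candidate: dict,
                 gates: list[tuple[str, Callable[[dict], bool]]]) -> list[GateResult]:
    """Run gates in order; stop at the first failure so nothing unvetted ships."""
    results = []
    for name, check in gates:
        ok = check(candidate)
        results.append(GateResult(name, ok))
        if not ok:
            break  # fail fast: later gates are skipped and deployment is blocked
    return results


def may_deploy(results: list[GateResult]) -> bool:
    return bool(results) and all(r.passed for r in results)


# Illustrative gates: evaluation accuracy floor, latency budget, packaging smoke test.
DEFAULT_GATES = [
    ("accuracy_floor", lambda c: c["eval_accuracy"] >= 0.90),
    ("latency_budget", lambda c: c["p99_latency_ms"] <= 200),
    ("smoke_test", lambda c: c.get("smoke_test_passed", False)),
]
```

Because the gates are data rather than ad-hoc script logic, the same list can be enforced across every model pipeline, which is exactly what "defined and enforced" means at this level.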
Level 4 – Quantitatively Managed
| Category | Description |
| --- | --- |
| People & Culture | Teams own and improve their deployment pipelines; pipeline performance is a first-class engineering metric |
| Process & Governance | Deployment SLAs are defined per model risk tier; pipeline performance is reviewed in engineering governance |
| Technology & Tools | Progressive deployment strategies (shadow mode, canary, feature flags) are applied by default; rollback is fully automated |
| Measurement & Metrics | Deployment lead time, mean time to recovery, and deployment frequency are benchmarked against DORA targets for AI systems |
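The core control loop behind "progressive deployment with fully automated rollback" is small. The sketch below shifts traffic to a canary in steps and rolls back without human intervention if the observed error rate breaches a budget; the traffic steps, error budget, and the `promote`/`rollback`/`error_rate_at` callables are assumptions standing in for a real traffic router and monitoring system.

```python
"""Sketch of Level 4 progressive delivery: shift traffic to the new model in
steps and roll back automatically when its error rate breaches the budget.
The steps, budget, and callables are illustrative assumptions."""


def canary_rollout(promote, rollback, error_rate_at,
                   steps=(0.05, 0.25, 0.50, 1.0), error_budget=0.02):
    """Return the final traffic fraction reached (0.0 means rolled back)."""
    for fraction in steps:
        promote(fraction)  # route this share of traffic to the canary
        if error_rate_at(fraction) > error_budget:
            rollback()     # automated rollback, no human in the loop
            return 0.0
    return steps[-1]
```

Because the rollback decision is made by the loop itself, mean time to recovery is bounded by one monitoring interval rather than by how quickly an on-call engineer can be paged, which is what makes the DORA MTTR targets above achievable.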
Level 5 – Optimising
| Category | Description |
| --- | --- |
| People & Culture | Teams share pipeline improvements across the organisation; deployment engineering is treated as a strategic capability |
| Process & Governance | Pipeline standards evolve continuously based on incident learnings and advances in MLOps tooling |
| Technology & Tools | Self-healing pipelines detect and recover from failures automatically; AI-assisted pipeline optimisation identifies bottlenecks |
| Measurement & Metrics | Deployment pipeline metrics are used to forecast delivery capacity and inform team resourcing decisions |
Key Measures
- Percentage of AI model deployments executed via the automated pipeline (versus manual intervention)
- Mean deployment lead time from model approval to production availability
- Deployment success rate (deployments completed without rollback or incident)
- Mean time to rollback a failed deployment to a known-good model version
- Deployment frequency per model over a rolling 90-day window
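All five measures can be derived from a single append-only deployment log. The sketch below computes them from a list of records; the field names (`approved_at`, `live_at`, `automated`, `rolled_back_at`) are an assumed schema for illustration, and the caller is expected to filter records to the rolling 90-day window before calling.

```python
"""Sketch of computing the key measures from a deployment log. The record
schema (approved_at, live_at, automated, rolled_back_at) is an assumption,
not a prescribed format."""
from datetime import datetime, timedelta


def key_measures(deployments: list[dict]) -> dict:
    total = len(deployments)
    automated = sum(d["automated"] for d in deployments)
    lead_times = [d["live_at"] - d["approved_at"] for d in deployments]
    rollbacks = [d for d in deployments if d.get("rolled_back_at")]
    return {
        # Percentage of deployments executed via the automated pipeline
        "automated_pct": 100.0 * automated / total,
        # Mean lead time from model approval to production availability
        "mean_lead_time": sum(lead_times, timedelta()) / total,
        # Deployments completed without rollback
        "success_rate_pct": 100.0 * (total - len(rollbacks)) / total,
        # Mean time from going live to rollback, for failed deployments
        "mean_time_to_rollback": (
            sum((d["rolled_back_at"] - d["live_at"] for d in rollbacks), timedelta())
            / len(rollbacks) if rollbacks else None
        ),
        # Deployment frequency: count within the (pre-filtered) rolling window
        "deploys_in_window": total,
    }
```

Keeping the measures as pure functions of the log means the same numbers can be reproduced for any historical window, which supports the continuous reporting required from Level 3 upwards.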