Standard: Infrastructure is scalable, automated, and adaptable to changing demands
Purpose and Strategic Importance
Infrastructure that is manually provisioned, inconsistently configured, and unable to scale without human intervention directly constrains delivery speed and system reliability. This standard requires that all infrastructure be defined as code, provisioned on demand, and capable of scaling horizontally to meet changing load without manual intervention. Infrastructure as Code (IaC), using tools such as Terraform or Pulumi, makes environments reproducible, version-controlled, and auditable. It guards against environment drift, reduces the risk of configuration-related incidents, and enables teams to spin up and tear down environments in minutes rather than days.
Aligned to our "Engineering Excellence First" policy, this standard recognises infrastructure as a first-class engineering concern, not an operational afterthought. Cloud-native patterns, including auto-scaling, containerisation, and ephemeral environments, allow teams to match resource allocation to actual demand, reducing cost while improving resilience. Environment parity, meaning that development, staging, and production environments are structurally identical, removes the "it works on my machine" class of incident and accelerates the path from code to confident deployment. Infrastructure that cannot keep pace with delivery ambitions will always constrain the speed of the teams that depend on it.
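As an illustration of what "structurally identical" can mean in practice, the sketch below compares two environment definitions and reports differences, while ignoring settings that are expected to vary by environment. The environment dictionaries, their keys, and the `ALLOWED_TO_DIFFER` set are hypothetical; a real check would read its inputs from the state or plan output of whichever IaC tool is in use.

```python
from typing import Any

# Hypothetical: settings that may legitimately differ between environments.
ALLOWED_TO_DIFFER = {"instance_count", "instance_size"}


def parity_violations(
    reference: dict[str, Any], candidate: dict[str, Any], path: str = ""
) -> list[str]:
    """Return readable differences between two environment definitions."""
    violations: list[str] = []
    for key in sorted(set(reference) | set(candidate)):
        location = f"{path}.{key}" if path else key
        if key in ALLOWED_TO_DIFFER:
            continue  # sizing is allowed to vary, structure is not
        if key not in candidate:
            violations.append(f"{location}: missing in candidate")
        elif key not in reference:
            violations.append(f"{location}: missing in reference")
        elif isinstance(reference[key], dict) and isinstance(candidate[key], dict):
            # Recurse into nested component definitions.
            violations.extend(
                parity_violations(reference[key], candidate[key], location)
            )
        elif reference[key] != candidate[key]:
            violations.append(f"{location}: {reference[key]!r} != {candidate[key]!r}")
    return violations


# Hypothetical environment definitions, for illustration only.
production = {
    "database": {"engine": "postgres", "version": "15", "instance_size": "large"},
    "web": {"runtime": "python3.12", "instance_count": 6},
    "cache": {"engine": "redis"},
}
staging = {
    "database": {"engine": "postgres", "version": "14", "instance_size": "small"},
    "web": {"runtime": "python3.12", "instance_count": 1},
}

for violation in parity_violations(production, staging):
    print(violation)
```

With these sample definitions the check reports that staging has no cache and runs a different database version, but it does not flag the smaller instance size or count. Deciding which settings belong in the allowed-to-differ set is the judgement call each team would need to make.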
Strategic Impact
- Eliminates manual infrastructure provisioning as a bottleneck to team delivery and environment availability
- Ensures environment consistency through code-defined configuration, eliminating drift and configuration-related incidents
- Enables rapid scaling to meet demand without requiring human intervention or emergency capacity requests
- Reduces cloud spend waste through on-demand provisioning and auto-scaling aligned to actual usage patterns
Risks of Not Having This Standard
- Manual infrastructure processes create long lead times for new environments, directly blocking delivery pipelines
- Configuration drift between environments leads to incidents that are difficult to diagnose and reproduce
- Systems unable to scale horizontally under load result in degraded performance, outages, and customer impact
- Infrastructure knowledge becomes siloed in individuals, creating single points of failure and a low bus factor
- Cost inefficiency grows as over-provisioned static infrastructure accumulates without automated lifecycle management
CMMI Maturity Model
Level 1 – Initial
| Category | Description |
| --- | --- |
| People & Culture | Infrastructure is managed by a small operations team relying on manual processes and tribal knowledge. |
| Process & Governance | Environment provisioning is ad hoc, undocumented, and relies on individual expertise. |
| Technology & Tools | Servers and environments are provisioned manually via console, scripts, or direct SSH access. |
| Measurement & Metrics | Infrastructure costs, availability, and provisioning lead times are not systematically tracked. |
Level 2 – Managed
| Category | Description |
| --- | --- |
| People & Culture | Operations and engineering teams begin collaborating on infrastructure provisioning needs. |
| Process & Governance | Some infrastructure is scripted, but processes remain largely manual and team-specific. |
| Technology & Tools | Basic cloud services are adopted but provisioned through the console rather than code. |
| Measurement & Metrics | Environment availability is tracked; provisioning lead times are measured informally. |
Level 3 – Defined
| Category | Description |
| --- | --- |
| People & Culture | Engineers treat infrastructure as code and contribute to IaC repositories alongside application code. |
| Process & Governance | All environments are provisioned via IaC tools (Terraform, Pulumi, or equivalent) with peer review. |
| Technology & Tools | Auto-scaling policies are configured for production workloads; container orchestration is adopted. |
| Measurement & Metrics | Provisioning lead time, scaling events, and environment parity compliance are actively measured. |
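To make the idea of an auto-scaling policy concrete, the sketch below computes a desired replica count from an observed metric and a target, using the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current × observed / target)). The function name, the replica bounds, and the tolerance default are assumptions chosen for illustration; managed cloud auto-scalers expose comparable settings through their own configuration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,   # assumed lower bound
    max_replicas: int = 10,  # assumed upper bound
    tolerance: float = 0.1,  # assumed dead band to avoid flapping
) -> int:
    """Proportional scaling: replicas grow or shrink with metric / target."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")

    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to target: keep the current size.
        proposed = current_replicas
    else:
        proposed = math.ceil(current_replicas * ratio)

    # Never go outside the configured bounds.
    return max(min_replicas, min(max_replicas, proposed))


# Average CPU at 90% against a 60% target: scale out from 4 to 6.
print(desired_replicas(4, 90.0, 60.0))
# Average CPU at 15% against a 60% target: scale in from 4 to 1.
print(desired_replicas(4, 15.0, 60.0))
```

The tolerance band and the upper bound are the two settings most worth reviewing per workload: the first trades responsiveness against churn, and the second caps spend if the metric misbehaves.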
Level 4 – Quantitatively Managed
| Category | Description |
| --- | --- |
| People & Culture | Teams own their infrastructure end-to-end using self-service provisioning via platform tooling. |
| Process & Governance | Infrastructure changes follow the same review and automated testing processes as application code. |
| Technology & Tools | Ephemeral environments are created per branch or pull request and destroyed automatically on merge. |
| Measurement & Metrics | Scaling latency, provisioning time, and cost-per-environment are tracked with defined improvement targets. |
Level 5 – Optimising
| Category | Description |
| --- | --- |
| People & Culture | Infrastructure engineering is a core competency distributed across all product delivery teams. |
| Process & Governance | Infrastructure policies are enforced automatically via policy-as-code tools (e.g., OPA, Sentinel). |
| Technology & Tools | Intelligent auto-scaling uses predictive models to pre-warm capacity ahead of anticipated demand. |
| Measurement & Metrics | Infrastructure cost efficiency, elasticity, and resilience metrics continuously improve through feedback loops. |
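Policy-as-code tools such as OPA and Sentinel have their own policy languages. The sketch below uses plain Python only to illustrate the underlying idea: planned resources are evaluated against machine-readable rules before anything is applied, and any violation blocks the change. The `Resource` shape, the rule functions, and the required tag names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Resource:
    """Hypothetical, simplified view of a planned infrastructure resource."""
    name: str
    kind: str
    tags: dict[str, str] = field(default_factory=dict)
    publicly_accessible: bool = False


# A rule returns a violation message, or None when the resource complies.
Rule = Callable[[Resource], Optional[str]]

REQUIRED_TAGS = {"owner", "cost-centre"}  # assumed organisational convention


def require_tags(resource: Resource) -> Optional[str]:
    missing = sorted(REQUIRED_TAGS - set(resource.tags))
    if missing:
        return f"{resource.name}: missing required tags {missing}"
    return None


def deny_public_databases(resource: Resource) -> Optional[str]:
    if resource.kind == "database" and resource.publicly_accessible:
        return f"{resource.name}: databases must not be publicly accessible"
    return None


RULES: list[Rule] = [require_tags, deny_public_databases]


def evaluate(plan: list[Resource]) -> list[str]:
    """Run every rule against every planned resource and collect violations."""
    violations: list[str] = []
    for resource in plan:
        for rule in RULES:
            message = rule(resource)
            if message is not None:
                violations.append(message)
    return violations


plan = [
    Resource("orders-db", "database", {"owner": "payments"}, publicly_accessible=True),
    Resource("web", "service", {"owner": "storefront", "cost-centre": "cc-42"}),
]

for message in evaluate(plan):
    print(message)
```

In a pipeline, a non-empty result would fail the run before provisioning starts, which is what makes enforcement automatic rather than dependent on reviewer attention.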
Key Measures
- Environment provisioning lead time: time from request to fully available environment
- Infrastructure drift incidents: number of incidents attributable to environment configuration inconsistency
- Auto-scaling response time: time for infrastructure to scale in response to load changes
- IaC coverage: percentage of infrastructure defined and managed as code rather than manually provisioned
- Mean time to provision: average time to build a new environment from scratch using IaC pipelines
- Infrastructure cost variance: deviation between provisioned capacity cost and actual demand cost
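Several of these measures reduce to simple arithmetic once the underlying data is collected. The sketch below shows one possible way to compute three of them. The function names, the input shapes, and the choice to express cost variance as a percentage of demand cost are assumptions, not definitions taken from this standard.

```python
from datetime import datetime
from statistics import mean


def iac_coverage_percent(managed_as_code: int, total_resources: int) -> float:
    """Share of resources defined and managed as code."""
    if total_resources == 0:
        return 0.0
    return 100.0 * managed_as_code / total_resources


def mean_lead_time_minutes(requests: list[tuple[datetime, datetime]]) -> float:
    """Average time from request to fully available environment."""
    return mean(
        (available - requested).total_seconds() / 60
        for requested, available in requests
    )


def cost_variance_percent(provisioned_cost: float, demand_cost: float) -> float:
    """Over-provisioning relative to what actual demand required (assumed definition)."""
    if demand_cost <= 0:
        raise ValueError("demand_cost must be positive")
    return 100.0 * (provisioned_cost - demand_cost) / demand_cost


# Hypothetical sample data, for illustration only.
requests = [
    (datetime(2024, 1, 8, 10, 0), datetime(2024, 1, 8, 10, 12)),
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 14, 8)),
]

print(iac_coverage_percent(180, 200))         # 90.0
print(mean_lead_time_minutes(requests))       # 10.0
print(cost_variance_percent(1300.0, 1000.0))  # 30.0
```

Whatever definitions are adopted, recording them alongside the metric keeps trends comparable between teams and over time.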