Standard: AI systems provide explainable outputs for high-stakes decisions
Purpose and Strategic Importance
This standard requires that AI systems involved in high-stakes decisions (those with material consequences for individuals, business operations, or safety) produce outputs accompanied by an explanation that a human reviewer or affected party can understand and act upon. It supports the policy of designing for explainability, not just accuracy, by making explainability a functional requirement rather than an afterthought. A model with superior accuracy but no explainability is often less deployable in practice than a slightly less accurate model whose reasoning can be interrogated and defended.
Strategic Impact
- Enables human reviewers to make informed decisions about whether to accept, modify, or override AI outputs
- Supports regulatory compliance in jurisdictions that grant individuals the right to explanation for automated decisions
- Builds user and stakeholder trust by making AI reasoning visible, challengeable, and auditable
- Facilitates model debugging and improvement by exposing which features and patterns drive specific outcomes
- Reduces the liability exposure of deploying AI in consequential contexts by creating an accountable reasoning record
Risks of Not Having This Standard
- Human reviewers cannot meaningfully oversee AI decisions they cannot understand, reducing the value of the human-in-the-loop control
- Regulatory challenges succeed when the organisation cannot explain the basis of an automated decision to regulators or courts
- Users who are adversely affected by AI decisions cannot challenge them effectively without access to a clear explanation
- Model debugging is slow and imprecise because feature importance and decision pathways are opaque
- The organisation develops a culture of deferred accountability in which AI decisions are trusted without question because they cannot be interrogated
CMMI Maturity Model
Level 1 – Initial
| Category | Description |
| --- | --- |
| People & Culture | Explainability is not considered a design requirement; models are evaluated purely on predictive performance |
| Process & Governance | No explainability requirement in the AI design or deployment process; black-box models are used without restriction |
| Technology & Tools | No explainability tooling; model reasoning is entirely opaque to operators and users |
| Measurement & Metrics | Explainability is not measured; there is no baseline for what level of explanation is available or usable |
Level 2 – Managed
| Category | Description |
| --- | --- |
| People & Culture | Teams identify high-stakes use cases where explainability is expected by users or regulators |
| Process & Governance | A requirement to provide feature-level explanation for high-stakes AI decisions is added to design standards |
| Technology & Tools | SHAP or LIME is applied post-hoc to generate feature importance explanations for individual predictions |
| Measurement & Metrics | Availability of explanations for high-stakes decisions is tracked; the team reviews a sample of explanations for intelligibility |
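To make feature-level explanation concrete, the sketch below hand-rolls per-prediction attribution for a linear scorer, where the contribution of each feature is exactly weight × (feature − baseline); this is the value that SHAP-style attributions reduce to in the linear case. The weights, baseline, and input are illustrative, not from the standard; production systems would use SHAP or LIME as the table notes.

```python
def score(weights, bias, x):
    """Linear scoring model standing in for the deployed model."""
    return bias + sum(w * xi for w, xi in zip(weights, x))

def attribute(weights, baseline, x):
    """Per-feature contribution for one prediction, relative to a baseline input."""
    return {i: w * (xi - bi)
            for i, (w, bi, xi) in enumerate(zip(weights, baseline, x))}

# Hypothetical three-feature scorer (values are illustrative only).
weights = [0.8, -0.5, 0.3]
bias = 0.1
baseline = [0.0, 0.0, 0.0]   # reference input, e.g. the population mean
x = [1.0, 2.0, 0.0]

contribs = attribute(weights, baseline, x)

# For a linear model the attribution is exact: baseline score plus the sum of
# contributions reconstructs the prediction.
assert abs(score(weights, bias, x)
           - (score(weights, bias, baseline) + sum(contribs.values()))) < 1e-9
```

The additivity check at the end is the property that makes such explanations auditable: a reviewer can verify that the stated contributions actually account for the score.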
Level 3 – Defined
| Category | Description |
| --- | --- |
| People & Culture | Explainability is a design criterion evaluated at architecture review; the team includes user-facing explanation design in product specifications |
| Process & Governance | A tiered explainability standard defines the minimum explanation type required per decision risk tier (feature attribution, counterfactual, natural language summary) |
| Technology & Tools | Explainability methods are integrated into the inference pipeline; explanations are generated at prediction time and stored alongside outputs |
| Measurement & Metrics | Explanation coverage rate (proportion of high-stakes predictions with a stored explanation) and user comprehension testing results are tracked |
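A minimal sketch of the Level 3 pattern, assuming a simple tier map and record shape (the tier names, explanation types, and field names are illustrative, not prescribed by the standard): the pipeline looks up the minimum explanation type for the decision's risk tier, generates the explanation at prediction time, and stores it alongside the output.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical tier map: minimum explanation type required per risk tier.
REQUIRED_EXPLANATION = {
    "low": None,                      # no explanation mandated
    "medium": "feature_attribution",
    "high": "counterfactual",
}

@dataclass
class DecisionRecord:
    """Prediction stored together with its explanation, as the pipeline requires."""
    prediction: float
    risk_tier: str
    explanation_type: Optional[str]
    explanation: Optional[dict]
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

def predict_with_explanation(model, explainer, x, risk_tier):
    """Generate the required explanation at prediction time, not as an afterthought."""
    required = REQUIRED_EXPLANATION[risk_tier]
    explanation = explainer(x) if required else None
    return DecisionRecord(model(x), risk_tier, required, explanation)

# Toy model and explainer standing in for the real inference pipeline.
model = lambda x: sum(x)
explainer = lambda x: {f"f{i}": v for i, v in enumerate(x)}

record = predict_with_explanation(model, explainer, [1.0, 2.0], "high")
assert record.explanation is not None
```

Storing the explanation in the same record as the prediction is what makes the coverage rate in the metrics row directly computable from the decision log.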
Level 4 – Quantitatively Managed
| Category | Description |
| --- | --- |
| People & Culture | User-tested explanation quality metrics are tracked; teams iterate on explanation design based on user comprehension and review efficiency data |
| Process & Governance | Explanation quality thresholds gate deployment for high-risk use cases; explanation fidelity and comprehensibility are assessed at model release |
| Technology & Tools | Explainability methods are selected based on their fidelity to the model's actual reasoning process, not just their interpretability to users |
| Measurement & Metrics | Explanation fidelity, user comprehension rate, reviewer decision efficiency improvement, and challenge rate from affected parties are measured per use case |
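One common way to operationalise the fidelity assessment in this table is agreement between the interpretable surrogate used for explanation and the underlying model on a held-out sample; the sketch below assumes that framing, and the toy classifiers, sample, and threshold are illustrative.

```python
def fidelity(model, surrogate, sample):
    """Fraction of sample points where the surrogate agrees with the model."""
    agree = sum(1 for x in sample if model(x) == surrogate(x))
    return agree / len(sample)

def release_gate(score, threshold):
    """Deployment gate: block release when explanation fidelity is below threshold."""
    return score >= threshold

# Toy binary classifiers standing in for the deployed model and its
# interpretable surrogate (an intentionally poor approximation here).
model = lambda x: int(x[0] + 0.5 * x[1] > 1.0)
surrogate = lambda x: int(x[0] > 0.8)

sample = [[0.0, 0.0], [1.0, 1.0], [0.9, 0.0], [0.5, 2.0]]
score = fidelity(model, surrogate, sample)

# With an illustrative 0.9 threshold, this surrogate would not pass release.
assert not release_gate(score, 0.9)
```

Measuring fidelity separately from user comprehension matters because an explanation can be easy to understand yet unfaithful to what the model actually computed.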
Level 5 – Optimising
| Category | Description |
| --- | --- |
| People & Culture | Explainability design is treated as a user experience discipline; affected community feedback informs explanation format and vocabulary |
| Process & Governance | Explainability standards are continuously updated based on regulatory developments, user research, and advances in interpretable AI |
| Technology & Tools | Inherently interpretable model architectures are preferred for high-risk use cases where performance allows; post-hoc methods are reserved for cases where complex models are necessary |
| Measurement & Metrics | Long-term tracking of challenge and appeal rates informs the adequacy of explanation provision; reduction in successful challenges indicates improving explanation quality |
Key Measures
- Percentage of high-stakes AI decisions with an associated stored explanation meeting the defined standard for that risk tier
- User comprehension rate for AI explanations measured through usability testing (proportion of users who can correctly interpret the explanation)
- Reviewer decision time improvement when AI explanations are provided versus withheld (efficiency metric)
- Number of successful regulatory or legal challenges where inadequate explanation was cited as a factor
- Explanation fidelity score (degree to which the explanation accurately represents the model's reasoning) per model release
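The first key measure above can be computed directly from a decision log, assuming each record carries a risk tier and an explanation field (both field names are illustrative):

```python
def coverage_rate(records):
    """Proportion of high-stakes decisions that have a stored explanation."""
    high_stakes = [r for r in records if r["risk_tier"] == "high"]
    if not high_stakes:
        return 1.0   # vacuously covered: no high-stakes decisions in the log
    covered = sum(1 for r in high_stakes if r.get("explanation") is not None)
    return covered / len(high_stakes)

# Illustrative decision log: one covered high-stakes record, one uncovered.
records = [
    {"risk_tier": "high", "explanation": {"top_feature": "income"}},
    {"risk_tier": "high", "explanation": None},
    {"risk_tier": "low", "explanation": None},
]
rate = coverage_rate(records)   # 1 of 2 high-stakes records is covered
```

Tracking this rate per risk tier, rather than as a single aggregate, keeps a flood of low-stakes decisions from masking gaps where explanation matters most.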