
Standard: AI systems provide explainable outputs for high-stakes decisions

Purpose and Strategic Importance

This standard requires that AI systems involved in high-stakes decisions (those with material consequences for individuals, business operations, or safety) produce outputs accompanied by an explanation that a human reviewer or affected party can understand and act upon. It supports the policy of designing for explainability, not just accuracy, by making explainability a functional requirement rather than an afterthought. A model with superior accuracy but no explainability is often less deployable in practice than a slightly less accurate model whose reasoning can be interrogated and defended.

Strategic Impact

  • Enables human reviewers to make informed decisions about whether to accept, modify, or override AI outputs
  • Supports regulatory compliance in jurisdictions that grant individuals the right to explanation for automated decisions
  • Builds user and stakeholder trust by making AI reasoning visible, challengeable, and auditable
  • Facilitates model debugging and improvement by exposing which features and patterns drive specific outcomes
  • Reduces the liability exposure of deploying AI in consequential contexts by creating an accountable reasoning record

Risks of Not Having This Standard

  • Human reviewers cannot meaningfully oversee AI decisions they cannot understand, reducing the value of the human-in-the-loop control
  • Regulatory challenges succeed when the organisation cannot explain the basis of an automated decision to regulators or courts
  • Users who are adversely affected by AI decisions cannot challenge them effectively without access to a clear explanation
  • Model debugging is slow and imprecise because feature importance and decision pathways are opaque
  • The organisation develops a culture of deferred accountability in which AI decisions are trusted without question because they cannot be interrogated

CMMI Maturity Model

Level 1 – Initial

  • People & Culture - Explainability is not considered a design requirement; models are evaluated purely on predictive performance
  • Process & Governance - No explainability requirement in the AI design or deployment process; black-box models are used without restriction
  • Technology & Tools - No explainability tooling; model reasoning is entirely opaque to operators and users
  • Measurement & Metrics - Explainability is not measured; there is no baseline for what level of explanation is available or usable

Level 2 – Managed

  • People & Culture - Teams identify high-stakes use cases where explainability is expected by users or regulators
  • Process & Governance - A requirement to provide feature-level explanation for high-stakes AI decisions is added to design standards
  • Technology & Tools - SHAP or LIME is applied post-hoc to generate feature importance explanations for individual predictions
  • Measurement & Metrics - Availability of explanations for high-stakes decisions is tracked; the team reviews a sample of explanations for intelligibility
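
As a concrete illustration of the Level 2 tooling step, the sketch below generates a post-hoc, model-agnostic feature attribution. It uses scikit-learn's permutation importance as a stand-in for SHAP or LIME; the model, dataset, and feature names are illustrative assumptions, not part of the standard.

```python
# Sketch: post-hoc feature attribution via permutation importance, a
# model-agnostic stand-in for SHAP/LIME. All data and names are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
feature_names = ["income", "debt_ratio", "tenure", "late_payments"]  # hypothetical

model = RandomForestClassifier(random_state=0).fit(X, y)

# How much does shuffling each feature degrade accuracy? Larger drop = more
# influential feature in the model's decisions.
result = permutation_importance(model, X, y, n_repeats=5, random_state=0)

explanation = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in explanation:
    print(f"{name}: {importance:+.3f}")
```

Permutation importance is a global, model-level attribution; per-prediction (local) attributions, as SHAP and LIME provide, would be the next step for explaining individual high-stakes decisions.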

Level 3 – Defined

  • People & Culture - Explainability is a design criterion evaluated at architecture review; the team includes user-facing explanation design in product specifications
  • Process & Governance - A tiered explainability standard defines the minimum explanation type required per decision risk tier (feature attribution, counterfactual, natural language summary)
  • Technology & Tools - Explainability methods are integrated into the inference pipeline; explanations are generated at prediction time and stored alongside outputs
  • Measurement & Metrics - Explanation coverage rate (proportion of high-stakes predictions with a stored explanation) and user comprehension testing results are tracked
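
The tiered requirement and prediction-time storage described above could be sketched as follows. The tier names, explanation-type labels, and record schema are hypothetical assumptions for illustration, not a prescribed taxonomy.

```python
# Hypothetical tiered explainability standard: each risk tier names the
# minimum explanation types that must accompany a stored prediction.
REQUIRED_EXPLANATIONS = {
    "low": set(),
    "medium": {"feature_attribution"},
    "high": {"feature_attribution", "counterfactual", "natural_language_summary"},
}

def check_explanation_record(risk_tier: str, record: dict) -> list:
    """Return the explanation types still missing for this prediction record."""
    required = REQUIRED_EXPLANATIONS[risk_tier]
    return sorted(required - set(record.get("explanations", {})))

# A prediction stored alongside its explanations at inference time.
record = {
    "prediction": "decline",
    "explanations": {
        "feature_attribution": {"debt_ratio": 0.42, "income": -0.31},
        "counterfactual": "Approved if debt_ratio were below 0.35",
    },
}

print(check_explanation_record("high", record))  # -> ['natural_language_summary']
```

A check like this can run as a gate in the inference pipeline, so that a high-tier prediction is never emitted without its mandated explanations.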

Level 4 – Quantitatively Managed

  • People & Culture - User-tested explanation quality metrics are tracked; teams iterate on explanation design based on user comprehension and review efficiency data
  • Process & Governance - Explanation quality thresholds gate deployment for high-risk use cases; explanation fidelity and comprehensibility are assessed at model release
  • Technology & Tools - Explainability methods are selected based on their fidelity to the model's actual reasoning process, not just their interpretability to users
  • Measurement & Metrics - Explanation fidelity, user comprehension rate, reviewer decision efficiency improvement, and challenge rate from affected parties are measured per use case
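
One common way to quantify explanation fidelity is surrogate agreement: measure how well an interpretable surrogate reproduces the black-box model's own outputs, rather than the ground truth. The sketch below illustrates this under assumed models and data; it is one possible fidelity metric, not the only one.

```python
# Sketch of an explanation fidelity score: fit an interpretable surrogate
# (linear model) to the black-box model's predictions and report R^2 against
# those predictions. Models and data are illustrative.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=400, n_features=5, noise=5.0, random_state=0)

black_box = GradientBoostingRegressor(random_state=0).fit(X, y)
bb_predictions = black_box.predict(X)

# Fidelity measures agreement with the black box's outputs, NOT with ground
# truth: a faithful explanation mirrors what the model actually does.
surrogate = LinearRegression().fit(X, bb_predictions)
fidelity = surrogate.score(X, bb_predictions)  # R^2 vs black-box outputs

print(f"surrogate fidelity (R^2 vs black-box outputs): {fidelity:.3f}")
```

A low fidelity score signals that the surrogate's "explanation" may misrepresent the model, which is exactly the failure mode the Level 4 deployment gate is meant to catch.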

Level 5 – Optimising

  • People & Culture - Explainability design is treated as a user experience discipline; affected community feedback informs explanation format and vocabulary
  • Process & Governance - Explainability standards are continuously updated based on regulatory developments, user research, and advances in interpretable AI
  • Technology & Tools - Inherently interpretable model architectures are preferred for high-risk use cases where performance allows; post-hoc methods are reserved for cases where complex models are necessary
  • Measurement & Metrics - Long-term tracking of challenge and appeal rates informs the adequacy of explanation provision; reduction in successful challenges indicates improving explanation quality
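
A minimal sketch of the "inherently interpretable where performance allows" preference: a shallow decision tree whose complete decision logic can be printed and audited, so the model itself is the explanation. The dataset and depth limit are illustrative.

```python
# Sketch: an inherently interpretable architecture. A depth-limited decision
# tree's entire logic can be exported as readable if/then rules, so no
# post-hoc explainer is needed. Dataset and depth are illustrative.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)

# Every decision path is a human-readable rule that can be audited directly.
rules = export_text(tree, feature_names=list(data.feature_names))
print(rules)
```

When such a model meets the accuracy bar for a high-risk use case, it avoids the fidelity question entirely: there is no gap between the model and its explanation.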

Key Measures

  • Percentage of high-stakes AI decisions with an associated stored explanation meeting the defined standard for that risk tier
  • User comprehension rate for AI explanations measured through usability testing (proportion of users who can correctly interpret the explanation)
  • Reviewer decision time improvement when AI explanations are provided versus withheld (efficiency metric)
  • Number of successful regulatory or legal challenges where inadequate explanation was cited as a factor
  • Explanation fidelity score (degree to which the explanation accurately represents the model's reasoning) per model release
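
The first measure, explanation coverage rate, can be computed directly from stored prediction records. The record fields below are assumptions for illustration, not a prescribed schema.

```python
# Sketch: explanation coverage rate = share of high-stakes predictions that
# were stored with an explanation. Records and field names are hypothetical.
records = [
    {"risk_tier": "high", "explanation": {"debt_ratio": 0.4}},
    {"risk_tier": "high", "explanation": None},
    {"risk_tier": "low", "explanation": None},
    {"risk_tier": "high", "explanation": {"income": -0.2}},
]

high_stakes = [r for r in records if r["risk_tier"] == "high"]
covered = [r for r in high_stakes if r["explanation"] is not None]
coverage_rate = len(covered) / len(high_stakes)

print(f"explanation coverage rate: {coverage_rate:.0%}")  # 2 of 3 high-stakes
```
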
Associated Policies

Associated Practices

  • Responsible AI Framework Adoption
  • AI Ethics Review Board
  • Model Explainability Techniques
  • Model Card Documentation
