
Policy: Ensure AI Decisions Are Reviewable by Humans

Commitment to Human Oversight of AI Decisions

The question of when AI decisions require human review is not primarily a technical question — it is an ethical, legal, and organisational one. It asks: who is accountable for this decision? What are the consequences if it is wrong? Can the affected person challenge it? Is the AI system reliable enough, and the stakes low enough, for automation to be appropriate? Our commitment is to answer these questions honestly for every AI system we build, and to design human review mechanisms that are genuinely effective — not checkbox processes that provide the appearance of oversight without the substance.

What This Means

Human-in-the-loop oversight means different things at different risk levels. For low-stakes, high-volume automation with strong safeguards, it may mean periodic sampling and review. For high-stakes decisions affecting individuals' access to services, opportunities, or resources, it means human sign-off on every decision. The right level of oversight is determined by the stakes, the system's demonstrated reliability, and the regulatory requirements — not by what is most convenient for throughput.
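The tiered-oversight idea above can be sketched in code. This is a minimal illustration only: the tier names, oversight levels, and the mapping between them are assumptions for the sketch, not values prescribed by this policy — each organisation would define its own tiers from its risk classification.

```python
from enum import Enum


class RiskTier(Enum):
    """Illustrative risk tiers, ordered by consequence of error."""
    LOW = "low"        # reversible, limited individual impact
    MEDIUM = "medium"  # material but correctable impact
    HIGH = "high"      # affects access to services, opportunities, or resources


# Assumed mapping: the stakes determine the oversight level,
# never the convenience of throughput.
OVERSIGHT_MODEL = {
    RiskTier.LOW: "periodic_sampling",    # e.g. review a random sample of decisions
    RiskTier.MEDIUM: "batch_review",      # a reviewer works through queued decisions
    RiskTier.HIGH: "individual_signoff",  # a human approves every single decision
}


def required_oversight(tier: RiskTier) -> str:
    """Return the oversight level a decision type requires for its risk tier."""
    return OVERSIGHT_MODEL[tier]
```

Encoding the mapping as data rather than scattered conditionals keeps the oversight policy auditable in one place, which is the point of a documented oversight model.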

Our commitment to ensuring AI decisions are reviewable by humans is built on:

  • Risk Tier Classification – Every AI decision type is classified by risk tier based on the potential consequences of errors: impact on individuals, reversibility, regulatory obligations, and organisational liability. Risk tier determines the required level of human oversight, from periodic sampling to mandatory individual review.
  • Review Mechanism Design – Human review mechanisms are designed to be genuinely effective, not performative. Reviewers are given sufficient context, time, and tools to make meaningful assessments — not presented with AI outputs in a format that makes rubber-stamping the path of least resistance.
  • Override Capability and Tracking – Human reviewers have the authority and tooling to override AI decisions. Override rates are tracked and analysed: low override rates on high-stakes decisions may indicate effective AI, or may indicate that reviewers feel unable or unwilling to challenge the system.
  • Escalation Pathways – AI systems have defined escalation pathways: conditions under which a case is automatically escalated to a more experienced human reviewer rather than processed by standard review. Escalation triggers are defined for edge cases, low-confidence outputs, and cases matching known failure patterns.
  • Review Queue Management – Human review workloads are actively managed to ensure that review is not a bottleneck that creates pressure to rubber-stamp decisions. Review capacity is provisioned based on expected volume and appropriate review time — not set arbitrarily and then treated as a throughput constraint.
  • Reviewer Competency – Human reviewers have the domain knowledge and AI literacy needed to assess AI decisions meaningfully. Reviewers who do not understand what the AI system is doing, or cannot interpret its outputs, cannot provide genuine oversight. Reviewer training is part of the system deployment.
  • Regulatory Compliance Mapping – For AI systems subject to specific regulatory requirements — financial services, healthcare, credit decisions, employment — human review mechanisms are designed to meet those requirements explicitly. Compliance is designed in, not assumed post-deployment.
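Two of the mechanisms above — escalation pathways and review-queue capacity — can be sketched as simple routing and sizing logic. Every name, threshold, and number here is an illustrative assumption (the confidence floor, the reviewer-minutes figure, the queue names), not a value taken from this policy; real triggers and capacity would be calibrated per system.

```python
import math
from dataclasses import dataclass


@dataclass
class Decision:
    """Attributes of an AI decision relevant to escalation routing."""
    confidence: float            # model's confidence in its own output
    is_edge_case: bool           # input falls outside the well-tested distribution
    matches_failure_pattern: bool  # resembles a known historical failure mode


CONFIDENCE_FLOOR = 0.80  # assumed threshold; calibrate per system and tier


def route(decision: Decision) -> str:
    """Route a decision to the standard queue or escalate to a senior reviewer.

    Escalation triggers: known failure patterns, edge cases, and
    low-confidence outputs all bypass standard review.
    """
    if decision.matches_failure_pattern or decision.is_edge_case:
        return "senior_review"
    if decision.confidence < CONFIDENCE_FLOOR:
        return "senior_review"
    return "standard_review"


def reviewers_needed(daily_volume: int, minutes_per_case: float,
                     reviewer_minutes_per_day: float = 300) -> int:
    """Size the review team from expected volume and appropriate review time,
    rather than fixing headcount first and squeezing review time to fit."""
    return math.ceil(daily_volume * minutes_per_case / reviewer_minutes_per_day)
```

For example, 400 decisions a day at six minutes of genuine review each implies eight reviewers at 300 review-minutes per day — provisioning from that arithmetic, instead of treating an arbitrary headcount as a throughput constraint, is what keeps rubber-stamping from becoming the path of least resistance.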

Why This Matters

Regulatory frameworks including the EU AI Act, financial services regulations, and sector-specific requirements are increasingly mandating human oversight for high-risk AI systems. Beyond regulatory compliance, human oversight is the mechanism by which we maintain meaningful accountability for decisions that affect people. An organisation that cannot point to a human decision-maker accountable for the outcomes of its AI systems is an organisation that has outsourced its accountability to a system that cannot hold it. Human review is not a constraint on AI efficiency — it is the governance structure that makes AI deployment legitimate.

Our Expectation

Every AI system with material consequences for individuals or the organisation has a documented human oversight model proportionate to its risk tier, with effective review mechanisms, defined escalation paths, and accountability structures that identify named humans responsible for decision outcomes. AI decisions that are not reviewable by humans are not deployable in high-stakes contexts. Ensuring human reviewability is how we keep AI Safer — and how we maintain the accountability that stakeholders, regulators, and the people we affect rightfully expect.
