
Policy: Treat AI Failure Modes as Known and Planned For

Commitment to Proactive AI Failure Planning

AI systems fail. Not occasionally, not unpredictably, not in ways that are fundamentally unknowable, but in ways that are largely predictable if teams invest the time to think carefully about failure before it occurs. The failure modes of machine learning systems are well documented: hallucination, distributional shift, adversarial vulnerability, feedback loop degradation, silent accuracy decay, and cascading errors through automated pipelines. Our commitment is to treat these failure modes as first-class design concerns: to identify them, plan for them, and build systems that fail gracefully rather than catastrophically.

What This Means

Planning for AI failure modes means conducting structured failure analysis before deployment, building detection mechanisms into production systems, and defining explicit responses for each identified failure scenario. It means accepting that AI systems will sometimes produce wrong, unexpected, or harmful outputs, and engineering for that reality rather than pretending it away. It means the question is not "will this system fail?" but "when it fails, what happens, and have we prepared for it?"
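One way to make failure planning concrete is to keep a machine-readable registry of identified failure modes, each with its type, severity, detection signal, and pre-agreed response. The sketch below is illustrative only: the class names, severity levels, and the example entry are assumptions for this page, not a mandated schema.

```python
from dataclasses import dataclass
from enum import Enum


class FailureType(Enum):
    ACCURACY_DEGRADATION = "accuracy_degradation"
    DISTRIBUTIONAL_SHIFT = "distributional_shift"
    ADVERSARIAL_INPUT = "adversarial_input"
    FEEDBACK_LOOP_CORRUPTION = "feedback_loop_corruption"
    DEPENDENCY_FAILURE = "dependency_failure"


class Severity(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class FailureMode:
    name: str
    failure_type: FailureType
    severity: Severity
    detection_signal: str   # the production metric that reveals this failure
    planned_response: str   # the action agreed before deployment, not improvised


# Hypothetical example entry, for illustration only.
registry = [
    FailureMode(
        name="Confidence collapse on out-of-distribution inputs",
        failure_type=FailureType.DISTRIBUTIONAL_SHIFT,
        severity=Severity.HIGH,
        detection_signal="mean prediction confidence drops sharply over a 1h window",
        planned_response="route affected requests to human review; notify on-call engineer",
    ),
]


def highest_priority(modes):
    """Order failure modes by severity so mitigation investment follows risk."""
    return sorted(modes, key=lambda m: m.severity.value, reverse=True)
```

A registry like this turns "we thought about failure" into an auditable artifact that reviews and incident responders can work from.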

Our commitment to treating AI failure modes as known and planned for is built on:

  • Pre-Deployment Failure Mode Analysis – Every AI system undergoes structured failure mode analysis before deployment. This draws on domain expertise, historical failure patterns in similar systems, adversarial thinking, and explicit consideration of what happens when the model encounters conditions outside its training distribution.
  • Categorised Failure Taxonomies – Failure modes are categorised by type (accuracy degradation, distributional shift, adversarial input, feedback loop corruption, dependency failure) and by severity (impact to users, business, or safety). This taxonomy drives prioritisation of detection and mitigation investment.
  • Graceful Degradation Design – AI systems are designed to degrade gracefully. When confidence is low, outputs are appropriately caveated or escalated to human review. Systems do not silently produce low-confidence outputs as if they were high-confidence ones.
  • Failure Detection Instrumentation – Production AI systems are instrumented to detect the onset of known failure modes: confidence distribution shifts, output anomalies, upstream data quality drops, and feedback signal degradation. Detection happens before user impact, not after.
  • Defined Incident Responses – For each major failure mode category, we define the response: who is notified, what action is taken, how quickly, and who has authority to take the system offline if required. Incident response for AI is planned in advance, not improvised under pressure.
  • Red Team and Adversarial Testing – Before deployment, AI systems are subjected to adversarial testing by people actively trying to break, mislead, or manipulate the system. Adversarial testing surfaces failure modes that standard evaluation misses.
  • Post-Incident Learning – When AI systems fail in production, we conduct blameless post-incident reviews that update our failure mode taxonomy, improve detection mechanisms, and inform the design of future systems. Failure is treated as learning, not liability.

Why This Matters

The organisations that suffer the most damaging AI incidents are not those that built the worst systems; they are those that built systems without honestly confronting how those systems could fail. Overconfidence in model performance, insufficient planning for edge cases, and the absence of graceful degradation mechanisms turn ordinary model limitations into major incidents. Treating failure as predictable and plannable is not pessimism. It is engineering discipline applied to an inherently probabilistic domain.

Our Expectation

Every AI system in production has documented, reviewed failure mode analysis and a defined incident response plan. Teams that deploy AI without this preparation are not being bold; they are being negligent. Knowing how our systems can fail, and planning for it, is how we build AI that is genuinely Better: robust, trustworthy, and worthy of the confidence we place in it.
