
Policy: Make AI Behaviour Auditable and Traceable

Commitment to AI Auditability and Traceability

Every consequential decision an AI system makes or informs should be traceable — to the input that produced it, the model version that processed it, the data that model was trained on, and the output that resulted. Without this traceability, AI systems are black boxes: they produce outputs that affect people, and when those outputs are questioned, investigated, or challenged, there is no record to examine. Our commitment is to build AI systems with auditability as a first-class design requirement — logging AI behaviour in a way that supports accountability, enables investigation, and meets the regulatory expectations that are increasingly codified in law.

What This Means

Auditability means maintaining a durable, structured record of AI system behaviour that is sufficient for post-hoc investigation of specific decisions, aggregate analysis of system performance, and demonstration of compliance with applicable regulatory requirements. It means designing logging infrastructure as part of the AI system architecture, not adding it later. And it means ensuring audit logs are protected, retained for appropriate periods, and accessible to the people who need them — while being protected from the people who should not have access.
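A "durable, structured record" of each inference can be made concrete as a fixed schema captured at request time. The sketch below is illustrative, not a prescribed format — the field names (`request_id`, `model_version`, and so on) are assumptions chosen to mirror the elements this policy requires, and deterministic serialisation is included so records can later be hashed or compared:

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)
class AuditRecord:
    """One inference event, captured at request time (field names are illustrative)."""
    request_id: str     # correlates the record with the originating request
    timestamp: str      # ISO 8601, UTC
    model_version: str  # resolves to an entry in the model registry
    inputs: dict        # features/data provided to the model
    output: str         # the decision or recommendation produced
    confidence: float   # probability associated with the output

    def to_json(self) -> str:
        # Serialise with sorted keys so the same record always
        # yields the same bytes (needed for hashing and comparison).
        return json.dumps(asdict(self), sort_keys=True)

# Hypothetical example record for a decision-support system.
record = AuditRecord(
    request_id="req-001",
    timestamp=datetime.now(timezone.utc).isoformat(),
    model_version="credit-risk-model:3.2.1",
    inputs={"income": 52000, "tenure_months": 18},
    output="refer_to_human",
    confidence=0.71,
)
print(record.to_json())
```

Freezing the dataclass makes individual records immutable in memory; tamper-resistance of the stored log is a separate concern, addressed under Audit Log Integrity below.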

Our commitment to making AI behaviour auditable and traceable is built on:

  • Input-Output Logging – Every AI inference request and response is logged with sufficient detail to reconstruct what happened: the input features or data provided to the model, the model version that processed the request, the output produced, the confidence or probability associated with the output, and the timestamp.
  • Model Version Traceability – AI decisions are linked to the specific model version that made them. Model versions are tracked in a model registry with full lineage — training data version, hyperparameters, evaluation results, and deployment history. When a decision is investigated, the exact model responsible can be identified.
  • Decision Context Preservation – For AI systems that support human decisions, the AI output presented to the human and the human's subsequent action are both logged. This captures the full decision context and allows analysis of how AI recommendations influenced human behaviour.
  • Audit Log Integrity – Audit logs are protected against modification and deletion. Immutable log storage, cryptographic integrity verification, and access controls on log management operations ensure that audit records can be trusted as accurate historical records.
  • Retention Policies Aligned to Risk – Log retention periods are defined based on the risk profile of the AI system, the regulatory requirements applicable to its domain, and the realistic investigation horizon for decisions made by the system. Higher-stakes systems retain logs for longer periods.
  • Audit Access Provisioning – Defined roles have access to AI audit logs: compliance teams, incident investigators, authorised regulators, and named system owners. Access is controlled, logged, and proportionate to the purpose. Audit access is not the same as operational access.
  • Audit Readiness Testing – AI systems are periodically tested for audit readiness: can a specific past decision be retrieved? Can the model version responsible be identified? Can the full decision context be reconstructed? Audit readiness that only exists in theory does not meet the standard.
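The Audit Log Integrity commitment above — protection against modification and deletion, with cryptographic verification — is often implemented by chaining entries so each embeds the hash of its predecessor. This is a minimal sketch of that idea, not a production design (a real deployment would use append-only storage and externally anchored checkpoints):

```python
import hashlib
import json

class ChainedAuditLog:
    """Append-only log where each entry embeds the hash of its predecessor,
    so any retroactive modification breaks the chain (a simplified sketch)."""

    def __init__(self):
        self.entries = []

    def append(self, payload: dict) -> None:
        # Link this entry to the previous one by hashing over both.
        prev_hash = self.entries[-1]["hash"] if self.entries else "genesis"
        body = json.dumps(payload, sort_keys=True)
        entry_hash = hashlib.sha256((prev_hash + body).encode()).hexdigest()
        self.entries.append({"payload": payload, "prev": prev_hash, "hash": entry_hash})

    def verify(self) -> bool:
        # Recompute every hash from the start; any edit to an earlier
        # payload invalidates that entry and all entries after it.
        prev_hash = "genesis"
        for entry in self.entries:
            body = json.dumps(entry["payload"], sort_keys=True)
            expected = hashlib.sha256((prev_hash + body).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True

log = ChainedAuditLog()
log.append({"request_id": "req-001", "output": "approve"})
log.append({"request_id": "req-002", "output": "refer_to_human"})
print(log.verify())  # True: chain intact
log.entries[0]["payload"]["output"] = "deny"  # tamper with history
print(log.verify())  # False: tampering detected
```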
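Retention Policies Aligned to Risk can be expressed as a simple tier-to-period mapping enforced before any purge. The tiers and periods below are placeholders — the actual values must come from the regulatory requirements applicable to each system's domain:

```python
from datetime import timedelta

# Hypothetical mapping of risk tier to retention window
# (values illustrative; real periods are set by applicable regulation).
RETENTION_BY_TIER = {
    "high":   timedelta(days=365 * 7),  # e.g. consequential decisions
    "medium": timedelta(days=365 * 3),
    "low":    timedelta(days=365),
}

def is_expired(record_age: timedelta, tier: str) -> bool:
    """A record may be purged only after its tier's retention window has passed."""
    return record_age > RETENTION_BY_TIER[tier]

print(is_expired(timedelta(days=400), "low"))   # True: past the one-year window
print(is_expired(timedelta(days=400), "high"))  # False: must be retained
```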

Why This Matters

Regulators across multiple jurisdictions are moving toward mandatory auditability requirements for consequential AI systems. Beyond regulatory compliance, auditability is the foundation of accountability — the mechanism by which the organisation can demonstrate that its AI systems behaved appropriately, and investigate and remedy cases where they did not. AI systems that cannot be audited are AI systems that cannot be trusted with decisions that matter. The ability to look back at what an AI system did, and why, is not a nice-to-have — it is a governance requirement for any system deployed at scale.

Our Expectation

Every AI system that makes or informs material decisions has documented, tested audit logging that meets the retention and accessibility requirements of its risk tier. Teams that deploy AI systems without audit logging are not making architectural trade-offs — they are building systems that cannot be governed. Making AI behaviour auditable and traceable is how we ensure our AI systems are Safer and worthy of the trust placed in them.

Associated Standards
