
Practice: Model Card Documentation

Purpose and Strategic Importance

A model card is a concise, structured document that communicates what a model does, how it was trained, what it is and is not suited for, and what risks are associated with its use. Originally proposed by Google researchers, model cards have become a de facto standard for responsible AI transparency because they make the implicit explicit — forcing teams to articulate the assumptions, limitations, and potential harms that are often known but rarely written down.

Model cards serve multiple audiences simultaneously. Engineers use them to understand and maintain systems they did not build. Product managers use them to make informed decisions about deployment scope. Legal and compliance teams use them as evidence of due diligence. Downstream users and partners use them to understand whether a model is appropriate for their context. Regulators increasingly expect them as part of AI system documentation. A well-maintained model card is not overhead — it is essential infrastructure for operating AI systems responsibly.


Description of the Practice

  • Creates a structured model card for every AI system before deployment, covering model purpose, training data, evaluation methodology, performance metrics, intended use, out-of-scope uses, and known limitations.
  • Documents disaggregated performance metrics — model performance broken down by demographic subgroups, use context, and edge cases — not just aggregate accuracy statistics.
  • Maintains model cards as living documents, updated at every significant model change, retraining, or discovery of new limitations or risks.
  • Makes model cards accessible to all relevant stakeholders — internal teams, downstream users, and governance functions — not locked in a data science team's internal documentation system.
  • Includes explicit statements about what the model should not be used for, providing actionable guidance rather than vague disclaimers.
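The disaggregated-metrics point above can be made concrete with a small sketch. This is an illustrative helper, not part of any specific evaluation library; the record layout and subgroup names are assumptions for the example.

```python
from collections import defaultdict

def disaggregated_accuracy(records):
    """Accuracy per subgroup, alongside the aggregate figure.

    `records` is a list of (subgroup, predicted, actual) tuples;
    the field layout is illustrative, not from any particular tool.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        correct[group] += int(predicted == actual)
    per_group = {g: correct[g] / totals[g] for g in totals}
    aggregate = sum(correct.values()) / sum(totals.values())
    return aggregate, per_group

# Hypothetical evaluation records for two demographic subgroups.
records = [
    ("under_30", 1, 1), ("under_30", 0, 0), ("under_30", 1, 1),
    ("over_60", 1, 0), ("over_60", 0, 0),
]
aggregate, per_group = disaggregated_accuracy(records)
# aggregate = 0.8, but per_group shows under_30 = 1.0 vs over_60 = 0.5
```

The aggregate figure here looks respectable while masking a subgroup on which the model errs half the time, which is exactly why the card should report both.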

How to Practise It (Playbook)

1. Getting Started

  • Adopt or adapt an existing model card template — Google's original template, Hugging Face's model card format, or a sector-specific variant — to establish a consistent structure for your organisation.
  • Create a model card for your most important production model first, treating the exercise as a discovery process that surfaces limitations and assumptions that may not have been formally documented before.
  • Define who is responsible for authoring, reviewing, and maintaining model cards — typically the team that builds and owns the model — with clear accountability for keeping them current.
  • Establish model card completion as a mandatory gate in the deployment approval process, so that no model can be deployed without one.
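One way to make the template and the deployment gate concrete is to encode the card's sections as a structured type and check completeness before approval. The section names below mirror the list earlier in this page; they are a sketch, not an official template, and should be adapted to whichever format your organisation adopts.

```python
from dataclasses import dataclass, fields

@dataclass
class ModelCard:
    # Section names follow this practice's list; adapt to your template.
    model_name: str = ""
    purpose: str = ""
    training_data: str = ""
    evaluation_methodology: str = ""
    performance_metrics: str = ""
    intended_use: str = ""
    out_of_scope_uses: str = ""
    known_limitations: str = ""

def missing_sections(card: ModelCard) -> list:
    """Return section names left empty -- a simple deployment-gate check."""
    return [f.name for f in fields(card) if not getattr(card, f.name).strip()]

card = ModelCard(model_name="churn-v3", purpose="Predict customer churn")
incomplete = missing_sections(card)  # sections still to author before deployment
```

A CI step that fails the release when `missing_sections` is non-empty gives the mandatory gate some teeth, while the honest content of each section remains a human responsibility.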

2. Scaling and Maturing

  • Integrate model card templates into your MLOps tooling so that common fields — training data version, evaluation metrics, code version — are populated automatically from experiment tracking data.
  • Develop a model card review checklist that governance or peer reviewers use to verify completeness, accuracy, and honest treatment of limitations and risks.
  • Build a model card registry that provides a searchable index of all production models and their documentation, enabling portfolio-level visibility and supporting audit requirements.
  • Use model card data to drive portfolio governance conversations — comparing performance, risk, and use case coverage across the organisation's AI systems.
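The automation step above can be sketched as a merge between a hand-authored template and an experiment-tracking record. The `run` keys here (`data_version`, `metrics`, `git_commit`) are illustrative assumptions; map them to whatever fields your tracking tool actually records.

```python
def populate_card(template, run):
    """Fill the automatable model-card fields from a tracking record,
    leaving hand-authored sections (intended use, limitations) untouched."""
    card = dict(template)
    card["training_data_version"] = run["data_version"]
    card["evaluation_metrics"] = run["metrics"]
    card["code_version"] = run["git_commit"]
    return card

# Hand-authored sections start empty; automated fields come from the run.
template = {"intended_use": "", "known_limitations": ""}
run = {
    "data_version": "2024-06-01",
    "metrics": {"accuracy": 0.91},
    "git_commit": "a1b2c3d",
}
card = populate_card(template, run)
```

Keeping the automated and hand-authored fields in one document means the registry can index both, while reviewers focus their attention on the sections a pipeline cannot write.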

3. Team Behaviours to Encourage

  • Write model cards for an audience that does not have deep ML expertise — if only a data scientist can understand the card, it is not serving its governance purpose.
  • Be honest about limitations and failure modes — a model card that only presents a model's strengths is a marketing document, not a governance artefact.
  • Update model cards promptly when new limitations are discovered, rather than waiting for the next scheduled review — users and stakeholders need current information.
  • Review model cards as part of model handover processes when teams change, treating them as the primary documentation for understanding a system you did not build.

4. Watch Out For…

  • Model cards that are completed perfunctorily to satisfy a process requirement but contain no substantive information about limitations, risks, or appropriate use.
  • Aggregate performance metrics that look impressive but mask poor performance on important subgroups — model cards must include disaggregated evaluation results.
  • Cards that become outdated after the initial release and are never updated to reflect model changes, retraining, or newly discovered issues.
  • Using model cards as the only governance mechanism for high-risk AI systems — they are a necessary but not sufficient safeguard for systems that affect people significantly.

5. Signals of Success

  • Every production AI system has a current, complete model card that was authored by the team that built it and reviewed by governance or peer reviewers.
  • Model cards include honest, specific descriptions of limitations and failure modes — not vague disclaimers that provide no actionable guidance.
  • Downstream users and teams report that model cards help them make informed decisions about how and whether to use models in their own applications.
  • Model card updates are tracked alongside model versions, with a clear history of how documentation has evolved as the model changed.
  • Regulators or auditors reviewing AI systems can access model cards as part of the compliance evidence base without additional effort from the engineering team.

Associated Standards
  • AI systems provide explainable outputs for high-stakes decisions
  • AI governance frameworks are documented and followed across the lifecycle
  • Bias and fairness assessments are conducted at every model release
