Model Registry Management | Engineering Practice

Practice: Model Registry Management

Purpose and Strategic Importance

A model registry is the central catalogue of all model versions, their metadata, and their deployment status across environments. Without one, teams lose track of which model version is in production, cannot reliably retrieve past versions for investigation or rollback, and have no single source of truth for the state of their AI portfolio. The model registry is to ML systems what a package registry is to software development — a foundational governance layer that makes versioning, deployment, and auditing tractable.

Model registries also support organisational governance by providing the visibility that compliance, risk management, and leadership functions need. When regulators ask "what version of the model made this decision?", the answer should be retrievable from the registry in seconds, without an investigation across multiple systems or reliance on what individual engineers happen to remember.

Description of the Practice

Maintains a centralised registry of all model versions with standardised metadata: training data version, training code version, evaluation metrics, deployment environment, and lifecycle stage.
Implements lifecycle stage management that tracks model versions through defined stages — staging, production, archived — with governance checkpoints for stage transitions.
Links model versions in the registry to associated artefacts: model files, training configurations, evaluation reports, and model cards.
Provides APIs and a UI that let teams query, compare, and retrieve model versions both programmatically and interactively, so that automation and human oversight are supported equally.
Enforces registry hygiene through regular archiving of superseded model versions and retention policies that balance storage costs with audit requirements.
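
The metadata fields and lifecycle stages listed above can be sketched as a small data model. This is an illustrative sketch only, not the schema of any particular registry product: the class names, the transition table, and the rule that production promotion needs a named approver are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Stage(Enum):
    STAGING = "staging"
    PRODUCTION = "production"
    ARCHIVED = "archived"


# Assumed transition policy for illustration; adapt to your own governance rules.
ALLOWED_TRANSITIONS = {
    Stage.STAGING: {Stage.PRODUCTION, Stage.ARCHIVED},
    Stage.PRODUCTION: {Stage.ARCHIVED},
    Stage.ARCHIVED: set(),
}


@dataclass
class ModelVersion:
    """One registry entry carrying the standardised metadata (hypothetical schema)."""
    name: str
    version: int
    training_data_version: str
    training_code_version: str
    evaluation_metrics: dict
    deployment_environment: str
    stage: Stage = Stage.STAGING
    artefacts: dict = field(default_factory=dict)  # e.g. {"model_card": "<uri>"}
    history: list = field(default_factory=list)    # (from_stage, to_stage, approver)

    def transition(self, target: Stage, approved_by: Optional[str] = None) -> None:
        """Move to a new stage, enforcing the governance checkpoint."""
        if target not in ALLOWED_TRANSITIONS[self.stage]:
            raise ValueError(
                f"transition {self.stage.value} -> {target.value} is not allowed"
            )
        # Assumed checkpoint: production promotion requires a named approver.
        if target is Stage.PRODUCTION and not approved_by:
            raise PermissionError("promotion to production requires an approver")
        self.history.append((self.stage, target, approved_by))
        self.stage = target
```

Recording every transition in `history` is what later makes the deployment record auditable; a real registry would persist this rather than hold it in memory.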

How to Practise It (Playbook)

1. Getting Started

Deploy a model registry — MLflow Model Registry, Amazon SageMaker Model Registry, Vertex AI Model Registry, or a custom solution — appropriate to your infrastructure and scale.
Define the standard metadata that all model registrations must include, aligning metadata requirements with your governance and operational needs.
Register all current production models in the registry as a starting point, establishing the baseline catalogue and building team familiarity with the tooling.
Integrate registry promotion (from staging to production) into the deployment pipeline approval process, making the registry the authoritative record of deployment decisions.
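
One way to make the standard metadata enforceable from day one is to reject registrations that omit it. The sketch below assumes a plain dictionary stands in for the registry, and that the required keys mirror the metadata named in the practice description; `REQUIRED_METADATA`, `missing_metadata`, and `register` are hypothetical names, not part of MLflow, SageMaker, or Vertex AI.

```python
# Hypothetical required keys, taken from the metadata listed in this practice.
REQUIRED_METADATA = (
    "training_data_version",
    "training_code_version",
    "evaluation_metrics",
    "deployment_environment",
    "lifecycle_stage",
)


def missing_metadata(metadata: dict) -> list:
    """Return the required keys that are absent or empty, in declaration order."""
    return [
        key for key in REQUIRED_METADATA
        if metadata.get(key) in (None, "", {}, [])
    ]


def register(registry: dict, name: str, version: int, metadata: dict) -> None:
    """Add a model version to the in-memory registry, refusing incomplete metadata."""
    missing = missing_metadata(metadata)
    if missing:
        raise ValueError(
            f"cannot register {name} v{version}: missing {', '.join(missing)}"
        )
    # Store a copy so later edits to the caller's dict do not alter the record.
    registry[(name, version)] = dict(metadata)
```

With a hosted registry, the same check would typically run in the training or deployment pipeline just before the registration call.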

2. Scaling and Maturing

Build automated population of registry metadata from training pipelines, so that model registration requires no manual steps beyond the training run itself.
Implement access controls on registry operations — particularly production promotion and archival — that enforce governance requirements without creating unnecessary friction.
Develop registry dashboards that give leadership and governance functions visibility into the AI portfolio: which models are in production, when each was last updated, and whether each is still within its review period.
Integrate the registry with monitoring systems so that production performance metrics are associated with specific model versions, enabling analysis of how performance changes across versions.
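
A dashboard of this kind is, at its core, a query over registry entries. As a minimal sketch, assuming each entry is a dictionary with hypothetical `name`, `version`, `stage`, and `review_due` fields, the "past review date" view could be computed like this:

```python
from datetime import date


def overdue_for_review(entries: list, today: date) -> list:
    """Return (name, version) pairs for production models past their review date.

    Each entry is assumed to look like:
    {"name": str, "version": int, "stage": str, "review_due": date}
    """
    return sorted(
        (entry["name"], entry["version"])
        for entry in entries
        if entry["stage"] == "production" and entry["review_due"] < today
    )
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the report reproducible for a given audit date.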

3. Team Behaviours to Encourage

Register every model version that reaches evaluation, not just those that are deployed — the registry is the team's institutional memory of model development, not just a production deployment log.
Keep registry metadata current and accurate — a registry with outdated or missing metadata is worse than no registry, because it creates false confidence.
Use the registry as the first port of call when investigating production issues — the ability to identify exactly which model version was serving at the time of an incident is a core diagnostic capability.
Review the registry periodically to archive superseded versions and ensure the active model catalogue reflects the current operational reality.
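
The periodic review can be partly automated by listing archive candidates for a human to confirm. The sketch below assumes a deliberately simple policy: for a single model, any version that is not yet archived and has a lower version number than the highest production version is treated as superseded. Both the policy and the field names are assumptions; real retention rules may need to keep specific versions for audit purposes.

```python
def archive_candidates(versions: list) -> list:
    """Return version numbers of one model that look superseded.

    Each item is assumed to look like {"version": int, "stage": str}.
    Returns an empty list when nothing is in production, since there is
    then no reference point for deciding what has been superseded.
    """
    in_production = [v["version"] for v in versions if v["stage"] == "production"]
    if not in_production:
        return []
    current = max(in_production)
    return sorted(
        v["version"]
        for v in versions
        if v["stage"] != "archived" and v["version"] < current
    )
```

The function only proposes candidates; the archival itself should still go through whatever approval the registry's access controls require.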

4. Watch Out For…

Registries that store model artefacts but not the training provenance information needed to reproduce them, providing partial rather than complete governance coverage.
Stage management processes that are bypassed under delivery pressure, with models promoted directly to production without going through staging review.
Registry technical debt (large numbers of unarchived model versions, stale metadata, and broken artefact links) that accumulates over time and undermines the registry's utility.
Treating the registry as an MLOps tool for data scientists rather than a governance system that belongs to the whole organisation, limiting its visibility and utility.
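
Several of these failure modes, in particular bypassed promotions and stale metadata, show up as drift between what the registry records and what is actually deployed. A minimal drift check might look like the sketch below. It assumes you can obtain two mappings from model name to version, one from the registry and one from the serving platform; how those mappings are produced is outside the sketch, and `audit_production_state` is a hypothetical name.

```python
def audit_production_state(registry_view: dict, deployed_view: dict) -> dict:
    """Compare registry and deployment records, returning only the mismatches.

    Both arguments map model name -> production version. A value of None in
    the result means that side has no record of the model at all.
    """
    issues = {}
    for name in sorted(set(registry_view) | set(deployed_view)):
        recorded = registry_view.get(name)
        serving = deployed_view.get(name)
        if recorded != serving:
            issues[name] = {"registry": recorded, "deployed": serving}
    return issues
```

An empty result means the two views agree; any entry is a prompt for investigation rather than proof of a policy breach.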

5. Signals of Success

Any engineer on the team can retrieve the exact model version that is currently serving in production, and identify the version that was serving at any past point in time, in under a minute.
Registry metadata for production models is complete, current, and accurate — verified by periodic audits against actual deployment state.
All production model promotions are recorded in the registry with associated governance approvals, creating an auditable deployment history.
When a production incident occurs, the team can identify the relevant model version and retrieve its full artefacts and metadata from the registry without manual investigation.
The registry is used by governance and leadership functions to maintain oversight of the AI portfolio, not only by technical teams as a deployment tool.
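
The first signal, identifying the version that was serving at any past point in time, reduces to a lookup over the promotion history. As a sketch, assuming the history is available as a list of `(promoted_at, version)` pairs sorted by time (a hypothetical shape, not a specific registry's API), a binary search answers the question directly:

```python
from bisect import bisect_right
from datetime import datetime
from typing import Optional


def version_serving_at(promotions: list, when: datetime) -> Optional[int]:
    """Return the version in production at `when`, or None if none was yet.

    `promotions` is assumed to be a list of (promoted_at, version) tuples,
    sorted ascending by promoted_at, where each promotion replaces the
    previous production version.
    """
    times = [promoted_at for promoted_at, _ in promotions]
    # Index of the first promotion strictly after `when`; the one before it
    # is the version that was serving at that moment.
    index = bisect_right(times, when)
    if index == 0:
        return None
    return promotions[index - 1][1]
```

This only works if every production promotion is recorded in the registry with a timestamp, which is why promotions that bypass the registry undermine incident investigation.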