
Technical Debt Management

The debt you know about is manageable. The debt you do not know about is dangerous.

Technical debt is not a failure. It is a financial metaphor for the cost of previous decisions. The problem is not having debt - it is having debt you cannot see, cannot prioritise, and cannot get investment to address. This page covers how to manage it deliberately.

Ward Cunningham's Original Metaphor

Ward Cunningham coined the technical debt metaphor in 1992. His original formulation is worth revisiting because it has been distorted in the decades since.

Cunningham described debt as the cumulative cost of choices made in order to ship quickly. Just as financial debt allows you to spend money now and pay it back with interest later, technical debt allows you to ship software now and pay the cost of shortcuts taken with interest later - in the form of slower future development, more bugs, and harder maintenance.

The key insight in Cunningham's original metaphor is that taking on debt is sometimes the right decision. A startup that ships a minimal viable product with known compromises is making a rational trade-off: speed of learning now in exchange for rework later. The debt is deliberate, understood, and the return on it is clear.

What Cunningham was not describing - and what the metaphor now commonly covers - is the accidental accumulation of shortcuts taken without awareness, compromises made without intention, or simply the inevitable ageing of systems as the context around them changes. This inadvertent debt is harder to manage because it is often invisible until it becomes a crisis.

A Taxonomy of Technical Debt

Not all technical debt is the same. Martin Fowler's Technical Debt Quadrant, which builds on Cunningham's metaphor, distinguishes debt along two axes.

Deliberate vs inadvertent. Deliberate debt is taken on knowingly. The team said "we know this is not the right approach but we need to ship." Inadvertent debt accumulates without intention - from poor practices, insufficient expertise, or simply the passage of time making previous good decisions obsolete.

Reckless vs prudent. Reckless debt comes from shortcuts taken without understanding the consequences. Prudent debt comes from informed decisions to accept a trade-off.

The combination gives four quadrants:

  • Deliberate and prudent - "we'll ship with this approach now because the learning is more valuable than the clean solution, and we have a plan to address it"
  • Deliberate and reckless - "we do not have time to do this properly" - no plan, no understanding of consequences
  • Inadvertent and prudent - "we now understand the better approach that we did not know at the time" - this is simply how software development works
  • Inadvertent and reckless - "what layering?" - this is the most dangerous category

The practical value of this taxonomy is that it helps you identify what type of debt you have, which informs how to address it. Deliberate prudent debt has a known plan. Inadvertent prudent debt needs a discovery phase before a plan. Reckless debt of either type is a cultural and practice problem as well as a technical one.

Building a Debt Inventory

You cannot manage what you cannot see. A debt inventory is a structured catalogue of known technical debt in your systems.

A debt inventory entry should capture:

  • What it is - a specific, concrete description of the debt. Not "legacy code" but "the payment processing service uses a direct database connection pattern with no connection pooling, causing intermittent timeout errors under load."
  • Where it is - the specific system, component, or area
  • Why it exists - the history behind it, if known
  • What it costs - the current impact on development speed, reliability, or maintenance burden
  • What it would take to address - a rough estimate of the work involved
  • Priority - relative to other debt items
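One way to hold these fields is a small record type. The sketch below is only an illustration: the names `DebtItem`, `Priority`, and `sorted_by_priority`, the three-level priority scale, the component names, and the second inventory entry are all assumptions made for the example, not a prescribed schema.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """Relative priority; a lower value sorts first (hypothetical scale)."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass(frozen=True)
class DebtItem:
    what: str           # specific, concrete description of the debt
    where: str          # system, component, or area
    why: str            # history behind it, or "unknown"
    cost: str           # current impact on speed, reliability, or maintenance
    effort_days: float  # rough estimate of the work to address it
    priority: Priority  # relative to other debt items


def sorted_by_priority(items: list[DebtItem]) -> list[DebtItem]:
    """Highest priority first; within a priority level, cheapest first."""
    return sorted(items, key=lambda item: (item.priority, item.effort_days))


# Illustrative entries only; component names and estimates are made up.
inventory = [
    DebtItem(
        what="Reporting module has no automated tests",
        where="reporting",
        why="Shipped under deadline",
        cost="Every change needs manual regression checks",
        effort_days=8,
        priority=Priority.MEDIUM,
    ),
    DebtItem(
        what="Payment processing service uses direct database connections "
             "with no connection pooling",
        where="payment-service / database access layer",
        why="unknown",
        cost="Intermittent timeout errors under load",
        effort_days=5,
        priority=Priority.HIGH,
    ),
]

for item in sorted_by_priority(inventory):
    print(f"[{item.priority.name}] {item.where}: {item.what}")
```

A spreadsheet or issue tracker with the same columns serves equally well; what matters is that every entry answers all six questions.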

Building the initial inventory requires time set aside for discovery. Engineers on the team know where the debt is - the question is getting it out of their heads and into a format that can be managed.

Regular Debt Discovery

Schedule time for engineers to surface new debt items. Bug bashes, architecture reviews, and quarterly debt workshops are all mechanisms for identifying debt that accumulates between explicit discovery exercises.

The important norm to establish is that surfacing debt is not a failure to be ashamed of - it is a service to the organisation. Engineers who feel they will be criticised for identifying debt will stop identifying it. The debt will still exist; it will just be invisible.

Making the Business Case for Debt Reduction

Technical debt is invisible to most stakeholders. They see its effects - slower feature delivery, more frequent bugs, higher operational costs - but rarely connect those effects to the underlying cause.

The engineering leader's job is to make the connection visible and to quantify it in terms stakeholders can act on.

The Developer Productivity Cost

Measure how much of your team's time is spent dealing with the consequences of technical debt - workarounds, bug fixes in legacy code, and lengthy investigations needed because the system is hard to understand. Even rough measurement often reveals that a significant proportion of capacity is consumed by debt service.

If 30% of your engineering capacity is spent working around existing problems, then addressing those problems may do more for your delivery capacity than hiring additional engineers. On a ten-person team, 30% is the equivalent of three engineers.
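The arithmetic behind that comparison can be sketched in a few lines. The function names and the ten-person team are assumptions for illustration; `debt_share` would come from whatever rough measurement you have (time tracking, ticket tagging, or a team survey).

```python
def capacity_lost_to_debt(team_size: int, debt_share: float) -> float:
    """Engineer-equivalents currently consumed by working around debt."""
    if not 0.0 <= debt_share <= 1.0:
        raise ValueError("debt_share must be between 0 and 1")
    return team_size * debt_share


def capacity_recovered(team_size: int, debt_share: float, reduction: float) -> float:
    """Engineer-equivalents freed if the debt share shrinks by `reduction` (0 to 1)."""
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return capacity_lost_to_debt(team_size, debt_share) * reduction


# Hypothetical team of 10 with 30% of capacity going to debt service.
lost = capacity_lost_to_debt(team_size=10, debt_share=0.30)
freed = capacity_recovered(team_size=10, debt_share=0.30, reduction=0.5)
print(f"Lost to debt: {lost:.1f} engineer-equivalents")
print(f"Freed by halving the debt share: {freed:.1f} engineer-equivalents")
```

The model is deliberately crude: it ignores the cost of the remediation work itself and the onboarding time a new hire would need, both of which belong in a fuller comparison.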

The Reliability Cost

Debt that contributes to service reliability problems has a direct business cost. Calculate the cost of incidents that have roots in technical debt - downtime, customer impact, incident response time. This is often the most compelling business case because it translates directly to customer experience and revenue.
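A rough incident cost can be totalled from post-incident review data. In the sketch below the `Incident` fields, the `debt_related` flag, and the two rates are all assumptions; substitute whatever your incident process records and whatever cost figures finance will stand behind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incident:
    downtime_minutes: float
    responder_hours: float  # total person-hours spent on response
    debt_related: bool      # judged in the post-incident review


def debt_incident_cost(
    incidents: list[Incident],
    revenue_per_minute: float,
    responder_hourly_rate: float,
) -> float:
    """Sum downtime and response cost over incidents rooted in technical debt."""
    total = 0.0
    for incident in incidents:
        if not incident.debt_related:
            continue  # only count incidents attributed to debt
        total += incident.downtime_minutes * revenue_per_minute
        total += incident.responder_hours * responder_hourly_rate
    return total


# Illustrative figures only.
last_quarter = [
    Incident(downtime_minutes=30, responder_hours=6, debt_related=True),
    Incident(downtime_minutes=120, responder_hours=10, debt_related=False),
    Incident(downtime_minutes=10, responder_hours=2, debt_related=True),
]
cost = debt_incident_cost(last_quarter, revenue_per_minute=100.0, responder_hourly_rate=80.0)
print(f"Debt-related incident cost last quarter: {cost:,.0f}")
```

This counts only direct costs. Customer impact such as churn or support load is harder to price and is left out here rather than guessed at.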

The Risk Cost

Some technical debt represents a security or compliance risk that has a probability-weighted cost. A known vulnerability in a deprecated dependency has a probability of exploitation and a cost if exploited. Quantifying that expected cost, however approximately, makes the case for remediation in terms stakeholders understand.
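The expected cost is simply probability multiplied by impact. The sketch below assumes a one-year horizon and uses made-up figures; both inputs are estimates, so treat the result as an order-of-magnitude argument rather than a forecast.

```python
def expected_annual_cost(probability_per_year: float, cost_if_exploited: float) -> float:
    """Probability-weighted cost of a risk over one year."""
    if not 0.0 <= probability_per_year <= 1.0:
        raise ValueError("probability_per_year must be between 0 and 1")
    if cost_if_exploited < 0:
        raise ValueError("cost_if_exploited must not be negative")
    return probability_per_year * cost_if_exploited


# Hypothetical: a 5% yearly chance of exploitation, costing 2,000,000 if it happens.
risk = expected_annual_cost(probability_per_year=0.05, cost_if_exploited=2_000_000)
remediation_cost = 40_000  # hypothetical estimate for upgrading the dependency

print(f"Expected annual cost of leaving it: {risk:,.0f}")
print(f"One-off remediation cost: {remediation_cost:,.0f}")
```

Presenting the two numbers side by side is usually enough; stakeholders can then weigh the comparison against their own appetite for risk.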

The Debt vs Features Negotiation

The conversation about whether to invest in debt reduction or feature delivery is one of the most common sources of tension between engineering and product leadership. It does not need to be adversarial, but it often becomes so when the negotiation lacks a shared framework.

The most productive frame is sustainability. Feature delivery produces value now. Debt reduction enables feature delivery in the future. An organisation that never invests in debt reduction will find its feature delivery capacity declining over time as interest payments consume more and more capacity.

A practical approach: establish a sustainable allocation for debt reduction as a proportion of capacity. 20% is a commonly cited figure, though the right proportion depends on the current debt level. This removes the per-sprint negotiation and makes debt investment a standing commitment rather than something that is perpetually deferred in favour of features.

Communicate the debt trend, not just the debt level. If your debt is growing faster than you are paying it down, the trajectory matters to stakeholders who care about future delivery capacity. Make that trend visible in regular reporting.
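A trend report needs little more than a running balance. This sketch assumes debt is measured in estimated effort-days, taken from the inventory, and that each period records how much was added and how much was repaid; the function names and figures are illustrative.

```python
def debt_balances(opening_balance: int, periods: list[tuple[int, int]]) -> list[int]:
    """Balance after each period, given (added, repaid) effort-days per period."""
    balances = []
    balance = opening_balance
    for added, repaid in periods:
        balance = balance + added - repaid
        balances.append(balance)
    return balances


def is_growing(opening_balance: int, balances: list[int]) -> bool:
    """True if debt ended the reporting window higher than it started."""
    return bool(balances) and balances[-1] > opening_balance


# Hypothetical three sprints: (effort-days added, effort-days repaid).
sprints = [(12, 8), (10, 8), (15, 9)]
balances = debt_balances(opening_balance=100, periods=sprints)
print(balances)                   # [104, 106, 112]
print(is_growing(100, balances))  # True
```

In this example the team repays debt every sprint, yet the balance still rises. That is exactly the situation a level-only report hides.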

Stopping the Accumulation

Paying down existing debt while accumulating new debt faster than you repay it is a losing strategy. Debt management requires both remediation and prevention.

Definition of Done

Include debt-relevant practices in your definition of done: code reviewed for maintainability, tests written, dependencies updated, documentation adequate. A definition of done that only covers functional correctness allows debt to accumulate with every feature delivery.

Refactoring as Standard Practice

Refactoring - improving the structure of code without changing its behaviour - should be a continuous activity, not a periodic project. Engineers should feel empowered and expected to improve the code around them as they work, not just add features on top of existing code regardless of its quality.

The Boy Scout Rule - leave the code better than you found it - is a useful heuristic. It does not eliminate debt, but it prevents the continuous deterioration that occurs when engineers only add to code rather than improving it.

Technical Fitness Criteria

Establish technical fitness criteria that new code must meet - coverage thresholds, complexity limits, dependency rules, performance baselines. Enforcing these criteria automatically in the CI pipeline stops code that violates them from being merged until the violation has been reviewed and addressed.
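A fitness check can be a short script that a CI step runs after the metrics have been collected. The sketch below is an assumption-heavy illustration: the metric names, the threshold values, and the idea that metrics arrive as a plain dictionary are all hypothetical, and it covers only numeric criteria, not dependency rules.

```python
# Hypothetical thresholds: ("min", x) means the value must be at least x,
# ("max", x) means the value must not exceed x.
THRESHOLDS = {
    "coverage_percent": ("min", 80.0),
    "max_cyclomatic_complexity": ("max", 10),
    "p95_latency_ms": ("max", 250.0),
}


def check_fitness(metrics: dict, thresholds: dict = THRESHOLDS) -> list[str]:
    """Return a human-readable violation for every criterion that is not met."""
    violations = []
    for name, (kind, limit) in thresholds.items():
        if name not in metrics:
            violations.append(f"{name}: metric missing")
            continue
        value = metrics[name]
        if kind == "min" and value < limit:
            violations.append(f"{name}: {value} is below the minimum {limit}")
        elif kind == "max" and value > limit:
            violations.append(f"{name}: {value} exceeds the maximum {limit}")
    return violations


def exit_code(metrics: dict) -> int:
    """0 if all criteria pass, 1 otherwise; suitable as a process exit status."""
    violations = check_fitness(metrics)
    for violation in violations:
        print(f"FAIL {violation}")
    return 1 if violations else 0


# Illustrative metrics for one build.
build_metrics = {
    "coverage_percent": 70.0,
    "max_cyclomatic_complexity": 14,
    "p95_latency_ms": 200.0,
}
status = exit_code(build_metrics)
```

In a pipeline, the step would pass `status` to `sys.exit` so that a non-zero result fails the build. Treating a missing metric as a violation is a deliberate choice here: a criterion that silently stops being measured is itself a form of debt.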

What Not to Do

Big bang rewrites. The temptation to pay off debt through a complete rewrite is strong but almost always wrong. Rewrites take longer than estimated, carry their own significant risk, and typically accumulate new debt while the old system is still running alongside. Incremental improvement using the strangler fig pattern almost always outperforms a big bang rewrite in practice.

Debt sprints. Dedicated debt reduction sprints that remove debt work from the product backlog and concentrate it in a single sprint are better than nothing but worse than continuous investment. The debt accumulates between debt sprints, and debt sprints are frequently cancelled when product priorities shift.

Treating all debt equally. Not all debt has the same cost. Debt in hot paths - code that is changed frequently, that affects reliability, that is a source of bugs - is high interest debt. Debt in rarely-touched legacy areas may be low interest. Prioritise high-interest debt.
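One rough way to estimate "interest" is to multiply how often an area is touched by how much extra effort each touch costs, then add the incident burden. The scoring formula, field names, and figures below are assumptions for illustration, not an established metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtCandidate:
    name: str
    changes_per_quarter: int       # how often the area is modified
    extra_hours_per_change: float  # friction the debt adds to each change
    incidents_per_quarter: int     # incidents attributed to this debt
    hours_per_incident: float      # typical response and cleanup effort


def quarterly_interest(candidate: DebtCandidate) -> float:
    """Estimated hours per quarter lost to this debt (hypothetical formula)."""
    change_cost = candidate.changes_per_quarter * candidate.extra_hours_per_change
    incident_cost = candidate.incidents_per_quarter * candidate.hours_per_incident
    return change_cost + incident_cost


def rank_by_interest(candidates: list[DebtCandidate]) -> list[DebtCandidate]:
    """Highest-interest debt first."""
    return sorted(candidates, key=quarterly_interest, reverse=True)


# Illustrative candidates only.
candidates = [
    DebtCandidate("legacy-export", 2, 4.0, 0, 0.0),
    DebtCandidate("checkout", 40, 1.5, 2, 6.0),
]
for candidate in rank_by_interest(candidates):
    print(f"{candidate.name}: {quarterly_interest(candidate):.0f} hours per quarter")
```

Here the rarely touched export code is more painful per change, but the frequently changed checkout code costs far more in total, which is the sense in which hot-path debt carries high interest.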