Talent Reviews & Calibration

Calibration makes 'good' consistent. Without it, every manager has a different standard.

Talent reviews exist to understand who needs what - development, challenge, support, or succession. Calibration exists to make manager judgements consistent. Both are undermined by politics, recency bias, and the loudest voice in the room.

A talent review that is not calibrated is a collection of individual manager opinions. That is not worthless - manager opinions contain real information. But it is incomplete, because different managers have different standards, different exposure to their reports' work, different vocabularies for describing performance and potential, and different degrees of political awareness that shape what they are willing to say in a group setting.

Calibration is the process by which those individual opinions are examined against a shared standard, tested with evidence, and made consistent. Without calibration, the outcome of a talent review depends as much on which manager is in the room and how confidently they advocate as on the actual performance of the engineers being discussed. That is unfair and inaccurate.

This section covers what talent reviews are for, how to run them well, how calibration actually works in practice, and what goes wrong when it does not.


What a Talent Review Is For

Talent reviews are frequently confused with ranking exercises. They are not ranking exercises. The purpose of a talent review is to build a shared picture across a management group of who in the organisation needs what - and to act on that picture.

The questions a well-run talent review answers:

Who is ready for more? Engineers who are performing well in their current role and have capacity for additional challenge, broader scope, or progression. If these people are not being given more, they are likely to leave.

Who needs support? Engineers who are struggling - with the technical demands of their role, with the scale of their responsibilities, with something outside work that is affecting their performance. These people need active support, not just observation.

Who is at risk of leaving? Engineers who may be looking for something they are not finding in the current role - challenge, recognition, progression, better management. Retention risk identified in a talent review can be acted on. Retention risk identified in a resignation letter cannot.

Who is ready to step up? Engineers who are approaching readiness for a more senior role or who are candidates for key positions. Succession planning requires knowing who these people are before the position is vacant.

Who needs a development investment? Engineers whose current capability level limits their impact, but who have the potential and motivation to grow with the right investment.

A talent review that only produces ratings has answered none of these questions. The output of a talent review should be a set of actions - who is being developed, how, by whom, with what goal, and on what timeline.


The 9-Box Model

The 9-box model - a 3x3 grid plotting performance (low-medium-high) against potential (low-medium-high) - is the most commonly used tool in talent reviews. It is also commonly misused, often to the point of being counterproductive.

What the 9-Box Is Trying to Do

The logic is sound in principle. Performance and potential are different things. An engineer who is performing exceptionally in their current role may or may not be ready for a more senior role - the two do not automatically correlate. An engineer who is relatively new and still developing may have significant potential that is not yet expressed in current-role performance. The 9-box is an attempt to hold both dimensions in view simultaneously.

Why It Is Frequently Misused

The performance axis is often calibrated. The potential axis almost never is. "High potential" means different things to different managers. Without a shared definition of what potential looks like and how it is assessed, the potential axis is primarily a reflection of who is liked and who is visible.

It becomes a ranking exercise. The point of the 9-box is to identify what each person needs, not to sort them into a hierarchy. When managers arrive at talent reviews focused on which box someone should be in rather than what they need, the conversation becomes a debate rather than an inquiry.

It confuses trajectory with current state. Someone can be rated "low performance, high potential" and that rating can be accurate and actionable - but it requires knowing why performance is currently low and whether potential is genuinely high or is a projection of what the manager hopes. Without that inquiry, the rating is a guess.

"Low potential" is treated as permanent. The potential assessment is a snapshot, not a verdict. Someone whose potential appears low in their current role in their current context may have significant potential in a different role or with different conditions. Using the talent review to effectively categorise people as "high" or "low" value permanently is a misuse of the tool.

What to Use Instead or Alongside

Rather than or alongside the 9-box, consider framing talent review conversations around:

What does this person need to succeed? This question works for anyone at any performance level. The answer drives action.

What is the next meaningful growth opportunity for this person? This is the development question. It prevents talent reviews from becoming purely evaluative.

What would change if this person left? This is the retention and succession question. It drives honest assessment of who is most critical and least replaceable.

What is our confidence in this assessment and why? This is the calibration question. It forces evidence-based discussion rather than impression-based advocacy.


How to Run a Calibration Session

Preparation

Calibration sessions fail when people arrive unprepared. Preparation requires:

Managers bring evidence, not impressions. For each person discussed, the manager should be able to cite specific examples: a project outcome, a peer feedback theme, a development area addressed, a significant contribution. "She is excellent" is not evidence. "She led the payments redesign, brought three other engineers up to speed on the new architecture, and the service has had zero P1 incidents since launch" is evidence.

A shared framework is agreed in advance. What does "high performance" mean at each level? What does "ready for progression" require? If the capability framework defines these, the calibration session uses the framework as the reference. If it does not, the calibration session needs to establish a shared vocabulary at the start, before individual discussions.

The format is designed to prevent the loudest voice winning. If the session format is "each manager presents their team, the group responds," the outcome will be dominated by confident advocates. Better formats include round-robin evidence sharing, challenge questions that must be answered for every person discussed, and explicit devil's advocate roles.

Facilitation

A calibration session requires active facilitation, not just a chair. The facilitator's job is to:

  • Ensure every claim is tested with evidence ("what's an example of that?")
  • Surface disagreement rather than let it be papered over ("that sounds different from what we heard earlier - how do you reconcile the two?")
  • Prevent recency bias dominating ("that was three weeks ago - what about the eight months before?")
  • Protect against political advocacy ("I want to understand the performance case, not the relationship case")
  • Keep the conversation focused on the question "what does this person need?" rather than "what box do they go in?"

The Discussion Structure

For each person reviewed, the conversation should follow a consistent structure:

1. Current performance summary (two minutes). The manager summarises performance: what is going well, what is not, how this compares to the expectations for the role and level. Evidence is named, not just conclusions.

2. Challenge and calibration (five to eight minutes). Other managers or the facilitator ask questions: Have you seen this differently? What evidence are you basing that on? How does this compare to [someone else at the same level]? The manager defends or adjusts their view based on the discussion.

3. Potential and trajectory (three minutes). What is the direction of travel? Is performance improving, declining, or stable? What is the next growth opportunity? Is there succession relevance?

4. Action identification (two minutes). What is the specific action coming out of this discussion? Who owns it? By when?

No discussion should take more than fifteen minutes. If it is taking longer, the discussion is probably not calibrating - it is debating a conclusion rather than examining evidence.

Output Format

The output of a calibration session should be:

Name | Level | Performance Summary | Key Development Need | Action | Owner | Timeline
[Name] | [L3] | [One sentence] | [Specific gap or opportunity] | [Specific action] | [Manager] | [Date]

This is not a ranking table. It is an action log. The point is not the summary - it is the action. A talent review that produces a completed 9-box and no actions has not achieved its purpose.
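
If the action log lives in a lightweight script or internal tool rather than a spreadsheet, the shape of a well-formed entry can be made explicit and enforced at capture time. The following Python sketch is illustrative, not a prescribed implementation - the record type, field names, and checks are assumptions about how such a log might be modelled:

    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class TalentReviewAction:
        # One row of the action log; field names are illustrative.
        engineer: str
        level: str                 # e.g. "L3", from the capability framework
        performance_summary: str   # one evidence-backed sentence
        development_need: str      # specific gap or opportunity
        action: str                # the concrete next step
        owner: str                 # a named manager, never "the team" or "HR"
        due: date                  # a specific follow-up date

    def entry_problems(entry: TalentReviewAction) -> list[str]:
        # The "action log, not ranking table" test: every entry needs a
        # specific action, a named owner, and a timeline that is checkable.
        problems = []
        if not entry.action.strip():
            problems.append("no specific action recorded")
        if not entry.owner.strip():
            problems.append("no named owner")
        if entry.due < date.today():
            problems.append("follow-up date is already in the past")
        return problems

A spreadsheet works just as well; the point is that an entry missing an action, owner, or date is rejected when it is written, not discovered at the next review.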


Succession Planning

Succession planning is the practice of identifying who would fill critical roles if those roles became vacant - and investing in developing those people so that they can step up when needed. Most engineering organisations do not do this at all until a key person announces they are leaving, at which point it is too late to have planned for it.

Identifying Key Roles

Not all roles are equally critical from a succession perspective. The roles that warrant succession planning are those where:

  • The person in the role has knowledge or relationships that are not documented or transferable quickly
  • The impact of the role is disproportionate to the number of people in it
  • The market for people with the relevant skills is highly competitive

A staff engineer who is the de facto architect of a core platform and has never documented their decisions is a succession risk. A principal engineer who is the primary relationship holder with a critical external partner is a succession risk. A senior engineer who is technically strong but highly replaceable is less so.

Identifying Potential Successors

For each key role, the talent review should identify:

  • Ready now: Someone who could step into the role within 90 days with modest support
  • Ready with development: Someone who could be ready within 12-18 months with targeted development
  • Future potential: Someone earlier in their trajectory who might be a longer-term candidate

The objective is not to lock in choices - circumstances change, people leave, trajectories shift. The objective is to know that a development investment is being made in potential successors, so that the organisation is not starting from zero when a key role becomes vacant.

The Succession Conversation With the Individual

Succession planning works best when it is not kept secret from the potential successor. "I see you as a potential future lead for the platform team" is a motivating and honest signal. It gives the person context for their development, helps them understand why certain opportunities are being created for them, and opens a conversation about whether the direction aligns with their own aspirations.

The risk of keeping succession planning entirely internal is that the person identified as a potential successor receives no signal, makes their own development choices that diverge from the planned path, and leaves before the succession is relevant.


Retention Risk Identification

The most preventable talent loss is the engineer who leaves because the organisation did not notice they were at risk until they handed in their notice. Retention risk, once visible, is usually actionable. It is most commonly invisible because nobody is looking for it.

Signals That Someone Is at Retention Risk

These signals are observable in the normal course of management. They require attention to notice and confidence to name.

Disengagement signals. Reduced contribution in team discussions. Less visible enthusiasm for work. Shorter, less detailed responses in 1:1s. Withdrawing from relationships with colleagues. Doing the minimum rather than the characteristic maximum.

External signals. Mentioning external events, networking activities, or industry conversations they would not previously have referenced. Updating a LinkedIn profile. Taking calls at unusual times. Being less willing to commit to long-term projects or plans.

Satisfaction deficits. Expressing frustration with specific aspects of the role or organisation that have not changed despite earlier discussions. Repeating themes about progression, recognition, management, or work content that suggest unmet needs.

Changed relationship with their manager. If the relationship between an engineer and their manager has deteriorated - through a difficult performance conversation, perceived unfair treatment, or simply a mismatch of style - the engineer may be at risk even if their performance is strong.

Acting on Retention Risk

Identifying retention risk in a talent review is only useful if action follows. The action depends on the source of the risk:

  • Insufficient challenge: Create or accelerate a development opportunity, extend scope, involve them in higher-visibility work
  • Insufficient recognition: Ensure their contribution is visible at the right level; consider formal recognition; make the progression path clear and achievable
  • Insufficient progression: Have an honest conversation about what the path forward looks like and on what timeline; if the path is genuinely blocked, be honest about that
  • Management relationship: Where the relationship between manager and engineer is the source of risk, consider whether a change in management is possible or whether structured support for the relationship is more appropriate
  • External factors: Sometimes the risk is personal - life circumstances, a partner's relocation, a pull to try something different. These are less amenable to organisational intervention, but a genuine conversation about what the organisation could do to retain them is still worthwhile

The retention conversation should be direct. "I want you to stay and I want to understand if there is something we can do differently" is a better starting point than hoping the engineer will raise it themselves. Most at-risk engineers will not raise it until they have already made the decision.


Making Calibration Consistent

Calibration is only as good as the shared reference point it uses. If the capability framework is the reference, the conversation is anchored in something consistent. If the reference is each manager's personal sense of what "good" looks like, calibration will reflect the range of personal standards in the room.

Using the Capability Framework

Every assessment in a talent review should be referenceable to the capability framework. "I'd describe this person as performing at the top of level 3, approaching level 4" is a calibratable statement. "She is really strong" is not. The capability framework provides the vocabulary that makes cross-manager comparison possible.

This requires that all managers in the calibration session have a shared understanding of what the capability framework describes at each level. If some managers are interpreting level 3 differently from others, the calibration will produce inconsistent results that look consistent because they are using the same words.

Shared Language

Before calibration sessions, agree on language:

Term | What It Means for This Session
"Exceeds expectations" | Consistently performing at the expectations for the level above, not just doing more
"High potential" | Demonstrating capability for complexity and ambiguity beyond current scope, with evidence
"At risk" | Assessed as more likely than not to leave within 12 months without intervention
"Ready for progression" | Meeting the capability expectations of the next level in the majority of dimensions

Without this shared vocabulary, the same person can be described as "high performer" by one manager and "solid but not exceptional" by another, with no way to resolve the discrepancy.


Common Failures

Recency Bias

The engineer who had an excellent Q4 following three average quarters will be rated well. The engineer who had three excellent quarters followed by a difficult Q4 will be rated poorly. Neither rating accurately reflects twelve months of contribution. Recency bias is the dominant distortion in talent reviews, and it is structural - memory is recency-weighted and the evidence that managers bring to talent reviews reflects what they remember most vividly.

The fix is documented evidence across the full period. Managers who have written records of contributions, feedback, and development across the year are substantially less susceptible to recency bias than those who are recalling from memory.

The Halo Effect

The engineer who is excellent in a highly visible area - perhaps they were the lead on a successful high-profile delivery - will tend to be rated well across all dimensions, including ones where there is little evidence of their performance. The halo effect is the generalisation of a positive impression from one area to all areas. In a talent review, it produces inflated assessments that do not survive scrutiny.

The calibration challenge for the halo effect: "That is a strong performance on the delivery. Can you tell me specifically how they performed on [some other dimension]?" If the answer is "I am not sure, but they did great on the delivery", the assessment is incomplete.

Political Advocacy Masquerading as Objective Assessment

In some organisations, talent reviews become exercises in political positioning. Managers advocate loudly for their best people - not because those people are genuinely better than others but because the manager believes the process is competitive and they are trying to win resources, recognition, or support for their team. The result is that the talent review reflects the advocacy skills of managers, not the performance of engineers.

Calibration sessions should be designed to minimise advocacy by requiring evidence, by applying consistent challenge to all assessments regardless of who presents them, and by having the facilitator name advocacy when they see it.

The "High Potential" Label That Means Nothing

"High potential" is one of the most overused and underspecified labels in talent management. In many organisations, it is applied to people who are liked by their managers, to people who have been recently successful, and to people who are politically visible. It is rarely defined, rarely operationalised, and rarely connected to specific development investments. An engineer who is labelled "high potential" and then receives no development attention or opportunity has been given a label rather than an investment.

If "high potential" is going to mean something, it needs to be:

  • Defined in terms of observable signals (not just "senior manager believes in them")
  • Connected to specific development actions (not just noted in a spreadsheet)
  • Revisited regularly (not assigned permanently)
  • Communicated, at least in part, to the person themselves

Connection to Your Operating Model

Talent reviews and calibration connect to nearly every other element of the talent and performance system:

Performance management. The data for calibration comes from the ongoing performance conversations described in the Modern Performance Management section. If those conversations are not happening, the calibration session has no reliable input.

Career and capability framework. Calibration uses the framework as its reference standard. Without the framework, there is no shared definition of what "good" or "ready for progression" means.

PDPs. Talent review outputs drive PDP priorities. An engineer identified as having a specific capability gap should have that reflected in their next development plan.

Underperformance processes. Talent reviews are often where underperformance that has not been formally named is first discussed across the management group. This can be valuable - it surfaces patterns - but it needs to be followed by the manager having the direct conversation with the individual, not just noting it in a talent matrix.

Reward and recognition. Talent review outputs inform promotion and compensation decisions. The integrity of those decisions depends on the integrity of the calibration that precedes them.


This section connects directly to: Modern Performance Management, PDPs and PIPs, Addressing Underperformance, Reward and Recognition, and the Career and Capability framework.


Running Effective Talent Reviews

The talent review meeting is often the most politically charged conversation in the management calendar. The following guidance addresses how to structure the meeting to minimise politics and maximise useful outcomes.

Preparing Managers

Two weeks before the talent review, give each manager a template to complete for each of their direct reports. The template should ask:

Performance summary: Three to five specific examples of performance this period. Not impressions - examples. What was the outcome? What did this person contribute to it?

Strengths: Two or three specific technical or behavioural strengths with observable evidence.

Development areas: One or two specific development areas with evidence. Not character assessments - capability gaps with examples.

Development need assessment: What does this person need most right now? (Choose: more challenge, broader scope, targeted development support, direct performance support, retention conversation.)

Potential assessment: At what level could this person be operating within 18-24 months, and what evidence supports that?

Retention risk: Low, medium, or high, with brief rationale.

Managers who complete this preparation have evidence-based contributions to make. Managers who do not complete it should not be presenting to the talent review - their contribution without preparation will be impressionistic and will lower the quality of the discussion.

Meeting Structure

For a team of thirty engineers across four or five managers, budget most of a working day: at ten to fifteen minutes per person, the individual discussions alone take five hours or more. For larger groups, break into domain-specific sessions followed by a cross-domain calibration.

Opening (15 minutes): Restate the purpose. Walk through the agreed definitions (what "high performance" and "high potential" mean for this session). Name the expected output (action log, not a ranking).

Individual discussions (10-15 minutes each): Each manager presents. The group challenges and calibrates. The facilitator captures the action.

Patterns discussion (20-30 minutes): After all individuals have been discussed, step back. What patterns are visible? Are development needs concentrated in a particular team or domain? Are there succession gaps? Are there recognition gaps?

Action review (15 minutes): Review every action captured. Confirm owner and timeline. Any action without an owner does not exist.

Managing the Difficult Conversations in the Room

Calibration sessions produce disagreement. Two managers may have significantly different views of the same engineer. One manager may advocate strongly for a promotion the group does not support. A manager may be defending an assessment the group believes is inflated.

The facilitator's role in these moments:

When two managers disagree: "You have two different views. Let's hear the specific evidence behind each and see if we can reconcile them - or agree that this is a difference of perspective that we need to name." Do not resolve disagreement through compromise (splitting the difference) - resolve it through evidence.

When advocacy becomes excessive: "You've told us how much you value this person. Can you tell us specifically what they have done that meets the criteria we agreed at the start?" Redirect from advocacy to evidence.

When an assessment seems inflated: "The examples you've given are strong. I want to check against [specific criteria] - can you give me an example that demonstrates this in that context specifically?" Probe the corners rather than challenging the overall assessment directly.

When the discussion is going in circles: "We've been discussing this for a while. Let's note that there is genuine disagreement here and move on. I will follow up with both managers to understand this better before finalising the record."


Succession Planning in Engineering

Engineering succession planning has specific characteristics that generic succession frameworks do not account for. The concentration of domain knowledge in key engineers, the informal authority structures that exist alongside formal ones, and the technical complexity of core systems create succession risks that are not always visible to leadership.

Types of Succession Risk in Engineering

Key person risk - technical. One person understands a critical system, has written most of its code, and is the de facto decision-maker for its evolution. If they leave, the system becomes difficult to maintain and extend. The succession question here is both about finding a replacement and about reducing the concentration of knowledge in the first place.

Key person risk - relational. One person holds the relationship with a critical stakeholder, partner, or customer. Their departure would damage a relationship that the organisation depends on. Succession planning here involves making those relationships visible and distributed before the departure.

Leadership pipeline gaps. The organisation has a gap in the pipeline at a particular level - there are no credible internal candidates for the next wave of senior or staff engineering positions. This is a long-term investment problem; the fix is measured in years, not months.

Domain coverage gaps. A particular technical domain or architectural area has thin coverage. If two engineers who specialise in security architecture leave within six months of each other, the organisation is exposed. The succession question here is about coverage and redundancy.

The Knowledge Transfer Obligation

Engineers identified as succession risks should not be penalised for being identified. They should be engaged in the conversation: "Your knowledge of this system is critical. We want to work with you on how we reduce the single point of failure - partly for the organisation, but also because right now a lot depends on you in ways that limit your own development opportunities."

Structured knowledge transfer - through documentation, pair programming, architecture decision records, deliberate mentoring of successors - is development work. It should be recognised as such and made part of the identified engineer's role expectations, not added on top of existing commitments.


Reference: Talent Review Preparation Template

Use this template as the standard preparation requirement for all managers attending a talent review. Completing it before the session is mandatory. Presenting without completing it should be explicitly named as a preparation failure.


Engineer name and current level:

Performance summary - three examples:

  1. [Situation, what the engineer did, what the outcome was]
  2. [Situation, what the engineer did, what the outcome was]
  3. [Situation, what the engineer did, what the outcome was]

Primary strengths (specific, observable):

Development areas (specific, observable):

What does this person need most right now?

  • More challenge / broader scope
  • Targeted development support (specify area)
  • Direct performance support
  • Retention conversation
  • Succession development
  • Nothing additional - performing well, in good place

Potential assessment:

  Current level: [level]
  Assessed operating level in 18-24 months with appropriate development: [level]
  Evidence that supports this assessment: [specific examples of capability beyond current level]

Retention risk:

  • Low - no indicators of concern
  • Medium - some indicators present (specify)
  • High - active risk (specify, actions proposed)

Proposed actions: [What specifically should happen as a result of this discussion? Who owns it? By when?]


Completing this template for each direct report before the talent review is the difference between a calibration session and a conversation. Calibration requires data. Data requires preparation.


Reference: Calibration Consistency Checklist

Before finalising talent review outcomes, apply this checklist to identify inconsistencies that require revisiting:

Cross-team consistency:

  • Are engineers at the same level being held to consistent expectations across different managers?
  • Are the same capability framework criteria being used as the reference across all discussions?
  • Have examples been tested with the same level of scrutiny regardless of which manager presented them?

Bias checks:

  • Have any assessments been driven primarily by recency (last 6 weeks) rather than the full period?
  • Are any "high potential" assessments based primarily on visibility rather than demonstrated capability?
  • Have quiet, less visible contributors received the same depth of discussion as visible ones?
  • Are any assessments tracking demographic patterns that might signal systemic bias?

Action quality:

  • Does every person discussed have at least one specific action?
  • Does every action have a named owner and a timeline?
  • Are the actions distributed appropriately - not all falling to HR or to one manager?
  • Are the actions ambitious enough to actually address the identified needs?

A talent review that fails three or more of these checks should be reconvened before the outputs are finalised and shared. The cost of reconvening is substantially lower than the cost of acting on inaccurate assessments.
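
The action-quality items in particular lend themselves to a mechanical pass before outputs are finalised. A minimal sketch in Python, assuming the action log has been exported as simple records - the field names and sample data are illustrative, and a spreadsheet filter achieves the same result:

    from collections import Counter

    # Illustrative records; in practice, exported from wherever the log lives.
    actions = [
        {"engineer": "A. Example", "action": "Lead the Q3 platform migration",
         "owner": "Manager One", "due": "2025-09-30"},
        {"engineer": "B. Example", "action": "", "owner": "", "due": ""},
    ]
    discussed = {"A. Example", "B. Example", "C. Example"}

    # Does every person discussed have at least one specific action?
    covered = {a["engineer"] for a in actions if a["action"].strip()}
    for person in sorted(discussed - covered):
        print(f"No action recorded for {person}")

    # Does every action have a named owner and a timeline?
    for a in actions:
        if a["action"].strip() and not (a["owner"].strip() and a["due"].strip()):
            print(f"Action for {a['engineer']} lacks an owner or a timeline")

    # Are actions distributed appropriately, or piling up on one person?
    by_owner = Counter(a["owner"] for a in actions if a["owner"].strip())
    if by_owner and max(by_owner.values()) > sum(by_owner.values()) / 2:
        print("More than half of all actions fall to a single owner")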


After the Talent Review: Making Actions Real

The most common failure of talent reviews is not in the session itself but in the follow-through. Actions are agreed, the meeting ends, and four months later - when the next talent review is due - the majority of actions have not been taken.

The reasons are predictable: managers leave the session with good intentions but no accountability structure for the actions. The actions live in a spreadsheet that nobody reviews between sessions. The ownership is diffuse ("manager will do something about this").

Making Actions Stick

Review talent review actions monthly. Each manager should have their talent review action list in a standing agenda item for their own 1:1 with their manager. "What is the status of your actions from the talent review?" is not a bureaucratic question - it is a quality assurance check on whether the organisation's investment in the talent review process is translating into actual development activity.

Make actions visible. The talent review action log should be accessible to all managers who participated, not filed and forgotten. Visibility creates social accountability even without explicit follow-up.

Assign a named owner for every action - always the manager of the engineer. Actions owned by "the team" or by "HR" without a named individual are actions that will not be completed. Every action should have one person who is responsible for ensuring it happens.

Set follow-up dates that are specific, not vague. "Manager will create a growth opportunity for this engineer" by a specific date is checkable. "Manager will explore options for development" is not.

Close the loop with the engineer. Where the talent review has identified that an engineer needs a specific development opportunity, retention conversation, or support, the manager should be having that conversation with the engineer within two weeks of the talent review. The talent review is an internal management process. Its value is zero unless it changes how managers interact with the engineers who were discussed.