Deloitte Case Study
At Deloitte we’re redesigning our performance management system. This may not
surprise you. Like many other companies, we realize that our current process for
evaluating the work of our people—and then training them, promoting them, and
paying them accordingly—is increasingly out of step with our objectives.
What might surprise you, however, is what we’ll include in Deloitte’s new system
and what we won’t. It will have no cascading objectives, no once-a-year reviews,
and no 360-degree-feedback tools. We’ve arrived at a very different and much
simpler design for managing people’s performance. Its hallmarks are speed, agility,
one-size-fits-one, and constant learning, and it’s underpinned by a new way of
collecting reliable performance data. This system will make much more sense for
our talent-dependent business. But we might never have arrived at its design
without drawing on three pieces of evidence: a simple counting of hours, a review
of research in the science of ratings, and a carefully controlled study of our own
More than likely, the performance management system Deloitte has been using has
some characteristics in common with yours. Objectives are set for each of our
65,000-plus people at the beginning of the year; after a project is finished, each
person’s manager rates him or her on how well those objectives were met. The
manager also comments on where the person did or didn’t excel. These evaluations
are factored into a single year-end rating, arrived at in lengthy “consensus
meetings” at which groups of “counselors” discuss hundreds of people in light of
their peers.
Internal feedback demonstrates that our people like the predictability of this
process and the fact that because each person is assigned a counselor, he or she has
a representative at the consensus meetings. The vast majority of our people believe
the process is fair. We realize, however, that it’s no longer the best design for
Deloitte’s emerging needs: Once-a-year goals are too “batched” for a real-time
world, and conversations about year-end ratings are generally less valuable than
conversations conducted in the moment about actual performance.
But the need for change didn’t crystallize until we decided to count things.
Specifically, we tallied the number of hours the organization was spending on
performance management—and found that completing the forms, holding the
meetings, and creating the ratings consumed close to 2 million hours a year. As we
studied how those hours were spent, we realized that many of them were eaten up
by leaders’ discussions behind closed doors about the outcomes of the process. We
wondered if we could somehow shift our investment of time from talking to
ourselves about ratings to talking to our people about their performance and
careers—from a focus on the past to a focus on the future.
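To put the tally in perspective, here is a back-of-the-envelope sketch in Python that spreads the hours across the workforce. It uses only the two figures quoted above, both of which are approximate ("close to 2 million" hours, "65,000-plus" people), so the result is a rough estimate rather than a reported statistic. The function and constant names are illustrative.

```python
# Figures quoted in the article; both are approximations, not exact counts.
ANNUAL_HOURS = 2_000_000   # "close to 2 million hours a year"
HEADCOUNT = 65_000         # "65,000-plus people"


def hours_per_person(total_hours: float, people: int) -> float:
    """Average yearly performance-management hours attributable to each person."""
    if people <= 0:
        raise ValueError("people must be positive")
    return total_hours / people


per_person = hours_per_person(ANNUAL_HOURS, HEADCOUNT)
print(f"About {per_person:.0f} hours per person per year")
```

On these figures the process costs roughly 31 hours per person each year. Because the true headcount is above 65,000 and the true hours are below 2 million, the real figure is somewhat lower.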
We also learned that the defining characteristic of the very best teams at Deloitte is
that they are strengths oriented. Their members feel that they are called upon to do
their best work every day. This discovery was not based on intuitive judgment or
gleaned from anecdotes and hearsay; rather, it was derived from an empirical study
of our own high-performing teams.
Our study built on previous research. Starting in the late 1990s, Gallup conducted a
multiyear examination of high-performing teams that eventually involved more
than 1.4 million employees, 50,000 teams, and 192 organizations. Gallup asked
both high- and lower-performing teams questions on numerous subjects, from
mission and purpose to pay and career opportunities, and isolated the questions on
which the high-performing teams strongly agreed and the rest did not. It found at
the beginning of the study that almost all the variation between high- and lower-
performing teams was explained by a very small group of items. The most
powerful one proved to be “At work, I have the opportunity to do what I do best
every day.” Business units whose employees chose “strongly agree” for this item
were 44% more likely to earn high customer satisfaction scores, 50% more likely
to have low employee turnover, and 38% more likely to be productive.
We set out to see whether those results held at Deloitte. First we identified 60 high-
performing teams, which involved 1,287 employees and represented all parts of the
organization. For the control group, we chose a representative sample of 1,954
employees. To measure the conditions within a team, we employed a six-item
survey. When the results were in and tallied, three items correlated best with high
performance for a team: “My coworkers are committed to doing quality work,”
“The mission of our company inspires me,” and “I have the chance to use my
strengths every day.” Of these, the third was the most powerful across the
All this evidence helped bring into focus the problem we were trying to solve with
our new design. We wanted to spend more time helping our people use their
strengths—in teams characterized by great clarity of purpose and expectations—
and we wanted a quick way to collect reliable and differentiated performance data.
With this in mind, we set to work.
Radical Redesign
At the end of every project (or once every quarter for long-term projects) we will
ask team leaders to respond to four future-focused statements about each team
member. We’ve refined the wording of these statements through successive tests,
and we know that at Deloitte they clearly highlight differences among individuals
and reliably measure performance. Here are the four:
2. Given what I know of this person’s performance, I would always want him or
her on my team [measures ability to work well with others on the same five-point
3. This person is at risk for low performance [identifies problems that might harm
the customer or the team on a yes-or-no basis].
In effect, we are asking our team leaders what they would do with each team
member rather than what they think of that individual. When we aggregate these
data points over a year, weighting each according to the duration of a given
project, we produce a rich stream of information for leaders’ discussions of what
they, in turn, will do—whether it’s a question of succession planning, development
paths, or performance-pattern analysis. Once a quarter the organization’s leaders
can use the new data to review a targeted subset of employees (those eligible for
promotion, for example, or those with critical skills) and can debate what actions
Deloitte might take to better develop that particular group. In this aggregation of
simple but powerful data points, we see the possibility of shifting our 2-million-
hour annual investment from talking about the ratings to talking about our people
—from ascertaining the facts of performance to considering what we should do in
response to those facts.
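The duration-weighted aggregation described above can be sketched as a weighted average. This is a minimal illustration, not Deloitte's actual implementation: the `Snapshot` structure, the use of weeks as the unit of duration, and the sample values are all assumptions, and it applies only to the scaled statements, not to the yes-or-no ones.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """One team leader's response about one team member (hypothetical shape)."""
    score: float            # response on the five-point scale
    duration_weeks: float   # length of the project the snapshot covers (assumed unit)


def weighted_annual_score(snapshots: list[Snapshot]) -> float:
    """Average the scores, weighting each by the duration of its project."""
    total_weight = sum(s.duration_weeks for s in snapshots)
    if total_weight <= 0:
        raise ValueError("need at least one snapshot with positive duration")
    return sum(s.score * s.duration_weeks for s in snapshots) / total_weight


# Made-up data: a short project rated 5, a long one rated 3, a medium one rated 4.
year = [Snapshot(5, 4), Snapshot(3, 12), Snapshot(4, 8)]
print(round(weighted_annual_score(year), 2))
```

With this sample the weighted score is about 3.67, lower than the unweighted mean of 4.0, because the longest project carried the lowest rating. Weighting by duration keeps a brief engagement from counting as much as a long one.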
Two objectives for our new system, then, were clear: We wanted to recognize
performance, and we had to be able to see it clearly. But all our research, all our
conversations with leaders on the topic of performance management, and all the
feedback from our people left us convinced that something was missing. Is
performance management at root more about “management” or about
“performance”? Put differently, although it may be great to be able to measure and
reward the performance you have, wouldn’t it be better still to be able to improve
1. The Criteria
2. The Rater
3. Testing
We then tested that our questions would produce useful data. Validity
testing focuses on their difficulty (as revealed by mean responses) and the
range of responses (as revealed by standard deviations). We knew that if
they consistently yielded a tight cluster of “strongly agree” responses, we
wouldn’t get the differentiation we were looking for. Construct validity
and criterion-related validity are also important. (That is, the questions
should collectively test an underlying theory and make it possible to find
correlations with outcomes measured in other ways, such as engagement
4. Frequency
5. Transparency
We’re experimenting with this now. We want our snapshots to reveal the
real-time “truth” of what our team leaders think, yet our experience tells us
that if they know that team members will see every data point, they may be
tempted to sugarcoat the results to avoid difficult conversations. We know
that we’ll aggregate an individual’s snapshot scores into an annual
composite. But what, exactly, should we share at year’s end? We want to
err on the side of sharing more, not less—to aggregate snapshot scores
not only for client work but also for internal projects, along with
performance metrics such as hours and sales, in the context of a group of
peers—so that we can give our people the richest possible view of where
they stand. Time will tell how close to that ideal we can get.
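The check described under "Testing" above, which looks at each question's mean response and standard deviation, can be sketched as follows. This is an illustrative assumption of how such a screen might work, not the authors' procedure: the thresholds, question labels, and ratings are all made up.

```python
from statistics import mean, stdev


def differentiation_report(responses, max_mean=4.5, min_stdev=0.5):
    """Summarize each question's difficulty (mean) and spread (standard deviation).

    responses maps a question label to a list of 1-5 ratings.
    max_mean and min_stdev are illustrative thresholds, not published values.
    A question fails to differentiate when its ratings cluster tightly near the top.
    """
    report = {}
    for question, ratings in responses.items():
        m = mean(ratings)
        sd = stdev(ratings)  # sample standard deviation; needs at least two ratings
        report[question] = {
            "mean": m,
            "stdev": sd,
            "differentiates": not (m > max_mean and sd < min_stdev),
        }
    return report


# Made-up ratings: one question nearly everyone rates "strongly agree", one with real spread.
sample = {
    "Q_tight": [5, 5, 5, 4, 5, 5],
    "Q_spread": [2, 5, 3, 4, 1, 5],
}
report = differentiation_report(sample)
for question, stats in report.items():
    print(question, round(stats["mean"], 2), round(stats["stdev"], 2), stats["differentiates"])
```

A question like `Q_tight` would be reworded or dropped, since a tight cluster of top ratings tells leaders little about differences among individuals.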
Research into the practices of the best team leaders reveals that they conduct
regular check-ins with each team member about near-term work. These brief
conversations allow leaders to set expectations for the upcoming week, review
priorities, comment on recent work, and provide course correction, coaching, or
important new information. The conversations provide clarity regarding what is
expected of each team member and why, what great work looks like, and how each
can do his or her best work in the upcoming days—in other words, exactly the
trinity of purpose, expectations, and strengths that characterizes our best teams.
Our design calls for every team leader to check in with each team member once a
week. For us, these check-ins are not in addition to the work of a team leader;
they are the work of a team leader. If a leader checks in less often than once a
week, the team member’s priorities may become vague and aspirational, and the
leader can’t be as helpful—and the conversation will shift from coaching for near-
term work to giving feedback about past performance. In other words, the content
of these conversations will be a direct outcome of their frequency: If you want
people to talk about how to do their best work in the near future, they need to talk
often. And so far we have found in our testing a direct and measurable correlation
between the frequency of these conversations and the engagement of team
members. Very frequent check-ins (we might say radically frequent check-ins) are
a team leader’s killer app.
That said, team leaders have many demands on their time. We’ve learned that the
best way to ensure frequency is to have check-ins be initiated by the team member
—who more often than not is eager for the guidance and attention they provide—
rather than by the team leader.
To support both people in these conversations, our system will allow individual
members to understand and explore their strengths using a self-assessment tool and
then to present those strengths to their teammates, their team leader, and the rest of
the organization. Our reasoning is twofold. First, as we’ve seen, people’s strengths
generate their highest performance today and the greatest improvement in their
performance tomorrow, and so deserve to be a central focus. Second, if we want to
see frequent (weekly!) use of our system, we have to think of it as a consumer
technology—that is, designed to be simple, quick, and above all engaging to use.
Many of the successful consumer technologies of the past several years
(particularly social media) are sharing technologies, which suggests that most of us
are consistently interested in ourselves—our own insights, achievements, and
impact. So we want this new system to provide a place for people to explore and
share what is best about themselves.
This is where we are today: We’ve defined three objectives at the root of
performance management—to recognize, see, and fuel performance. We have three
interlocking rituals to support them—the annual compensation decision, the
quarterly or per-project performance snapshot, and the weekly check-in. And
we’ve shifted from a batched focus on the past to a continual focus on the future,
through regular evaluations and frequent check-ins. As we’ve tested each element
of this design with ever-larger groups across Deloitte, we’ve seen that the change
can be an evolution over time: Different business units can introduce a strengths
orientation first, then more-frequent conversations, then new ways of measuring,
and finally new software for monitoring performance. (See the exhibit
“Performance Intelligence.”)
But one issue has surfaced again and again during this work, and that’s the issue of
transparency. When an organization knows something about us, and that
knowledge is captured in a number, we often feel entitled to know it—to know
where we stand. We suspect that this issue will need its own radical answer.
In the first version of our design, we kept the results of performance snapshots
from the team member. We did this because we knew from the past that when an
evaluation is to be shared, the responses skew high—that is, they are sugarcoated.
Because we wanted to capture unfiltered assessments, we made the responses
private. We worried that otherwise we might end up destroying the very truth we
sought to reveal.
But what, in fact, is that truth? What do we see when we try to quantify a person?
In the world of sports, we have pages of statistics for each player; in medicine, a
three-page report each time we get blood work done; in psychometric evaluations,
a battery of tests and percentiles. At work, however, at least when it comes to
quantifying performance, we try to express the infinite variety and nuance of a
human being in a single number.
We haven’t resolved this issue yet, but here’s what we’re asking ourselves and
testing: What’s the most detailed view of you that we can gather and share? How
does that data support a conversation about your performance? How can we equip
our leaders to have insightful conversations? Our question now is not What is the
simplest view of you? but What is the richest?
Over the past few years the debate about performance management has been
characterized as a debate about ratings—whether or not they are fair, and whether
or not they achieve their stated objectives. But perhaps the issue is different: not so
much that ratings fail to convey what the organization knows about each person
but that as presented, that knowledge is sadly one-dimensional. In the end, it’s not
the particular number we assign to a person that’s the problem; rather, it’s the fact
that there is a single number. Ratings are a distillation of the truth—and up until
now, one might argue, a necessary one. Yet we want our organizations to know us,
and we want to know ourselves at work, and that can’t be compressed into a single
number. We now have the technology to go from a small data version of our
people to a big data version of them. As we scale up our new approach across
Deloitte, that’s the problem we want to solve next.