Pick the Right Metric to Incentivize the Behavior You Want
In Good to Great, Jim Collins recounts the story of how Walgreens CEO Cork Walgreen grew the company to be the largest drug retailing chain in America. $1 invested in Walgreens in 1975 would skyrocket to $562 by 2000 under his leadership, easily beating the stock market index by over fifteen times. During this time, Walgreens executed on a key concept: to build the best, most convenient drugstore in America.
One problem Walgreens faced while executing this strategy was defining a core metric that would align the incentives of leaders within the company toward the concept of convenience. Traditionally, large retail chains, including Eckerd – Walgreens’s major competitor in the late 1900s – measured success based on profit per store. This metric, however, incentivized those companies to reduce the number of more expensive, convenient stores and to buy less convenient ones at cheaper and more remote locations.
As Collins explains, Walgreens smartly decided to change its core metric rather than accept the traditional one:
“Walgreens switched its focus from profit per store to profit per customer visit. Convenient locations are expensive, but by increasing profit per customer visit, Walgreens was able to increase convenience (nine stores in a mile!) and simultaneously increase profitability across its entire system. The standard metric of profit per store would have run contrary to the convenience concept.” 1
The shift in metric played a huge role in aligning the company’s focus. No longer focusing on profit per store, Walgreens replaced its inconvenient locations with more convenient ones, buying up corner lots with multiple entries and exits for its customers. It would close down perfectly good stores to move them a block away if it could get a more convenient corner location. In cities, it would even cluster multiple stores in a single-mile radius so that customers only had to walk a few blocks to their nearest Walgreens.
In order to increase profit per customer visit, Walgreens invested in a number of services to optimize for the customer’s convenience once in the store. It built one-hour photo stops, drive-through pharmacies, and a network system called Intercom to electronically connect customer data from every Walgreens to a central database so that customers could, for instance, pick up prescription drugs from any Walgreens in the US as if it were their local pharmacy. 2
Walgreens’s competitors that used less effective metrics like profit per store and continued to buy cheap and remote store locations would eventually lose the drugstore battle. Eckerd sold to CVS in 2004. 3
How you frame your goal determines your focus
While the Walgreens story focused on how a metric affected a company, the importance of defining the right metric applies to individuals and teams as well.
As engineers, we often set metrics for ourselves and for our teams, or are subject to metrics set by our managers or by our organizations. We tend to get good at problem-solving and figuring out how to optimize a metric once it’s been set, but the choice of which metric to focus on significantly influences the actions we take. The right metric aligns team efforts toward a common goal that, if achieved, increases the success of the product or organization; the wrong metric leads to efforts that might be ineffective or, even worse, counterproductive.
Here are a few examples of how different metrics change behavior:
hours worked per week vs productivity per week. I’ve been through a few phases at startups where expectations of 70-hour work weeks became the norm in the hope of shipping a product faster. Not once have I come out of the experience thinking that it was the right decision for the team or the company. The marginal productivity of each additional work hour drops precipitously once you reach numbers anywhere close to that ballpark. Average productivity per hour drops, errors and bug rates increase, difficult-to-measure costs of burnout and turnover often result, and the overtime is typically followed by an equal period of “undertime” as employees try to catch up with their lives. 4 Ultimately, the metric of hours worked per week is unsustainable, and a much more reasonable metric is something aligned with productivity per week, where productivity is measured based on your focus area and might be something related to product quality, site speed, or user growth.
average response times vs p95/p99 response time. Focusing on average response times for a service or website leads to a very different set of priorities than focusing on the 95th or 99th percentile of response times. To decrease the average, you’ll tend to focus more on general infrastructure improvements that can shave off milliseconds from all requests. To decrease the p95 or p99, you’ll tend to hunt down the worst-case behaviors in your system. In a product, it’s often important to focus on the 95th or 99th percentile of response times because they tend to reflect the experiences of your most active users – users who follow the most people, have the most activity, or have the most recommendations and who tend to be more computationally expensive to support.
bugs fixed vs bugs outstanding. A friend who used to work in quality assurance at Adobe shared a story of how his team once rewarded developers for bugs fixed. Of course, this only incentivized them to be less rigorous about testing when building new features, so that there would be more easy bugs to fix later and more points to rack up.
registered users vs growth rate of registered users and weekly active users vs weekly active rate by cohort. When growing the user base of a product, it’s tempting to look at gross numbers of registered and weekly active users and be content with seeing those metrics move up and to the right. Unfortunately, those numbers don’t tell you much about whether you’re growing sustainably. On the other hand, measuring growth in terms of your growth rate (ratio of new registered users over total registered users) or in terms of how your weekly active rate trends by cohort (what fraction of users who signed up during week N are still active weekly?) provides much more actionable insight into how your product is performing. 5
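Two of the comparisons above are easy to see with a few lines of arithmetic. The following is only a minimal sketch in Python: the response times and cohort sizes are made-up numbers, and `percentile` is a small helper written for this illustration (using the nearest-rank definition), not a function from any particular monitoring tool.

```python
import math
from statistics import mean


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample that has at least
    pct% of all samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up response times in milliseconds: 90 fast requests and 10 slow ones.
response_times_ms = [100] * 90 + [3000] * 10

avg_ms = mean(response_times_ms)            # blends fast and slow requests together
p95_ms = percentile(response_times_ms, 95)  # lands on the slow requests
p99_ms = percentile(response_times_ms, 99)

# Made-up cohorts: signup week -> (users who signed up,
#                                  users still active weekly four weeks later).
cohorts = {
    "week 1": (1000, 400),
    "week 2": (2000, 500),
}

# Gross count of retained active users across all cohorts.
total_active = sum(active for _, active in cohorts.values())

# Weekly active rate by cohort: the fraction of each signup cohort still active.
active_rate_by_cohort = {
    week: active / signed_up for week, (signed_up, active) in cohorts.items()
}

print(f"average={avg_ms} ms, p95={p95_ms} ms, p99={p99_ms} ms")
print(f"total active={total_active}, rate by cohort={active_rate_by_cohort}")
```

With these numbers, the average is 390 ms even though no request took 390 ms, while p95 and p99 are both 3000 ms and point straight at the slow requests. Likewise, the second cohort contributes more active users than the first (500 vs 400), but its weekly active rate has dropped from 0.4 to 0.25, which the gross count alone would hide.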
Not only does defining the right metric itself matter, but defining the magnitude of the goal for that metric also matters. A goal of reducing website latency without a specific target will justify small, incremental improvements. But a goal of drastically reducing latency to below 200ms for a website that might currently take multiple seconds to render may necessitate cutting features, rearchitecting the system, or rewriting it in a faster language. Small wins no longer make sense to tackle under the more aggressive goal.
The metric that you don’t set matters too
From 1999 to 2009, the e-commerce company Zappos grew from zero revenues to over $1 billion by the time it was acquired by Amazon. Its strong brand and key differentiator both lie in great customer service, which is reflected internally in its top company value of “Deliver WOW Through Service.” In Delivering Happiness, CEO Tony Hsieh explains that one way in which the company made sure customer service representatives internalized this value was by not measuring how long service calls took:
“Most call centers measure their employees’ performance based on what’s known in the industry as “average handle time,” which focuses on how many phone calls each rep can take in a day. This translates into reps worrying about how quickly they can get a customer off the phone, which in our eyes is not delivering great customer service… At Zappos, we don’t measure call times (our longest phone call was almost six hours long!), and we don’t upsell. We just care about whether the rep goes above and beyond for every customer.” 6
Setting the right metric for your organization, your team, or yourself is therefore incredibly important. In the context of companies, Collins called this metric the economic denominator, framing the question as:
“If you could pick one and only one ratio – profit per x … – to systematically increase over time, what x would have the greatest and most sustainable impact on your economic engine?” 7
Similarly, when setting metrics for ourselves, we should ask: If I could pick one and only one metric to systematically increase over time, what metric would have the greatest and most sustainable impact on my own and my team’s effectiveness?
1. Jim Collins, Good to Great: Why Some Companies Make the Leap…And Others Don’t, p104-105.
2. Good to Great, p148.
3. “Eckerd Corporation”, Wikipedia.
4. Tom DeMarco and Timothy Lister, Peopleware: Productive Projects and Teams (Second Edition), p15, p179.
5. This is also what Eric Ries refers to as the difference between vanity metrics and actionable metrics in The Lean Startup.
6. Tony Hsieh, Delivering Happiness: A Path to Profits, Passion, and Purpose, p145-146.
7. Good to Great, p104.
“A comprehensive tour of our industry's collective wisdom written with clarity.”
— Jack Heart, Engineering Manager at Asana
“Edmond managed to distill his decade of engineering experience into crystal-clear best practices.”
— Daniel Peng, Senior Staff Engineer at Google