A common scenario faced by a data middle manager is to field questions from execs and peers about something that, in their perception, is intangible (i.e. it cannot be measured).
They cite various reasons as to why:
It would be too complicated
No one has ever done it before
We do not have any data on it
We cannot collect any data on it
Our industry is special (e.g. highly-regulated)
To succeed as a data middle manager, you will need tools to deal with this inevitable situation. Luckily for you, you just need one resource!
The Church of Measurement
The bible on how to measure anything is obviously a book called exactly that, and Douglas Hubbard is its prophet.
First published in 2007, How to Measure Anything: Finding the Value of “Intangibles” in Business should be mandatory reading for all aspiring middle managers. It is relentless. Every other page, there is a proclamation:
Anything can be measured
They will tell you it can’t be measured
It is an illusion
It’s their lack of imagination
If you can observe it, you can measure it
You gotta love it.
Merely mastering a few of the suggested approaches will greatly increase your ability to estimate ROI on data initiatives, measure what matters, and reduce uncertainty to the threshold of informed decision-making.
Speaking of ROI, this book sets a high bar. Within the first 32 pages (of the 2nd edition, at least), you get:
A much more useful definition of measurement
The technique of decomposing
A powerful small sample size trick—Rule of Five
A set of liberating assumptions about measurement
It’s just amazing value. I’m of the opinion that most books can be a blog post or a journal article. They are padded with fluff, and maybe contain one good idea if you are lucky. The only book I can think of off the top of my head that can be considered more useful is Thinking, Fast and Slow by Daniel Kahneman—that’s high praise.
Rethinking Measurement
The problem of ‘we cannot measure it, it’s intangible’ is actually a corollary to a pair of underlying false assumptions:
Measurement needs to be so precise that it removes all uncertainty from a decision, i.e. perfect information
Data needs to exist, in abundance, and be readily available
If you accept these assumptions—that you must eliminate uncertainty with torrential amounts of data—it becomes impossible to make even the simplest of decisions.
However, in life, this is rarely the case. You only need to reduce uncertainty to the threshold—i.e. the boundary over which you make a decision.
To give a simple example, assume a company sells bins for a piece of furniture in three sizes—20, 30, and 40 cm. You want to buy one, so you grab a ruler and measure your furniture; it will match one of the sizes (they are made to fit). Would you care if the ruler reads 28 cm or 33 cm?
It’s clearly the 30 cm variant, given the possible options. You wouldn’t try other rulers to get an exact 30 cm reading off your furniture. Perhaps you are not measuring it at the right spot. Maybe you are misreading the ruler. Could be that the ruler is not well-calibrated. Whatever the reason, you don’t need that much certainty. You can go ahead and order your bin.
To achieve a similar effect at work, educate your peers and redefine measurement so that it is
a quantitatively expressed reduction in uncertainty based on one or more observations.
This provides you with a solid foundation to challenge the first false assumption.
The Quartet of Liberation
The second false assumption revolves around notions about data. You need to tear those down first to allow measurement-enlightened thinking:
Your problem is not as unique as you think.
You have more data than you think.
You need less data than you think.
An adequate amount of new data is more accessible than you think.
Many execs have the relationship between the amount of data you have and the amount of uncertainty it reduces exactly backwards:
Myth: When you have a lot of uncertainty, you need a lot of data to tell you something useful.
Fact: If you have a lot of uncertainty now, you don’t need much data to reduce uncertainty significantly. When you have a lot of certainty already, then you need a lot of data to reduce uncertainty significantly.
It makes sense if you think about it from the perspective of uncertainty reduction. If you don’t have any data—you haven’t measured anything—you have maximum uncertainty. Even a single observation will greatly reduce your uncertainty.
You don’t know how many hours on average a customer operations employee spends doing a certain task a week. You ask the first such employee, and they tell you it’s roughly 10 hrs/w.
Before, you had no data—the average time spent on the task could be anywhere from 0 to 40 hrs/w. Now, with a single observation, you are much more anchored: your expectation is around 10 hrs/w. It’s true (and quite likely) that this expectation will change wildly as you collect 1, 2, …, n more samples until you hit a critical mass. Still, you made a giant leap w.r.t. reducing your uncertainty.
If you already have a lot of data, you have less uncertainty. To reduce your uncertainty even by a slight amount, you will need a lot of new data.
At a large enterprise, the average time spent on a certain task by customer operations (sample size = 10,000) is 17.2 +/- 0.3 hrs/w.
To further shrink the error (+/- 0.3 hrs/w) around the 17.2 hrs/w average, given n = 10,000, you need thousands of new data points.
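To make the asymmetry concrete, here is a minimal sketch. It assumes the error margin of the mean shrinks with the square root of the sample size, and backs the implied spread out of the illustrative +/- 0.3 hrs/w figure above; none of the numbers are real data.

```python
import math

def margin_of_error(sigma: float, n: int, z: float = 1.96) -> float:
    """Approximate half-width of a 95% confidence interval around a mean."""
    return z * sigma / math.sqrt(n)

# Back out the implied standard deviation from the example above:
# +/- 0.3 hrs/w at n = 10,000 implies sigma ~= 0.3 * sqrt(10,000) / 1.96 ~= 15.3
sigma = 0.3 * math.sqrt(10_000) / 1.96

# The first handful of observations slash the margin of error...
for n in (2, 5, 25, 100, 10_000):
    print(f"n = {n:>6}: +/- {margin_of_error(sigma, n):.1f} hrs/w")

# ...but going from +/- 0.3 to +/- 0.2 hrs/w takes thousands of new points.
target = 0.2
n_needed = math.ceil((1.96 * sigma / target) ** 2)
print(f"Samples needed for +/- {target} hrs/w: {n_needed} ({n_needed - 10_000} new)")
```

Going from 2 to 25 observations cuts the margin from roughly +/- 21 to +/- 6 hrs/w; going from 10,000 to 22,500 observations buys you a mere 0.1.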
Rule of Five
A ridiculously effective technique put forward by the author is the Rule of Five:
There is a 93.75% chance that the median of a population is between the smallest and largest values in any random sample of five from that population.
If you are like me, when you first face a statement like this, you are of two minds. On one hand, it sounds too good to be true. OTOH, there are a bunch of rules like this. Plus, it’s about the median and not the mean, and the median is less sensitive to the extreme values of a distribution.
The rationale is as follows: in a sample of five, for the population median to fall outside the range spanned by the sample, all five values need to land on the same side of it (all above or all below). Given that a single observation has a 50% (0.5) chance of landing on either side, the probability of such a scenario is 0.5^5 + 0.5^5 = 0.0625, which is 6.25%. Meaning, roughly 94% of the time, the median will be contained within the range of the five samples you just collected. Neat, huh?
For further small-sample analysis, the book provides a table extending the idea to larger samples. For example, if you collect 11 samples instead of five, the population median is contained between the 3rd smallest and 3rd largest values (again with >90% chance).
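If you (or a sceptical exec) want to see it with your own eyes, the claim is easy to check with a quick simulation. A minimal sketch, using an arbitrary made-up skewed population purely for illustration:

```python
import random
import statistics

random.seed(42)

# Any population works; the rule does not depend on the distribution.
# Here: a made-up, skewed population of "hours per week" values.
population = [random.lognormvariate(2, 0.6) for _ in range(100_000)]
true_median = statistics.median(population)

trials = 100_000
hits_5 = hits_11 = 0
for _ in range(trials):
    sample_5 = random.sample(population, 5)
    if min(sample_5) <= true_median <= max(sample_5):
        hits_5 += 1

    sample_11 = sorted(random.sample(population, 11))
    if sample_11[2] <= true_median <= sample_11[8]:
        hits_11 += 1

print(f"Median within min/max of 5 samples:       {hits_5 / trials:.4f}")   # ~0.9375
print(f"Median within 3rd smallest/largest of 11: {hits_11 / trials:.4f}")  # ~0.935
```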
Decomposition
Yet another way to counter the can’t be measured crowd is to master decomposition. The idea is to break down the ‘intangible’ metric into various subcomponents that are either readily available or easier to conceptualise.
Many measurements start by decomposing an uncertain variable into constituent parts to identify directly observable things that are easier to measure.
This class of estimation problems is named after the renowned physicist Enrico Fermi, aka the Pope of Physics, who was known for his quick back-of-the-envelope calculations. This type of thinking is also common in consultancies, where consultants are routinely asked to estimate quantities they know next to nothing about. But I digress…
The decomposition example in the book is Fermi’s famous classroom exercise: asking his students to estimate the number of piano tuners in Chicago, and nudging them to estimate related quantities about pianos and tuners instead:
Population of Chicago (around 3M circa 1930-50s)
Average number of people in a household (2-3)
Percentage of households with tuned pianos (3-10%)
Frequency of tuning (once a year?)
How many pianos a tuner can tune in a day (4-5?)
How many days a year a tuner works (250?)
Depending on the values you pick from the ranges in parentheses, you obtain an estimate for the original question.
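As a rough sketch of the arithmetic (using the illustrative ranges in the list above, not real figures), a low and a high scenario already pin the answer down to within an order of magnitude:

```python
# Fermi decomposition: piano tuners in Chicago (illustrative ranges, not real data).
def piano_tuners(population, people_per_household, share_with_pianos,
                 tunings_per_piano_per_year, tunings_per_day, work_days_per_year):
    households = population / people_per_household
    tuned_pianos = households * share_with_pianos
    tunings_needed = tuned_pianos * tunings_per_piano_per_year
    tunings_per_tuner = tunings_per_day * work_days_per_year
    return tunings_needed / tunings_per_tuner

low = piano_tuners(3_000_000, 3, 0.03, 1, 5, 250)   # pessimistic ends of the ranges
high = piano_tuners(3_000_000, 2, 0.10, 1, 4, 250)  # optimistic ends of the ranges
print(f"Estimated piano tuners in Chicago: {low:.0f} to {high:.0f}")  # ~24 to ~150
```

Even with deliberately wide ranges, the decomposition bounds the answer far more tightly than the initial blank stare of “we have no idea”.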
The idea behind decomposition is that by disaggregating one big measurement into many smaller measurements, you obtain a host of benefits.
First, the smaller measurements are more likely to be feasible.
Second, by decomposing, you are more likely to identify the major sources of uncertainty. Then you can focus your measurement effort (and budget) on them to get the biggest reduction in uncertainty.
Third, Hubbard mentions a curious side effect of decomposition: merely going through the exercise can sometimes remove the need to collect more data at all:
Decomposition effect: The phenomenon that the decomposition itself often turns out to provide such a sufficient reduction in uncertainty that further observations are not required.
Putting It All Together
The ability to measure anything that might be thrown at you is an invaluable skill. Use the following four-step framework to lower your anxiety (and that of others) and get measurin’:
Does it leave a trail of any kind?
Follow the trail and analyse the data you have. If it’s observable, it’s countable. Be creative and expand your imagination—what else happens alongside it? Decompose where applicable.
If the trail doesn’t already exist, can you observe it directly or at least a sample of it?
Utilise direct observation: look around, count, sample. Use the Rule of Five (or more) to go from ‘we know nothing’ to ‘the median is likely between the min and max of the small sample we just collected’.
If it doesn’t appear to leave behind a detectable trail of any kind, and direct observation does not seem feasible without some additional aid, can you devise a way to begin to track it now?
If there is no trail, create one. Add a ‘tracer’ to it so that from now on, it will leave a trail. This counters the narrative of we don’t capture that kind of data. Start capturing it now.
If tracking the existing conditions does not suffice (with either existing or newly collected data), can the phenomenon be “forced” to occur under conditions that allow easier observation (i.e. an experiment)?
If it is not possible to add tracing, create conditions to observe it. Optimise for simplicity over complex designs.