Ever been asked to measure something that defies easy quantification–the bottom-line impact of development time, say, or the effect on morale of a new vacation policy–and responded to ambiguity by walking away? In the definite language of the tech sector, qualitative or less-than-totally-accurate measurements are often disregarded anyway, if they’re collected at all.
But add in Peter Drucker’s aphorism that “what gets measured gets managed,” and a problem emerges: that we tend to reward, and hence do, things that are easy to measure. An old Persian folk tale shines light, so to speak, on the situation:
Nasreddin had lost his ring. His wife found him looking for it in the yard and exclaimed, “You lost your ring in the living room! Why are you looking for it out here?”
Nasreddin stroked his beard and said: “The room is too dark. I came out to the courtyard because there is much more light out here.”
Much as we yearn for easy, certain measurements, the same complex, tangled, fuzzy problems that are most prone to uncertainty are also home to the most interesting insights–and we’re well-served to invest in understanding them.
What we need are two things:
- a definition of “measurement” that allows for uncertainty
- confidence in estimating, collecting, and applying uncertain measurements
Let’s take a look.
Measurement is an observation that reduces uncertainty.
This definition of measurement, offered by the management consultant Douglas Hubbard in his book “How to Measure Anything”, absolves the sin of imprecision and lets uncertainty in. Not “eliminates.” Reduces.
It’s an important distinction. Assuming that software should return at least 10x on the time invested in writing it, a measurement within an order of magnitude of the actual value is enough to make a go/no-go decision. Even with 10% certainty and huge error bars on either side, we can still make plans around the low forecast, the high forecast, and a best-guess down the middle. Ten percent is infinitely more than zero.
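As a rough illustration of planning around a low, best-guess, and high forecast, here is a minimal sketch in Python. The function names, the hours figures, and the "measure more" outcome are all assumptions made for the example; only the 10x threshold comes from the text.

```python
def return_multiples(hours_invested, hours_saved_forecast):
    """Convert each forecast of hours saved into a multiple of the time invested."""
    return {
        name: saved / hours_invested
        for name, saved in hours_saved_forecast.items()
    }


def decide(multiples, required_return=10.0):
    """Go/no-go using only the ends of the range (required_return is the 10x rule)."""
    if multiples["low"] >= required_return:
        return "go"            # even the pessimistic case clears the bar
    if multiples["high"] < required_return:
        return "no-go"         # even the optimistic case falls short
    return "measure more"      # the threshold sits inside the error bars


# Hypothetical forecast: 40 hours of development, wide error bars on the payoff.
forecast = {"low": 500, "best": 2000, "high": 8000}
multiples = return_multiples(40, forecast)
print(multiples, decide(multiples))
```

The decision only needs more measurement when the threshold falls between the low and high forecasts; otherwise the wide error bars do not change the answer.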
And we may not even need to measure to get there.
How many piano tuners are there in the city of Chicago?
The goal of this question, a classic estimation exercise attributed to the physicist Enrico Fermi, was not a precise answer but to give students practice working towards estimates within the realm of possibility. An estimate accurate to within an order of magnitude (OOM, unfortunately) of its measured value can still inform better decisions about system performance, project timelines, or cost, and many estimates will be closer than that.
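The piano-tuner question can be worked as a chain of rough guesses. The sketch below is one way to do it in Python; every input is an illustrative assumption rather than a measured value, and different but equally defensible guesses would give a different answer.

```python
def estimate_piano_tuners(
    population=2_500_000,           # assumed: people in the city, rough guess
    people_per_household=2.5,       # assumed
    share_with_piano=1 / 20,        # assumed: one household in twenty owns a piano
    tunings_per_piano_per_year=1,   # assumed
    tunings_per_tuner_per_day=4,    # assumed
    working_days_per_year=250,      # assumed
):
    """Chain rough assumptions into an order-of-magnitude estimate."""
    households = population / people_per_household
    pianos = households * share_with_piano
    tunings_needed = pianos * tunings_per_piano_per_year
    tunings_per_tuner = tunings_per_tuner_per_day * working_days_per_year
    return tunings_needed / tunings_per_tuner


print(round(estimate_piano_tuners()))
```

With these particular guesses the result lands around 50 tuners. The number itself matters less than the range it implies: tens of tuners, not thousands and not two.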
To increase confidence, it can be helpful to produce multiple estimates starting from different base assumptions. I once oversaw a complex enterprise implementation that the project team anticipated completing within four developer-months, based on the team’s previous experience and the relative (dollar-value) size of the client. A second estimate based on the complexity of projects within the implementation plan suggested a number closer to 60 developer-months–and gave us the warning we needed to adjust staffing and successfully deliver on time.
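One hedged way to make that comparison routine is to check how many orders of magnitude separate two independent estimates. In the sketch below, the function names and the one-order-of-magnitude tolerance are assumptions chosen for illustration; the 4 and 60 developer-month figures are the ones from the example above.

```python
import math


def orders_of_magnitude_apart(estimate_a, estimate_b):
    """Distance between two positive estimates on a log10 scale."""
    return abs(math.log10(estimate_a / estimate_b))


def estimates_agree(estimate_a, estimate_b, tolerance_oom=1.0):
    """True if the estimates fall within tolerance_oom orders of magnitude."""
    return orders_of_magnitude_apart(estimate_a, estimate_b) <= tolerance_oom


# Developer-months from the two estimation approaches described above.
experience_based = 4
complexity_based = 60

if not estimates_agree(experience_based, complexity_based):
    print("Estimates disagree by more than an order of magnitude; dig deeper.")
```

A ratio of 15 between the two estimates is a little more than one order of magnitude, so this check would have flagged the gap.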
Getting in the habit of estimation serves two purposes. First, an estimate provides a falsifiable hypothesis for a measurement to test. Second, comparing the two is a check on both: strong alignment between the expected and observed values provides at least some basis for trusting them, while wild deviations suggest a need for more measurement.
Our aversion towards qualitative, subjective, or imprecise measurements can also discourage us from using the sorts of instruments that produce them. For instance:
- ad-hoc employee surveys
- hallway usability tests
- interview Q&A
- working group notes
These data sources all share a certain “fuzziness” in that they’re drawn from subjective experience. Encoded in natural language, their data lack the inherent quantitative authority of a latency measure or the DORA metrics, say; still, they contain valuable signals on the state of the team, product, and business as a whole.
Don’t discard them out of hand.
When we do employ fuzzy sources, however, we tend to go out of our way to return them to the quantitative realm. Statistical analyses discard portions of the input in return for a veneer of quantitative legitimacy; after all, numbers are easier to compare and visualize than the range of human language.
If projecting qualitative data onto a number line inspires more confidence in the insights embedded within them, so be it. But fuzzy sources carry information beyond the numbers themselves–and the numbers derived from them deserve less confidence than our trust in numbers would have us believe.
The world has a funny way of defying assertions about it. Drawing on a familiar example, the Pythagorean ideal of a 3-4-5 triangle fails if drawn with a pencil more than 0 mm thick–reality is somewhat more complicated than a high school geometry exam.
How, then, can we trust any measurement at all?
Embedded in the definition of measurement (an observation that reduces uncertainty) is a clue. Even measurements obtained from automated systems are burdened by clock synchronization and the inevitable risk of failure. They may be more certain than the data obtained from a Developer Experience Survey, but never to the point of absolutes.
However a measurement was obtained, it’s safe to assume some degree of imprecision built in. Acknowledging this reality, digging into methods and assumptions, and assigning appropriate error bars are all important parts of applying a measurement safely.
Do that, though, and an “uncertain” measurement is the same as anything else.
As the German management consultant Helmut von Moltke (the elder) famously put it, “no measurement survives contact with reality.” Pretending otherwise, or shying away from a (doomed) measurement of any sufficiently complicated quantity, does a disservice to whatever decision the measurement was meant to inform.
Uncertainty is certain. All that’s left is to reduce it as much as we can.