Imagine that two doctors in the same city give different diagnoses to identical patients, or that two judges in the same court give different sentences to people who have committed the same crime. Suppose that different food inspectors give different ratings to indistinguishable restaurants, or that when a company is handling customer complaints, the resolution depends on who happens to be handling the particular complaint. Now imagine that the same doctor, the same judge, the same inspector, or the same company official makes different decisions depending on whether it is morning or afternoon, or Monday rather than Wednesday. These are examples of noise: variability in judgments that should be identical.
In Noise: A Flaw in Human Judgment, Nobel Prize winner Daniel Kahneman, together with co-authors Olivier Sibony and Cass R. Sunstein, shows how noise helps produce errors in many fields, including medicine, law, public health, economic forecasting, food safety, forensic science, bail verdicts, child protection, strategy, performance reviews and personnel selection. And although noise can be found wherever people make judgments and decisions, individuals and organizations alike commonly ignore its role in their judgments and in their actions. They show “noise neglect.” With a few simple remedies, people can reduce both noise and bias, and so make far better decisions.
In these interviews, I speak to Daniel Kahneman (winner of the 2002 Nobel Prize in Economic Sciences and the Presidential Medal of Freedom in 2013) and Cass R. Sunstein (Robert Walmsley University Professor at Harvard, where he is founder and director of the Program on Behavioral Economics and Public Policy). We talk about how noise impacts our decision making, how judgements are made, and why we need to think about making decisions much as we think about washing our hands.
Q: What is noise, and how does it relate to our decision making?
[Daniel Kahneman]: The easiest way to understand noise is by thinking about measurements. Suppose you are measuring a line with a very fine ruler. You will expect some variability, such that you will not get the same number every time you measure. That variability is noise. The average of the errors is the bias of your measurement. The variability of the errors is noise – and that’s important. If your bias is zero, if you are completely unbiased, but sometimes you overestimate or underestimate the length of the line, that is still error.
In the mathematics of accuracy, the expression for total error is very simple and quite compelling. It is bias-squared plus noise-squared. Bias and noise are both contributing to error and in that equation, they do so on the same basis. That is the equation by which we think about accuracy in science and statistics.
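The decomposition described above can be checked numerically. What follows is a minimal sketch, not anything taken from the interview or the book: the function name `error_decomposition`, the simulated ruler, and the chosen bias (0.5) and noise (2.0) values are all assumptions made for illustration. It treats bias as the mean of the errors and noise as their standard deviation, and compares the mean squared error with bias-squared plus noise-squared.

```python
import random
import statistics


def error_decomposition(measurements, true_value):
    """Return (bias, noise, mse) for repeated measurements of one quantity.

    bias  = mean of the errors
    noise = standard deviation of the errors (population form)
    mse   = mean of the squared errors
    """
    errors = [m - true_value for m in measurements]
    bias = statistics.fmean(errors)
    noise = statistics.pstdev(errors)
    mse = statistics.fmean([e * e for e in errors])
    return bias, noise, mse


# Hypothetical ruler: reads 0.5 units too long on average (bias) and
# scatters around that with a standard deviation of 2.0 units (noise).
rng = random.Random(0)
TRUE_LENGTH = 100.0
measurements = [TRUE_LENGTH + 0.5 + rng.gauss(0.0, 2.0) for _ in range(10_000)]

bias, noise, mse = error_decomposition(measurements, TRUE_LENGTH)
print(f"bias^2 + noise^2 = {bias**2 + noise**2:.4f}")
print(f"mean squared error = {mse:.4f}")
```

With the population standard deviation, the two printed values agree up to floating-point rounding, which is the sense in which bias and noise contribute to error "on the same basis".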
When I measure a line, and measure it many times, no two measurements are going to be precisely identical – but when I average the measurements, the level of noise is driven towards zero. When I have enough independent measurements of the same line, there is no noise left, only bias. Furthermore, noise has to do with particular situations where the goal is to maximise accuracy. If you’re trying to be accurate, then noise is a problem – equivalent to bias.
When we learn and teach elementary statistics, we learn about reliability, and we know that reliability is a problem. Anything that is a matter of judgement, by definition, will not result in perfect agreement. Some noise is expected. I remember a particular consulting engagement in which I was exploring how much variability exists when people judge the same thing – underwriters judging risk. I found a lot more variability than expected, more than five times the amount the company executives expected. When I became conscious of the equivalence of noise and bias in the basic mean squared error equation, it hit me that this was a big issue and worth thinking about.
[Cass Sunstein]: Let’s suppose you wake up in the morning and step on your weighing scales. They may show you to be a little heavier than you really are. A few days later, they may show you as being a little lighter. That’s a noisy scale – it exhibits a degree of variability in its judgement of something which shouldn’t have that variability.
If your scale is showing you weigh a little more than you actually do half the time, and a little less than you do half the time, that’s not good. That’s a bad thing. If there’s someone in your company who is unrealistically optimistic about how things are going in the morning, and unreasonably pessimistic about how things are going in the afternoon, that’s not good. Noise is unwanted variability. If you have differences in judgement with respect to whether Taylor Swift is better than another artist, or how good Stephen King is, that’s not noise, that’s just a difference of opinion.
You can think of the human mind as a measuring instrument. We’re making judgements all the time, and studies show that from one day to the next, when presented with the same evidence, our judgements may be different. The same is true when we observe the judgements of groups of experts (system noise). If you see the mind as a measuring instrument, you start to see it as a scale, a bafflingly variable and noisy scale.
Q: Is there any evolutionary basis to our brain’s understanding of noise?
[Daniel Kahneman]: Variability is certainly a part of evolution; in fact, it is the engine of evolution. Variability brings evolution where there is feedback and selection.
We talk about noise when you have different people looking at the same object and they’re expected to reach the same conclusion. Think of underwriters looking at risk, or judges looking at a case. In those instances, it’s really difficult to think of an evolutionary purpose for noise.
Noise is everywhere; it’s a biological phenomenon. We’re not perfectly stable – your signature changes from occasion to occasion as your hand trembles. We have evidence that the efficiency of the brain fluctuates from moment to moment, quite substantially, and those fluctuations are a source of noise. The major source of noise, however, is that people look at the world and see it very differently from each other. Each person is trapped in their own world and assumes that the rest of humanity sees the world the same way that they do. That assumption is incorrect. You see the world differently from me, and that’s true for every situation in which we find ourselves. It’s a side-effect of how we’re built!
Q: Why does noise matter?
[Cass Sunstein]: It’s important to make a distinction between bias and noise. If your scale always shows you as a little heavier than you are, that’s bias. If you have a firm where the executives are unrealistically optimistic all the time, that’s bias. If you have someone who grades students more severely than they ought to, that’s bias. Bias is a systematic tendency to deviate from where we should be. Noise is unwanted variability in measurements that ought to be the same – for example, in a hospital it may be disagreement over whether or not a patient has cancer… in the justice system, it may be the fact that equivalent defendants are treated radically differently based on which judge is in the chair.
If we look at medicine specifically, there’s a lot of noise there. We found significantly more noise than expected and had to cross-check our findings with doctors and academics, who confirmed that they were right. The basic finding is that with respect to diagnoses – whether of heart disease, cancer, depression, anxiety or endometriosis – there was a lot of variability, a lot of noise. These errors add up. If you have a doctor who is very ready to find people as having heart disease, and thus overdiagnoses and overtreats them, that’s a mistake that risks safety and wastes resources. Similarly, if you have a doctor down the hall who underdiagnoses, that endangers lives. In the criminal justice system, we also saw variability between judges in the severity of their sentencing for equivalent crimes. Judges may issue the right sentence by the agreed-upon guidelines, but the increased severity or leniency creates variability that adds up. Even in business, we saw the same. You may have a company where someone in hiring says, ‘bring everyone in, I think they’re great’ and you end up with a company full of people who shouldn’t be there – the opposite, ‘nobody is good enough, bring nobody in…’ denies employment opportunities to people who really should be there. We find that in basically every domain we study, wherever there’s judgement there’s noise, and more than you think.
Q: Does noise influence algorithms?
[Daniel Kahneman]: Here you see the major difference between noise and bias. An algorithm, as any other rule, is noise free. That’s a characteristic of algorithms and rules – when you present the same problem twice, you will get the same answer.
It’s a characteristic of noisy judgement, however, that when you present the same problem to two equivalent people, you will get different answers.
[Cass Sunstein]: Algorithms can be biased, but they won’t be noisy. They’re designed to spit out the same answer every time. That’s a big advantage – they lead to significant noise reduction. However, they may or may not be good with respect to bias. If you had an algorithm making judgements on whether people could vote or drink, it would be a nightmare world.
Whether you get to vote, at least in systems that are working well, is not a result of the ad-hoc judgement of the voting official, but the product of rules that are reasonable and simple. The basic idea is that if you are of a certain age, and meet the basic criteria (for example, being a citizen, and perhaps not being a felon), you get a vote. The same applies, in some nations, to whether you receive disability payments. It’s not based on the judgement of an official who is deciding ‘how disabled’ you are; it’s based on rules about what capacities remain and what opportunities are available in your geography. I use examples like voting and disability because they’re pretty close to algorithms.
An algorithm could be used therefore to decide whether someone has heart disease. If it’s based on the right inputs, it won’t be noisy (by definition) and may thus avoid biases that infect human judgement.
We have a lot of phobias around algorithms. Sometimes this is justified, but in the main, it’s like being afraid of cockroaches or spiders. Algorithms aren’t spiders or cockroaches; they’re an instrument, and sometimes they will outperform human judgement terrifically well – and sometimes they won’t. When they don’t, instead of saying ‘oh my god, a cockroach!’, we ought to examine why those algorithms aren’t working, and whether they can be improved or must be abandoned. We may decide the algorithm should be an advisor rather than a judge… we may open a whole new area of analysis and research… we may decide we need better inputs such that the outputs are not biased around gender, race or geography… If lives are on the line and it turns out an algorithm reduces both the noise and the bias of the human decision maker, then the moral case for using the algorithm starts to look really strong.
Q: How should we see noise as it relates to our own ignorance, or bias?
[Cass Sunstein]: In many domains, there are biases. Over the last 30 years, bias has received a great deal of attention. They may be cognitive biases such as unrealistic optimism, or biases like discrimination on the basis of gender or skin colour. Then there’s noise, unwanted variability. You could have a firm where half the time people discriminate against women, and half the time people discriminate against men. On average you may get the right distribution, but you get a lot of mistakes and unfairness on both sides – that’s noise.
Objective ignorance in this sense relates to the fact that predictions by nature can go wrong because the world turns up stuff that could not be anticipated. If someone says, with respect to my 12-year-old son, that at the age of 25 he’s going to be doing X or Y, good luck with that! The factors that play a role in what he will be doing at 25 are numerous and hard to anticipate in advance, and the prediction is going to be very, very unreliable. That’s not necessarily because of bias. It could be infected by bias – but it’s not the same. Understanding this is freeing and humbling; it suggests that life has serendipity – and this is a joyful thing.
If we want to get clarity on how things will unfold over 5 years or 10 years, in many domains it will be impossible. The valley of the normal suggests that when things happen, they’re often things we could not have predicted, yet which are not surprising. A wide range of things that happen in our own lives are unexpected, yet unsurprising. That reveals something about our capacity for hindsight – to generate a narrative or story which makes sense, even if, before the fact, we would have seen that outcome as surprising.
Q: How can we reduce noise, or improve our decision making?
[Daniel Kahneman]: Decision hygiene is a deliberately off-putting term. It’s like washing your hands. It’s a set of procedures that you apply – you don’t know what germs you are killing, and if you wash your hands successfully, you never will.
There are procedures that are designed, and almost guaranteed, to reduce noise, and several of them will reduce bias. For example, averaging independent judgements of the same problem is guaranteed to drive down noise – we even know mathematically how much it will reduce it (whilst leaving bias intact). Other procedures will reduce both noise and bias.
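The mathematical claim here can be illustrated with a small simulation. This is a sketch under stated assumptions, not a procedure from the interview: the function name `simulate_averaged_judgements`, the Gaussian noise model and all parameter values are hypothetical. It relies on the standard statistical result that the average of n independent judgements with the same noise level has that noise divided by the square root of n, while any bias shared by all the judges is unchanged.

```python
import math
import random
import statistics


def simulate_averaged_judgements(n_judges, bias, noise_sd, true_value, trials, rng):
    """Average n_judges independent judgements per trial.

    Each judgement is true_value + bias + Gaussian noise (an assumed model).
    Returns (bias, noise) of the averaged judgement across all trials.
    """
    errors = []
    for _ in range(trials):
        judgements = [
            true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n_judges)
        ]
        errors.append(statistics.fmean(judgements) - true_value)
    return statistics.fmean(errors), statistics.pstdev(errors)


rng = random.Random(42)
for n in (1, 4, 16):
    avg_bias, avg_noise = simulate_averaged_judgements(
        n_judges=n, bias=5.0, noise_sd=10.0, true_value=100.0, trials=5000, rng=rng
    )
    predicted = 10.0 / math.sqrt(n)  # noise_sd / sqrt(n)
    print(f"n={n:2d}  bias={avg_bias:5.2f}  noise={avg_noise:5.2f}  predicted noise={predicted:5.2f}")
```

In this model, quadrupling the number of independent judgements halves the noise, while the bias column stays close to the assumed value of 5.0 however many judgements are averaged. The reduction depends on the judgements being independent; judges who influence one another would not deliver it.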
We’re really optimistic about decision hygiene as a concept!
We know, for example, that structured interview procedures lead to better hiring and that structuring is actually a major element of decision hygiene. In decision making more broadly, the same principle applies. You can think of each decision as a ‘candidate’, and thus a structured process is needed to improve accuracy and quality way beyond unstructured intuition.
[Cass Sunstein]: Decision hygiene is something which took us some time to come up with. If you have a bias, you can find something like a medication. If you have a sore throat, we know what medication to give you. Hygiene protects you against an assortment of germs – you will likely not know what you are protecting against, only that you have provided protection. In the Covid-19 era, handwashing has become a particular example. Handwashing is good even without Covid-19; it protects you against a lot of pathogens.
We have created a set of decision hygiene strategies which we hope are intuitive enough to gain widespread use. My co-author Danny Kahneman says, ‘if you want good advice, find a friend who likes you, but doesn’t care about your feelings…’ – that’s brilliant. It’s connected with decision hygiene because the friend has to care about your wellbeing to like you, but if they care too much about your feelings, they may give you an answer you will like rather than one you need to hear. The decision hygiene idea is to get a range of judgements, not just one, such that the majority or average reduces the noise and bias. Doctors know this. It’s not unusual to get a second, third or fourth opinion within a hospital – it’s essential to avoid unwanted variability and errors as well as biases. So one idea is to get a set of independent judgements and average them or take the majority. That helps.
Another idea is to use guidelines. In many countries, there’s something called the Apgar score. When a child is born, the score measures the child’s health along five dimensions. That guideline is very simple, and it reduces unwanted variability and bias when it comes to judging the health of newborns. When judging whether or not someone has a disease, or whether or not someone should get a visa or gain asylum, we need guidelines to discipline judgements.
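To show how a guideline of this kind constrains judgement, here is a minimal sketch of an Apgar-style total. The five criteria (appearance, pulse, grimace, activity, respiration), each rated 0, 1 or 2 for a total out of 10, follow the standard Apgar scheme; the function name `apgar_total` and the dictionary layout are assumptions made for illustration, and this is not a clinical tool.

```python
APGAR_CRITERIA = ("appearance", "pulse", "grimace", "activity", "respiration")


def apgar_total(ratings):
    """Sum five 0-2 ratings into a 0-10 total, rejecting malformed input.

    `ratings` maps each criterion name to 0, 1 or 2. The rater still judges
    each dimension, but the guideline fixes which dimensions count and how
    they are combined, leaving no room to vary the weighting.
    """
    if set(ratings) != set(APGAR_CRITERIA):
        raise ValueError(f"expected exactly these criteria: {APGAR_CRITERIA}")
    for name in APGAR_CRITERIA:
        if ratings[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be rated 0, 1 or 2")
    return sum(ratings[name] for name in APGAR_CRITERIA)


example = {"appearance": 1, "pulse": 2, "grimace": 2, "activity": 1, "respiration": 2}
print(apgar_total(example))
```

Two raters who agree on the five component ratings necessarily arrive at the same total, which is how a simple guideline removes one source of unwanted variability.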
We sometimes also need to delay our intuition and insert categories that mediate judgements. These categories need quantifiable scores and can aid decision making – particularly in areas like hiring. This can cut errors, noise and bias.
Q: What is the difference between thinking and judgement?
[Daniel Kahneman]: Not all thinking involves judgement. Any thinking that is rule-governed, any computation, does not require judgement. The essence of judgement is that there is a possibility for reasonable disagreement. When we say a topic is a matter of judgement, we are assuming that we are combining or integrating complex information into a single judgement around a decision, and we are therefore presuming the possibility of reasonable disagreement. It turns out that this integration of information is where people are very different from each other. They are different in the weights that they give to different items of information, and that creates the noise we observe.