Thinking Fast and Slow — Summary

Abdullah Mobeen
Aug 1, 2019

In this post, I will be sharing the notes I made while reading Thinking Fast and Slow. I believe there is a lot we can learn about how humans make decisions by understanding the different biases and heuristics highlighted by Daniel Kahneman.

I have organized this summary into 4 broad categories, as it is done in the book — Two Systems, Judgment Under Uncertainty, Choices, and Two Selves.

1) TWO SYSTEMS

· System 1: Fast, automatic, frequent, emotional, stereotypic, unconscious. Examples (in order of complexity) of things system 1 can do:

  1. determine that an object is at a greater distance than another
  2. localize the source of a specific sound
  3. complete the phrase “war and …”
  4. display disgust when seeing a gruesome image
  5. solve 2+2=?
  6. read text on a billboard
  7. drive a car on an empty road
  8. come up with a good chess move (if you’re a chess master)
  9. understand simple sentences
  10. connect the description ‘quiet and structured person with an eye for details’ to a specific job

· System 2: Slow, effortful, infrequent, logical, calculating, conscious. Examples of things system 2 can do:

  1. brace yourself before the start of a sprint
  2. direct your attention towards the clowns at the circus
  3. direct your attention towards someone at a loud party
  4. look out for the woman with the grey hair
  5. dig into your memory to recognize a sound
  6. sustain a higher than normal walking rate
  7. determine the appropriateness of a particular behavior in a social setting
  8. count the number of A’s in a certain text
  9. give someone your phone number
  10. park into a tight parking space
  11. determine the price/quality ratio of two washing machines
  12. determine the validity of a complex logical argument
  13. solve 17 × 24

Kahneman covers several experiments which purport to highlight the differences between these two thought systems and how they arrive at different results even given the same inputs. Terms and concepts include coherence, attention, laziness, association, jumping to conclusions, WYSIATI (What you see is all there is), and how one forms judgments. The System 1 vs. System 2 debate dives into the reasoning or lack thereof for human decision making, with big implications for many areas including law and market research.

2) JUDGMENT UNDER UNCERTAINTY: HEURISTICS AND BIASES (Properties and Biases of the TWO SYSTEMS):

· Main Question: How do people assess the probability of an uncertain event or the value of an uncertain quantity?

· Answer: People rely on a limited number of heuristic principles which reduce the complex task of assessing probabilities and predicting values to simpler judgmental operations. These heuristics are quite useful but sometimes they lead to severe errors!

· 3 Heuristics Employed to Assess Probabilities:

i. REPRESENTATIVENESS: Used for probabilistic questions of the type “What is the probability that object A belongs to class B?” or “What is the probability that process B will generate event A?” In such cases, humans evaluate probabilities by the degree to which A is representative of B: if A is highly representative of B, we judge the probability to be high, and vice versa. For example, suppose Steve is described as very shy, invariably helpful, with little interest in people, and with a need for order. Do you think Steve is a librarian or a farmer? Many of us judge the probability by how closely Steve’s description matches the stereotype of a librarian or a farmer and conclude that he is a librarian, ignoring that there are many more farmers in the US than librarians, so Steve is probabilistically more likely to be a farmer. This kind of estimation leads to serious errors because similarity (representativeness) is not influenced by several factors that should affect judgments of probability.

a) Insensitivity to Prior Probability of Outcomes:

i. Prior probability (base rate) does not affect similarity but should have a major effect on probability.

ii. People neglect prior probabilities when they assess probability by the degree of representativeness.

iii. This is a violation of Bayes’ Rule.

iv. In Steve’s case, the base rates would be the percentage of the US population who are librarians and the percentage who are farmers.

v. Always verify the quality of the evidence given — if you doubt the evidence, stay close to the prior probability or prior belief.
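The base-rate logic above can be sketched with Bayes’ rule. The numbers below are hypothetical: assume there are 20 male farmers for every male librarian, and that Steve’s description is four times as likely to fit a librarian as a farmer.

```python
def posterior_librarian(prior_librarian, prior_farmer,
                        likelihood_librarian, likelihood_farmer):
    """P(librarian | description) via Bayes' rule."""
    evidence = (prior_librarian * likelihood_librarian
                + prior_farmer * likelihood_farmer)
    return prior_librarian * likelihood_librarian / evidence

# Representativeness alone (a 4:1 likelihood ratio in favor of "librarian")
# loses to the 20:1 base rate in favor of "farmer".
p = posterior_librarian(prior_librarian=1 / 21, prior_farmer=20 / 21,
                        likelihood_librarian=0.8, likelihood_farmer=0.2)
print(round(p, 3))  # 0.167 -- Steve is still far more likely to be a farmer
```

Ignoring the base rates (treating the priors as equal) would give 0.8, the intuitive but wrong answer.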

b) Insensitivity to Sample Size:

i. Always remember the law of large numbers (results get more reliable as the size of data increases).

ii. You are more likely to get “surprising” results when the sample size is small, since there is a higher probability of running into an anomaly.
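A minimal simulation of this point, assuming a fair coin: small samples produce “surprising” proportions far more often than large ones.

```python
import random

random.seed(0)

def extreme_share(sample_size, trials=2_000, threshold=0.6):
    """Fraction of simulated samples whose heads-rate is at or beyond the
    threshold in either direction (e.g. >= 60% or <= 40% heads)."""
    extreme = 0
    for _ in range(trials):
        heads = sum(random.random() < 0.5 for _ in range(sample_size))
        rate = heads / sample_size
        if rate >= threshold or rate <= 1 - threshold:
            extreme += 1
    return extreme / trials

# A deviation that is routine in small samples is nearly impossible in large ones.
print(extreme_share(15))     # well over half of the small samples look "surprising"
print(extreme_share(1_000))  # essentially zero
```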

c) The misconception of Chance:

i. People expect that a sequence of events generated by a random process will represent the essential characteristics of that process even when the sequence is short. For example, most people believe the sequence HTHTTH is more likely than HHHTTT or HHHHTH, even though all three are equally probable. Think of the Gambler’s Fallacy.

ii. Therefore people expect the essential characteristics of the process will be represented not only globally but also locally, but that’s not always the case.
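A quick way to see the fallacy: every specific sequence of six fair-coin flips has exactly the same probability, however “random” or “streaky” it looks.

```python
def sequence_probability(seq, p_heads=0.5):
    """Probability of one specific sequence of independent coin flips."""
    prob = 1.0
    for flip in seq:
        prob *= p_heads if flip == "H" else 1 - p_heads
    return prob

print(sequence_probability("HTHTTH"))  # 0.015625
print(sequence_probability("HHHHHH"))  # 0.015625 -- exactly the same
```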

d) Insensitivity to Predictability:

i. If people predict solely in terms of the favourableness of the description, their prediction will be insensitive to the reliability of the evidence and the expected accuracy of the prediction.

ii. In normative statistical theory, when predictability is nil, the same prediction should be made for all data points/cases. If predictability is perfect, the values predicted will match the actual values.

iii. In general, the higher the predictability, the wider the range of predicted values.

iv. So always verify the predictability, i.e. that the evidence you have actually helps you predict the outcome.

e) The Illusion of Validity:

i. We estimate probabilities with little or no regard for the factors that limit predictive accuracy. For example, people express great confidence in the prediction that Steve is a librarian even when the description is unreliable or outdated.

ii. People are more confident in their predictions when the input shows some highly consistent patterns.

iii. Highly consistent patterns are most often observed when the input variables are highly redundant or correlated (multicollinearity).

iv. So always make the input variables independent.

f) Misconceptions of Regression:

i. If 10 students perform exceptionally well on average on the first test, their average performance is likely to go down during the second test.

ii. Regression toward the mean → suppose X and Y have the same distribution (e.g. scores on tests X and Y). If one selects individuals whose average X score deviates from the mean of X by k units, then the average of their Y scores will usually deviate from the mean of Y by less than k units.

iii. People fail to identify regression or do not expect regression in many contexts where it is bound to occur.

iv. When people do identify regression, they often invent spurious causal explanations for it. This is why athletes seem to perform worse right after an exceptional season that lands them on the cover of sports magazines: they are simply fluctuating around their mean. There is no causal relation; it is just regression.
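The athlete and test-score pattern can be reproduced with a small simulation: each score is a stable skill plus independent test-day luck. All the parameters below are illustrative.

```python
import random

random.seed(1)
N = 10_000
skills = [random.gauss(70, 10) for _ in range(N)]   # stable ability
test1 = [s + random.gauss(0, 10) for s in skills]   # ability + luck
test2 = [s + random.gauss(0, 10) for s in skills]   # fresh, independent luck

# Select the top decile on test 1 and compare their averages on both tests.
top = sorted(range(N), key=lambda i: test1[i], reverse=True)[: N // 10]
avg1 = sum(test1[i] for i in top) / len(top)
avg2 = sum(test2[i] for i in top) / len(top)
print(f"top decile: test 1 avg = {avg1:.1f}, test 2 avg = {avg2:.1f}")
# The test 2 average falls back toward the population mean of 70: no causal
# story needed, the luck component simply washes out.
```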

ii. AVAILABILITY: This is when people assess the frequency of an event by the ease with which instances or occurrences can be brought to mind. Availability is useful but is affected by factors that don’t influence the actual probability. Reliance on availability leads to the following biases:

a) Biases due to the Retrievability of Instances:

i. The size of a class (or the probability of an event) is judged by the availability of its instances: a class whose instances are easily retrieved will appear more numerous than a class of equal frequency whose instances are less retrievable.

ii. Think of how people overestimate the probability of a terrorist attack: a single attack is disturbing enough to be retrieved easily from memory.

b) Biases due to the Effectiveness of a Search Set:

i. Different tasks elicit different search sets, e.g. are there more words that begin with ‘r’ or more words that have ‘r’ as the third letter? Because we are more effective at searching for words by their first letter, we assign the former class a higher probability.

c) Biases of Imagination:

i. If one has to assess the frequency of a class whose instances are not stored in the memory, one assesses the frequency by the ease with which the relevant instances can be constructed.

ii. BUT, ease of constructing instances does not always reflect their true probability.

d) Illusory Correlation:

i. The associative connection between two events is strengthened when they frequently co-occur, which leads people to overestimate how often the events actually go together.

iii. ADJUSTMENT and ANCHORING: People often make estimates by starting from an initial value that is adjusted to yield the final answer. Typically, the adjustments are insufficient! Different starting points yield different estimates, which are biased towards the initial value (Anchoring).

a) Insufficient Adjustment:

i. When the subject bases his estimate on the result of some incomplete computation.

b) Biases in the Evaluation of Conjunctive and Disjunctive Events:

i. People tend to overestimate the probability of conjunctive events and underestimate the probability of disjunctive events.

ii. Biases in the evaluation of compound events are particularly significant in planning. The development of a new product has a conjunctive character: for the undertaking to succeed, each of a series of events must occur. Even when each of these events is very likely (say a 0.9 probability, or a 90% chance of success), the overall probability of success can be quite low if the number of events is large. Think of how 0.9 × 0.9 × 0.9 … keeps shrinking.

iii. The evaluation of risk has a disjunctive character: a complex system (a human body, a nuclear reactor) will collapse if any one of its components fails. Even when the likelihood of failure in each component is slight, the probability of an overall failure can be high if many components are involved.

iv. A chain-like structure of events (conjunction) leads to overestimation; a funnel-like structure of events (disjunction) leads to underestimation.
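Both structures reduce to one line of arithmetic each; the stage and component probabilities below are illustrative.

```python
def conjunctive_success(p_each, n):
    """Chain-like: all n stages must succeed, so the overall chance shrinks fast."""
    return p_each ** n

def disjunctive_failure(p_fail_each, n):
    """Funnel-like: the system fails if any one of n components fails."""
    return 1 - (1 - p_fail_each) ** n

print(round(conjunctive_success(0.9, 10), 3))    # 0.349 -- ten "very likely" stages
print(round(disjunctive_failure(0.01, 100), 3))  # 0.634 -- a hundred "reliable" parts
```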

· In short: three heuristics are employed in making judgments under uncertainty — REPRESENTATIVENESS (employed when people are asked to judge the probability that an object or event A belongs to class or process B), AVAILABILITY (employed when people are asked to assess the frequency of a class or the probability of a particular development), and ADJUSTMENT AND ANCHORING (employed in numerical prediction when a relevant value is available).

3) CHOICES, VALUES, AND FRAMES (The rational Econs vs The natural Humans)

· Bernoulli’s Error (Econ): Bernoulli’s expected utility method is flawed because it ignores the fact that utility depends on the history of one’s wealth, not only on present wealth.

· PROSPECT THEORY — Human (a bit mathy):

i. Describes the way people choose between probabilistic alternatives that involve RISK, where the probabilities of the outcomes are known. It states that people make choices based on the potential value of losses and gains rather than the final outcome.

ii. It tries to model real-life choices rather than optimal decisions.

iii. 2 stages in the decision-making process:

a) EDITING: the outcomes of a decision are ordered according to a certain heuristic. People decide which outcomes they consider equivalent, set a reference point, and then consider lesser outcomes as losses and greater ones as gains. This stage resolves framing effects, isolation effects, etc.

b) EVALUATION: people behave as if they compute a value based on the potential outcomes and their respective probabilities, and then choose the alternative with the higher value: V = ∑ π(p_i)·v(x_i), where V is the overall value of the prospect, the x_i are the potential outcomes, the p_i are their respective probabilities, v is a value function that assigns a subjective value to each outcome, and π is a probability weighting function.

Prospect Theory

iv. Losses hurt more than gains feel good (loss aversion) as you see the gradient in the graph above is steeper in the losses quadrant than the gradient in the gains quadrant.

v. π is the probability weighting function; it captures the idea that people tend to overreact to small-probability events (possibility effect) but underreact to large-probability events (certainty effect).
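A sketch of both functions using one common power-law parameterization; the parameter values are the oft-quoted estimates from Kahneman and Tversky’s later work on cumulative prospect theory, used here purely for illustration.

```python
def value(x, alpha=0.88, lam=2.25):
    """S-shaped value function: concave for gains, convex and steeper for losses."""
    return x ** alpha if x >= 0 else -lam * (-x) ** alpha

def weight(p, gamma=0.61):
    """Inverse-S probability weighting: overweights small p, underweights large p."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)

print(value(100), value(-100))  # the loss is weighted over twice as heavily
print(weight(0.01))             # > 0.01: possibility effect
print(weight(0.99))             # < 0.99: certainty effect
```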

· Endowment Effect:

i. People are more likely to retain an object they own than to acquire that same object when they do not own it; we value the things we own more than they deserve.

ii. Losses loom larger than gains.

· The Fourfold Pattern:

i. Explains humans’ risk-averse and risk-seeking behavior under different conditions.

ii. Possibility Effect = “They know the risk of a gas explosion is minuscule, but they want it mitigated. It’s a possibility effect, and they want peace of mind”

iii. Certainty Effect = The psychological effect resulting from the reduction of probability from certain to probable. This is what makes people go with 100% chance of winning $300 instead of choosing a 90% chance of winning $450.

The Fourfold Pattern
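The certainty effect is striking because the sure option is actually worth less in expectation; a two-line check, using the dollar figures from the example above:

```python
def expected_value(prob, payoff):
    return prob * payoff

sure = expected_value(1.0, 300)   # 300.0
risky = expected_value(0.9, 450)  # 405.0
print(sure, risky)  # the certain choice leaves $105 of expected value behind
```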

Rare Events:

i. People overestimate the probabilities of unlikely events

ii. People overweight unlikely events in their decisions

iii. The probability of a rare event is most likely to be overestimated when the alternative is not fully specified.

iv. Denominator neglect = focusing on the numerator while ignoring the denominator. For example, we hear that there are 10 airplane accidents each year and register it as a huge number; only when we set it against a denominator (say, the 500,000 vehicle accidents overall) do we see that as a fraction it is minute. It is like focusing on the winning marbles and ignoring the non-winning marbles in a bag full of marbles, and it explains why different ways of communicating risks vary so much in their effects.

v. Because of denominator neglect, low-probability events are weighted much more heavily when described in terms of relative frequencies (how many) than when stated in the more abstract terms of chances or probability (how likely).

vi. Never focus on a single scenario or you will overestimate its probability, try to set up specific alternatives and make the probabilities add up to 100%.
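The last point can be made mechanical: judge each alternative, then normalize so the estimates form a coherent probability distribution. The scenario names and raw estimates below are hypothetical.

```python
def normalize(estimates):
    """Rescale raw scenario estimates so they sum to 1."""
    total = sum(estimates.values())
    return {scenario: p / total for scenario, p in estimates.items()}

# Judged one vivid scenario at a time, the raw estimates sum to more than 1,
# which means at least one of them is inflated.
raw = {"plane crash": 0.05, "car accident": 0.60, "no accident": 0.70}
coherent = normalize(raw)
print(coherent)  # rescaled estimates that sum to exactly 1.0
```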

· Risk Policies:

i. It is costly to be risk-averse for gains and risk-seeking for losses.

ii. Narrow framing = a sequence of two simple decisions, considered separately

iii. Broad framing = a single comprehensive decision with all options

iv. A rational agent will engage in broad framing but humans are by nature narrow framers.

v. Think like a trader! You win a few, you lose a few.

vi. Don’t focus too much on the fluctuations since you are loss averse by nature and prone to making suboptimal decisions.

vii. Always have a risk policy that you can routinely apply whenever a relevant problem arises. A risk policy should aggregate decisions, much as ensemble learning aggregates models.

viii. Outside view (asking someone from outside with an objective view) and risk policy protect against two distinct biases:

a) Exaggerated optimism of planning fallacy (in entrepreneurs)

b) Exaggerated caution induced by loss aversion

ix. Goal = Combination of the outside view with a risk policy!

· Keeping Score:

i. Humans keep separate mental accounts for different things, for example one account for cash and another for credit, even though money is money.

ii. Disposition effect = hanging on to a losing bet just to avoid closing a mental account at a loss (irrational!). For example, you might hold on to a stock that is losing value and instead sell one that is performing well.

iii. Sunk-cost fallacy = the decision to invest additional resources in a losing account when better investments are available. This is what keeps people too long in poor jobs, unhappy marriages, and unpromising research projects.

iv. People have stronger emotional reactions to an outcome that is produced by action than to the same outcome when it is produced by inaction.

v. A hindsight-avoiding policy is risk-reducing: either be very thorough or completely casual when making a decision with long-term consequences.

· Reversals:

i. Decisions should be made in joint evaluation. A joint evaluation would be asking how much you would pay for a dinner set with 12 plates and 6 cups, 3 of them broken, versus a dinner set with just 12 plates, all intact. A single evaluation would be pricing each set without the comparison. People are more rational during joint evaluations.

ii. Preference reversals occur because joint evaluation focuses attention on an aspect of the situation that was not salient in single evaluation.

iii. Joint evaluation changes the representation of the issue.

iv. Rationality is generally served by broader and more comprehensive frames and joint evaluation is broader than single evaluation.

v. It is often the case that when you broaden the frame, you reach more reasonable decisions.

vi. When you see cases in isolation, you are likely to be guided by an emotional reaction.

· Frames and Reality:

i. The fact that logically equivalent statements evoke different reactions makes it impossible for humans to be as reliably rational as Econs.

ii. A bad outcome is much more acceptable if it is framed as the cost of a lottery ticket that did not win than if it is simply described as losing a gamble.

iii. Losses evoke stronger negative feelings than costs

iv. People will more readily forgo a discount than pay a surcharge

v. Economically equivalent is not equal to emotionally equivalent

vi. Our preferences are almost always “frame-bound” rather than “reality-bound”

vii. Framing should not be viewed as an intervention that masks or distorts an underlying preference.

viii. The different frames evoke different mental accounts, and the significance of the loss depends on the account to which it is posted.

ix. Broader frames and inclusive accounts generally lead to more rational decisions.

x. Try to reframe problems by changing the reference point

xi. Charge the loss to your mental account of ‘general revenue’ — you will feel different.

4) TWO SELVES

· Remembering Self and Experiencing Self

· Remembering Self keeps the scores and makes the choices (makes memories)

· Experiencing Self does the living

· People often leave the choice to their remembering self, preferring to repeat the experience that left the better memory, even though it involved more pain.

· An objective observer would favor the experiencing self — minimizing the pain

· 2 biases that influence the remembering self:

i. Duration Neglect: the duration of the procedure had no effect whatsoever on the ratings of total pain.

ii. Peak-End Rule: an experience is evaluated by the average of its peak (most intense) moment and its final moment.

· These two biases were demonstrated in an experiment with two groups. One group immersed their hands in very cold water for 60 seconds; the other did the same, but for another 10 seconds a stream of warm water was run into the tank, reducing the pain at the end. Even though the second group experienced more cumulative pain, they rated the amount of pain significantly lower than the first group did. This showed that humans are poor at integrals (adding) and good at averaging, so we judge an experience by its peak and its end.
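The peak-end arithmetic can be sketched directly. The pain profiles below are illustrative (a 0–10 rating per interval), not the study’s actual data.

```python
short_trial = [8, 8, 8, 8, 8, 8]    # very cold water the whole time
long_trial = [8, 8, 8, 8, 8, 8, 5]  # the same, plus a milder ending

def total_pain(profile):
    """What an objective observer would score: the integral (sum) of pain."""
    return sum(profile)

def remembered_pain(profile):
    """Peak-end rule: the average of the worst moment and the final moment."""
    return (max(profile) + profile[-1]) / 2

print(total_pain(short_trial), total_pain(long_trial))            # 48 < 53
print(remembered_pain(short_trial), remembered_pain(long_trial))  # 8.0 > 6.5
```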

· In the diagram, Patient A described the experience as more painful than Patient B did, even though Patient B suffered more total pain (a larger area under the curve).

· The remembering self’s neglect of duration, its exaggerated emphasis on peaks and ends, and its susceptibility to hindsight combine to yield distorted reflections of our experience.

· We should adopt a duration-weighted conception of well-being, as it treats all moments of life alike, memorable or not.

· Life as a Story:

i. “He is desperately trying to protect the narrative of a life of integrity, which is endangered by the latest episode”

ii. “The length to which he is willing to go for a one-night encounter is a sign of total duration neglect”

· Experienced Well-Being:

i. “The objective of policy should be to reduce human suffering. We aim for a lower U-index in society. Dealing with depression and extreme poverty should be a priority”

ii. “The easiest way to increase happiness is to control your use of time. Can you find more time to do things you enjoy doing?”

iii. “Beyond the satiation level of income, you can buy more pleasurable experiences, but you will lose some of your ability to enjoy the less expensive ones”

· Thinking About Life:

i. “She thought that buying a fancy car would make her happier, but it turned out to be an error of affective forecasting”

ii. “His car broke down on the way to work this morning and he’s in a bad mood. This is not a good day to ask him about his job satisfaction”

iii. “She looks quite cheerful most of the time, but when she is asked how happy she is she says she is very unhappy. The question must make her think of her recent divorce”

iv. “Buying a large house may not make us happier in the long term. We could be suffering from a focusing illusion”

CONCLUSION:

- Two Systems → System 1 and System 2

- Two Species → Econs and Humans

- Two Selves → Remembering Self and Experiencing Self
