Judgment under Uncertainty: Heuristics and Biases …

Judgment under Uncertainty: Heuristics and Biases. Author(s): Amos Tversky and Daniel Kahneman. Source: Science, New Series, Vol. 185, No. 4157 (Sep. 27, 1974), pp. 1124-1131. Published by: American Association for the Advancement of Science.




Judgment under Uncertainty: Heuristics and Biases

Biases in judgments reveal some heuristics of thinking under uncertainty.

Amos Tversky and Daniel Kahneman

Many decisions are based on beliefs concerning the likelihood of uncertain events such as the outcome of an election, the guilt of a defendant, or the future value of the dollar. These beliefs are usually expressed in statements such as "I think that . . . ," "chances are . . . ," "it is unlikely that . . . ," and so forth. Occasionally, beliefs concerning uncertain events are expressed in numerical form as odds or subjective probabilities. What determines such beliefs? How do people assess the probability of an uncertain event or the value of an uncertain quantity? This article shows that people rely on a limited number of heuristic principles which reduce the complex tasks of assessing probabilities and predicting values to simpler judgmental operations. In general, these heuristics are quite useful, but sometimes they lead to severe and systematic errors.

The subjective assessment of probability resembles the subjective assessment of physical quantities such as distance or size. These judgments are all based on data of limited validity, which are processed according to heuristic rules. For example, the apparent distance of an object is determined in part by its clarity. The more sharply the object is seen, the closer it appears to be. This rule has some validity, because in any given scene the more distant objects are seen less sharply than nearer objects. However, the reliance on this rule leads to systematic errors in the estimation of distance. Specifically, distances are often overestimated when visibility is poor because the contours of objects are blurred. On the other hand, distances are often underestimated when visibility is good because the objects are seen sharply. Thus, the reliance on clarity as an indication of distance leads to common biases. Such biases are also found in the intuitive judgment of probability. This article describes three heuristics that are employed to assess probabilities and to predict values. Biases to which these heuristics lead are enumerated, and the applied and theoretical implications of these observations are discussed.

The authors are members of the department of psychology at the Hebrew University, Jerusalem, Israel.

Representativeness

Many of the probabilistic questions with which people are concerned belong to one of the following types: What is the probability that object A belongs to class B? What is the probability that event A originates from process B? What is the probability that process B will generate event A? In answering such questions, people typically rely on the representativeness heuristic, in which probabilities are evaluated by the degree to which A is representative of B, that is, by the degree to which A resembles B. For example, when A is highly representative of B, the probability that A originates from B is judged to be high. On the other hand, if A is not similar to B, the probability that A originates from B is judged to be low.

For an illustration of judgment by representativeness, consider an individual who has been described by a former neighbor as follows: "Steve is very shy and withdrawn, invariably helpful, but with little interest in people, or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail." How do people assess the probability that Steve is engaged in a particular occupation from a list of possibilities (for example, farmer, salesman, airline pilot, librarian, or physician)? How do people order these occupations from most to least likely? In the representativeness heuristic, the probability that Steve is a librarian, for example, is assessed by the degree to which he is representative of, or similar to, the stereotype of a librarian. Indeed, research with problems of this type has shown that people order the occupations by probability and by similarity in exactly the same way (1). This approach to the judgment of probability leads to serious errors, because similarity, or representativeness, is not influenced by several factors that should affect judgments of probability.

Insensitivity to prior probability of outcomes. One of the factors that have no effect on representativeness but should have a major effect on probability is the prior probability, or base-rate frequency, of the outcomes. In the case of Steve, for example, the fact that there are many more farmers than librarians in the population should enter into any reasonable estimate of the probability that Steve is a librarian rather than a farmer. Considerations of base-rate frequency, however, do not affect the similarity of Steve to the stereotypes of librarians and farmers. If people evaluate probability by representativeness, therefore, prior probabilities will be neglected. This hypothesis was tested in an experiment where prior probabilities were manipulated (1). Subjects were shown brief personality descriptions of several individuals, allegedly sampled at random from a group of 100 professionals (engineers and lawyers). The subjects were asked to assess, for each description, the probability that it belonged to an engineer rather than to a lawyer. In one experimental condition, subjects were told that the group from which the descriptions had been drawn consisted of 70 engineers and 30 lawyers. In another condition, subjects were told that the group consisted of 30 engineers and 70 lawyers. The odds that any particular description belongs to an engineer rather than to a lawyer should be higher in the first condition, where there is a majority of engineers, than in the second condition, where there is a majority of lawyers. Specifically, it can be shown by applying Bayes' rule that the ratio of these odds should be (.7/.3)^2, or 5.44, for each description. In a sharp violation of Bayes' rule, the subjects in the two conditions produced essentially the same probability judgments. Apparently, subjects evaluated the likelihood that a particular description belonged to an engineer rather than to a lawyer by the degree to which this description was representative of the two stereotypes, with little or no regard for the prior probabilities of the categories.

The subjects used prior probabilities correctly when they had no other information. In the absence of a personality sketch, they judged the probability that an unknown individual is an engineer to be .7 and .3, respectively, in the two base-rate conditions. However, prior probabilities were effectively ignored when a description was introduced, even when this description was totally uninformative. The responses to the following description illustrate this phenomenon:

Dick is a 30 year old man. He is married with no children. A man of high ability and high motivation, he promises to be quite successful in his field. He is well liked by his colleagues.

This description was intended to convey no information relevant to the question of whether Dick is an engineer or a lawyer. Consequently, the probability that Dick is an engineer should equal the proportion of engineers in the group, as if no description had been given. The subjects, however, judged the probability of Dick being an engineer to be .5 regardless of whether the stated proportion of engineers in the group was .7 or .3. Evidently, people respond differently when given no evidence and when given worthless evidence. When no specific evidence is given, prior probabilities are properly utilized; when worthless evidence is given, prior probabilities are ignored (1).

Insensitivity to sample size. To evaluate the probability of obtaining a particular result in a sample drawn from a specified population, people typically apply the representativeness heuristic. That is, they assess the likelihood of a sample result, for example, that the average height in a random sample of ten men will be 6 feet (180 centimeters), by the similarity of this result to the corresponding parameter (that is, to the average height in the population of men). The similarity of a sample statistic to a population parameter does not depend on the size of the sample. Consequently, if probabilities are assessed by representativeness, then the judged probability of a sample statistic will be essentially independent of sample size. Indeed, when subjects assessed the distributions of average height for samples of various sizes, they produced identical distributions. For example, the probability of obtaining an average height greater than 6 feet was assigned the same value for samples of 1000, 100, and 10 men (2). Moreover, subjects failed to appreciate the role of sample size even when it was emphasized in the formulation of the problem. Consider the following question:

A certain town is served by two hospitals. In the larger hospital about 45 babies are born each day, and in the smaller hospital about 15 babies are born each day. As you know, about 50 percent of all babies are boys. However, the exact percentage varies from day to day. Sometimes it may be higher than 50 percent, sometimes lower.

For a period of 1 year, each hospital recorded the days on which more than 60 percent of the babies born were boys. Which hospital do you think recorded more such days?

The larger hospital (21)
The smaller hospital (21)
About the same (that is, within 5 percent of each other) (53)

The values in parentheses are the number of undergraduate students who chose each answer.

Most subjects judged the probability of obtaining more than 60 percent boys to be the same in the small and in the large hospital, presumably because these events are described by the same statistic and are therefore equally representative of the general population. In contrast, sampling theory entails that the expected number of days on which more than 60 percent of the babies are boys is much greater in the small hospital than in the large one, because a large sample is less likely to stray from 50 percent. This fundamental notion of statistics is evidently not part of people's repertoire of intuitions.

A similar insensitivity to sample size has been reported in judgments of posterior probability, that is, of the probability that a sample has been drawn from one population rather than from another. Consider the following example:

Imagine an urn filled with balls, of which 2/3 are of one color and 1/3 of another. One individual has drawn 5 balls from the urn, and found that 4 were red and 1 was white. Another individual has drawn 20 balls and found that 12 were red and 8 were white. Which of the two individuals should feel more confident that the urn contains 2/3 red balls and 1/3 white balls, rather than the opposite? What odds should each individual give?

In this problem, the correct posterior odds are 8 to 1 for the 4:1 sample and 16 to 1 for the 12:8 sample, assuming equal prior probabilities. However, most people feel that the first sample provides much stronger evidence for the hypothesis that the urn is predominantly red, because the proportion of red balls is larger in the first than in the second sample. Here again, intuitive judgments are dominated by the sample proportion and are essentially unaffected by the size of the sample, which plays a crucial role in the determination of the actual posterior odds (2). In addition, intuitive estimates of posterior odds are far less extreme than the correct values. The underestimation of the impact of evidence has been observed repeatedly in problems of this type (3, 4). It has been labeled "conservatism."

Misconceptions of chance. People expect that a sequence of events generated by a random process will represent the essential characteristics of that process even when the sequence is short. In considering tosses of a coin for heads or tails, for example, people regard the sequence H-T-H-T-T-H to be more likely than the sequence H-H-H-T-T-T, which does not appear random, and also more likely than the sequence H-H-H-H-T-H, which does not represent the fairness of the coin (2). Thus, people expect that the essential characteristics of the process will be represented, not only globally in the entire sequence, but also locally in each of its parts. A locally representative sequence, however, deviates systematically from chance expectation: it contains too many alternations and too few runs. Another consequence of the belief in local representativeness is the well-known gambler's fallacy. After observing a long run of red on the roulette wheel, for example, most people erroneously believe that black is now due, presumably because the occurrence of black will result in a more representative sequence than the occurrence of an additional red. Chance is commonly viewed as a self-correcting process in which a deviation in one direction induces a deviation in the opposite direction to restore the equilibrium. In fact, deviations are not "corrected" as a chance process unfolds, they are merely diluted.

Misconceptions of chance are not limited to naive subjects. A study of the statistical intuitions of experienced research psychologists (5) revealed a lingering belief in what may be called the "law of small numbers," according to which even small samples are highly


representative of the populations from which they are drawn. The responses of these investigators reflected the expectation that a valid hypothesis about a population will be represented by a statistically significant result in a sample, with little regard for its size. As a consequence, the researchers put too much faith in the results of small samples and grossly overestimated the replicability of such results. In the actual conduct of research, this bias leads to the selection of samples of inadequate size and to overinterpretation of findings.

Insensitivity to predictability. People are sometimes called upon to make such numerical predictions as the future value of a stock, the demand for a commodity, or the outcome of a football game. Such predictions are often made by representativeness. For example, suppose one is given a description of a company and is asked to predict its future profit. If the description of the company is very favorable, a very high profit will appear most representative of that description; if the description is mediocre, a mediocre performance will appear most representative. The degree to which the description is favorable is unaffected by the reliability of that description or by the degree to which it permits accurate prediction. Hence, if people predict solely in terms of the favorableness of the description, their predictions will be insensitive to the reliability of the evidence and to the expected accuracy of the prediction.

This mode of judgment violates the normative statistical theory in which the extremeness and the range of predictions are controlled by considerations of predictability. When predictability is nil, the same prediction should be made in all cases. For example, if the descriptions of companies provide no information relevant to profit, then the same value (such as average profit) should be predicted for all companies. If predictability is perfect, of course, the values predicted will match the actual values and the range of predictions will equal the range of outcomes. In general, the higher the predictability, the wider the range of predicted values.

Several studies of numerical prediction have demonstrated that intuitive predictions violate this rule, and that subjects show little or no regard for considerations of predictability (1). In one of these studies, subjects were presented with several paragraphs, each describing the performance of a student teacher during a particular practice lesson. Some subjects were asked to evaluate the quality of the lesson described in the paragraph in percentile scores, relative to a specified population. Other subjects were asked to predict, also in percentile scores, the standing of each student teacher 5 years after the practice lesson. The judgments made under the two conditions were identical. That is, the prediction of a remote criterion (success of a teacher after 5 years) was identical to the evaluation of the information on which the prediction was based (the quality of the practice lesson). The students who made these predictions were undoubtedly aware of the limited predictability of teaching competence on the basis of a single trial lesson 5 years earlier; nevertheless, their predictions were as extreme as their evaluations.

The illusion of validity. As we have seen, people often predict by selecting the outcome (for example, an occupation) that is most representative of the input (for example, the description of a person). The confidence they have in their prediction depends primarily on the degree of representativeness (that is, on the quality of the match between the selected outcome and the input) with little or no regard for the factors that limit predictive accuracy. Thus, people express great confidence in the prediction that a person is a librarian when given a description of his personality which matches the stereotype of librarians, even if the description is scanty, unreliable, or outdated. The unwarranted confidence which is produced by a good fit between the predicted outcome and the input information may be called the illusion of validity. This illusion persists even when the judge is aware of the factors that limit the accuracy of his predictions. It is a common observation that psychologists who conduct selection interviews often experience considerable confidence in their predictions, even when they know of the vast literature that shows selection interviews to be highly fallible. The continued reliance on the clinical interview for selection, despite repeated demonstrations of its inadequacy, amply attests to the strength of this effect.

The internal consistency of a pattern of inputs is a major determinant of one's confidence in predictions based on these inputs. For example, people express more confidence in predicting the final grade-point average of a student whose first-year record consists entirely of B's than in predicting the grade-point average of a student whose first-year record includes many A's and C's. Highly consistent patterns are most often observed when the input variables are highly redundant or correlated. Hence, people tend to have great confidence in predictions based on redundant input variables. However, an elementary result in the statistics of correlation asserts that, given input variables of stated validity, a prediction based on several such inputs can achieve higher accuracy when they are independent of each other than when they are redundant or correlated. Thus, redundancy among inputs decreases accuracy even as it increases confidence, and people are often confident in predictions that are quite likely to be off the mark (1).
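The statistical result about redundant inputs can be illustrated with a small calculation. The sketch below assumes a simplified case that is not spelled out in the text: two standardized input variables with the same validity r (correlation with the criterion) and a correlation rho with each other. Under these assumptions the squared multiple correlation is 2r^2 / (1 + rho). The function name and the example values are illustrative only.

```python
import math


def multiple_correlation(validity: float, intercorrelation: float) -> float:
    """Multiple correlation R for two inputs of equal validity.

    Assumes two standardized predictors, each correlated `validity` with the
    criterion and `intercorrelation` with each other, so that
    R^2 = 2 * validity^2 / (1 + intercorrelation).
    """
    if not -1.0 < intercorrelation <= 1.0:
        raise ValueError("intercorrelation must lie in (-1, 1]")
    r_squared = 2 * validity**2 / (1 + intercorrelation)
    if r_squared > 1.0:
        # Such a combination of correlations cannot occur in real data.
        raise ValueError("inconsistent combination of correlations")
    return math.sqrt(r_squared)


# Hypothetical example: two inputs, each with validity .5.
independent = multiple_correlation(0.5, 0.0)  # inputs uncorrelated
redundant = multiple_correlation(0.5, 0.8)    # inputs highly correlated
print(f"independent inputs: R = {independent:.3f}")
print(f"redundant inputs:   R = {redundant:.3f}")
```

With these illustrative values, the independent pair reaches R of about .707 and the redundant pair only about .527, although the redundant pair would present the more consistent, and hence more confidence-inspiring, pattern.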

Misconceptions of regression. Suppose a large group of children has been examined on two equivalent versions of an aptitude test. If one selects ten children from among those who did best on one of the two versions, he will usually find their performance on the second version to be somewhat disappointing. Conversely, if one selects ten children from among those who did worst on one version, they will be found, on the average, to do somewhat better on the other version. More generally, consider two variables X and Y which have the same distribution. If one selects individuals whose average X score deviates from the mean of X by k units, then the average of their Y scores will usually deviate from the mean of Y by less than k units. These observations illustrate a general phenomenon known as regression toward the mean, which was first documented by Galton more than 100 years ago.
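The aptitude-test example can be reproduced with a short simulation. This is a minimal sketch under assumptions that are not in the text: each score is modeled as a stable ability plus independent noise for each test version, both drawn from a standard normal distribution, and the group size of 1000 children is arbitrary. The function and key names are hypothetical.

```python
import random
import statistics


def simulate_two_versions(n_children=1000, n_selected=10, seed=0):
    """Compare the best and worst scorers on one version with their other score."""
    rng = random.Random(seed)
    # Assumption: score = stable ability + independent noise per version.
    abilities = [rng.gauss(0, 1) for _ in range(n_children)]
    first = [a + rng.gauss(0, 1) for a in abilities]
    second = [a + rng.gauss(0, 1) for a in abilities]

    # Rank children by their score on the first version only.
    ranked = sorted(range(n_children), key=lambda i: first[i])
    worst, best = ranked[:n_selected], ranked[-n_selected:]

    def group_mean(scores, members):
        return statistics.mean(scores[i] for i in members)

    return {
        "best_first": group_mean(first, best),
        "best_second": group_mean(second, best),
        "worst_first": group_mean(first, worst),
        "worst_second": group_mean(second, worst),
    }


results = simulate_two_versions()
for label, value in results.items():
    print(f"{label}: {value:+.2f}")
```

In this model the ten best scorers on the first version remain above the overall mean of zero on the second version, but by less than before, and the ten worst scorers move up toward the mean. No treatment or feedback is applied between the two versions, so the shift arises from selection on a noisy score alone.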

In the normal course of life, one encounters many instances of regression toward the mean, in the comparison of the height of fathers and sons, of the intelligence of husbands and wives, or of the performance of individuals on consecutive examinations. Nevertheless, people do not develop correct intuitions about this phenomenon. First, they do not expect regression in many contexts where it is bound to occur. Second, when they recognize the occurrence of regression, they often invent spurious causal explanations for it (1). We suggest that the phenomenon of regression remains elusive because it is incompatible with the belief that the predicted outcome should be maximally


representative of the input, and, hence, that the value of the outcome variable should be as extreme as the value of the input variable.

The failure to recognize the import of regression can have pernicious consequences, as illustrated by the following observation (1). In a discussion of flight training, experienced instructors noted that praise for an exceptionally smooth landing is typically followed by a poorer landing on the next try, while harsh criticism after a rough landing is usually followed by an improvement on the next try. The instructors concluded that verbal rewards are detrimental to learning, while verbal punishments are beneficial, contrary to accepted psychological doctrine. This conclusion is unwarranted because of the presence of regression toward the mean. As in other cases of repeated examination, an improvement will usually follow a poor performance and a deterioration will usually follow an outstanding performance, even if the instructor does not respond to the trainee's achievement on the first attempt. Because the instructors had praised their trainees after good landings and admonished them after poor ones, they reached the erroneous and potentially harmful conclusion that punishment is more effective than reward.

Thus, the failure to understand the effect of regression leads one to overestimate the effectiveness of punishment and to underestimate the effectiveness of reward. In social interaction, as well as in training, rewards are typically administered when performance is good, and punishments are typically administered when performance is poor. By regression alone, therefore, behavior is most likely to improve after punishment and most likely to deteriorate after reward. Consequently, the human condition is such that, by chance alone, one is most often rewarded for punishing others and most often punished for rewarding them. People are generally not aware of this contingency. In fact, the elusive role of regression in determining the apparent consequences of reward and punishment seems to have escaped the notice of students of this area.

Availability

There are situations in which people assess the frequency of a class or the probability of an event by the ease with which instances or occurrences can be brought to mind. For example, one may assess the risk of heart attack among middle-aged people by recalling such occurrences among one's acquaintances. Similarly, one may evaluate the probability that a given business venture will fail by imagining various difficulties it could encounter. This judgmental heuristic is called availability. Availability is a useful clue for assessing frequency or probability, because instances of large classes are usually recalled better and faster than instances of less frequent classes. However, availability is affected by factors other than frequency and probability. Consequently, the reliance on availability leads to predictable biases, some of which are illustrated below.

Biases due to the retrievability of instances. When the size of a class is judged by the availability of its instances, a class whose instances are easily retrieved will appear more numerous than a class of equal frequency whose instances are less retrievable. In an elementary demonstration of this effect, subjects heard a list of well-known personalities of both sexes and were subsequently asked to judge whether the list contained more names of men than of women. Different lists were presented to different groups of subjects. In some of the lists the men were relatively more famous than the women, and in others the women were relatively more famous than the men. In each of the lists, the subjects erroneously judged that the class (sex) that had the more famous personalities was the more numerous (6).

In addition to familiarity, there are other factors, such as salience, which affect the retrievability of instances. For example, the impact of seeing a house burning on the subjective probability of such accidents is probably greater than the impact of reading about a fire in the local paper. Furthermore, recent occurrences are likely to be relatively more available than earlier occurrences. It is a common experience that the subjective probability of traffic accidents rises temporarily when one sees a car overturned by the side of the road.

Biases due to the effectiveness of a search set. Suppose one samples a word (of three letters or more) at random from an English text. Is it more likely that the word starts with r or that r is the third letter? People approach this problem by recalling words that begin with r (road) and words that have r in the third position (car) and assess the relative frequency by the ease with which words of the two types come to mind. Because it is much easier to search for words by their first letter than by their third letter, most people judge words that begin with a given consonant to be more numerous than words in which the same consonant appears in the third position. They do so even for consonants, such as r or k, that are more frequent in the third position than in the first (6).

Different tasks elicit different search sets. For example, suppose you are asked to rate the frequency with which abstract words (thought, love) and concrete words (door, water) appear in written English. A natural way to answer this question is to search for contexts in which the word could appear. It seems easier to think of contexts in which an abstract concept is mentioned (love in love stories) than to think of contexts in which a concrete word (such as door) is mentioned. If the frequency of words is judged by the availability of the contexts in which they appear, abstract words will be judged as relatively more numerous than concrete words. This bias has been observed in a recent study (7) which showed that the judged frequency of occurrence of abstract words was much higher than that of concrete words, equated in objective frequency. Abstract words were also judged to appear in a much greater variety of contexts than concrete words.

Biases of imaginability. Sometimes one has to assess the frequency of a class whose instances are not stored in memory but can be generated according to a given rule. In such situations, one typically generates several instances and evaluates frequency or probability by the ease with which the relevant instances can be constructed. However, the ease of constructing instances does not always reflect their actual frequency, and this mode of evaluation is prone to biases. To illustrate, consider a group of 10 people who form committees of k members, 2 ≤ k ≤ 8. How many different committees of k members can be formed? The correct answer to this problem is given by the binomial coefficient $\binom{10}{k}$, which reaches a maximum of 252 for k = 5. Clearly, the number of committees of k members equals the number of committees of (10 - k) members, because any committee of k members defines a unique group of (10 - k) nonmembers.

One way to answer this question without computation is to mentally construct committees of k members and to evaluate their number by the ease with which they come to mind. Committees of few members, say 2, are more available than committees of many members, say 8. The simplest scheme for the construction of committees is a partition of the group into disjoint sets. One readily sees that it is easy to construct five disjoint committees of 2 members, while it is impossible to generate even two disjoint committees of 8 members. Consequently, if frequency is assessed by imaginability, or by availability for construction, the small committees will appear more numerous than larger committees, in contrast to the correct bell-shaped function. Indeed, when naive subjects were asked to estimate the number of distinct committees of various sizes, their estimates were a decreasing monotonic function of committee size (6). For example, the median estimate of the number of committees of 2 members was 70, while the estimate for committees of 8 members was 20 (the correct answer is 45 in both cases).
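The correct committee counts are easy to tabulate. The short sketch below uses Python's standard `math.comb` to compute the binomial coefficient for each committee size from 2 to 8 in a group of 10, and sets the two median estimates quoted above beside the correct values. The function and variable names are illustrative.

```python
import math


def committee_counts(group_size=10, min_k=2, max_k=8):
    """Number of distinct committees of k members for each k in the range."""
    return {k: math.comb(group_size, k) for k in range(min_k, max_k + 1)}


counts = committee_counts()

# Median estimates quoted in the text for committees of 2 and 8 members.
reported_median_estimates = {2: 70, 8: 20}

for k, count in counts.items():
    estimate = reported_median_estimates.get(k)
    note = f" (median estimate: {estimate})" if estimate is not None else ""
    print(f"k = {k}: {count} committees{note}")
```

The table is symmetric around k = 5, where it peaks at 252, whereas the subjects' estimates fell steadily as committee size grew.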

Imaginability plays an important role in the evaluation of probabilities in real-life situations. The risk involved in an adventurous expedition, for example, is evaluated by imagining contingencies with which the expedition is not equipped to cope. If many such difficulties are vividly portrayed, the expedition can be made to appear exceedingly dangerous, although the ease with which disasters are imagined need not reflect their actual likelihood. Conversely, the risk involved in an undertaking may be grossly underestimated if some possible dangers are either difficult to conceive of, or simply do not come to mind.

Illusory correlation. Chapman and Chapman (8) have described an interesting bias in the judgment of the frequency with which two events co-occur. They presented naive judges with information concerning several hypothetical mental patients. The data for each patient consisted of a clinical diagnosis and a drawing of a person made by the patient. Later the judges estimated the frequency with which each diagnosis (such as paranoia or suspiciousness) had been accompanied by various features of the drawing (such as peculiar eyes). The subjects markedly overestimated the frequency of co-occurrence of natural associates, such as suspiciousness and peculiar eyes. This effect was labeled illusory correlation. In their erroneous judgments of the data to which they had been exposed, naive subjects "rediscovered" much of the common, but unfounded, clinical lore concerning the interpretation of the draw-a-person test. The illusory correlation effect was extremely resistant to contradictory data. It persisted even when the correlation between symptom and diagnosis was actually negative, and it prevented the judges from detecting relationships that were in fact present.

Availability provides a natural account for the illusory-correlation effect. The judgment of how frequently two events co-occur could be based on the strength of the associative bond between them. When the association is strong, one is likely to conclude that the events have been frequently paired. Consequently, strong associates will be judged to have occurred together frequently. According to this view, the illusory correlation between suspiciousness and peculiar drawing of the eyes, for example, is due to the fact that suspiciousness is more readily associated with the eyes than with any other part of the body.

Lifelong experience has taught us that, in general, instances of large classes are recalled better and faster than instances of less frequent classes; that likely occurrences are easier to imagine than unlikely ones; and that the associative connections between events are strengthened when the events frequently co-occur. As a result, man has at his disposal a procedure (the availability heuristic) for estimating the numerosity of a class, the likelihood of an event, or the frequency of co-occurrences, by the ease with which the relevant mental operations of retrieval, construction, or association can be performed. However, as the preceding examples have demonstrated, this valuable estimation procedure results in systematic errors.

Adjustment and Anchoring

In many situations, people make estimates by starting from an initial value that is adjusted to yield the final answer. The initial value, or starting point, may be suggested by the formulation of the problem, or it may be the result of a partial computation. In either case, adjustments are typically insufficient (4). That is, different starting points yield different estimates, which are biased toward the initial values. We call this phenomenon anchoring.

Insufficient adjustment. In a demonstration of the anchoring effect, subjects were asked to estimate various quantities, stated in percentages (for example, the percentage of African countries in the United Nations). For each quantity, a number between 0 and 100 was determined by spinning a wheel of fortune in the subjects' presence. The subjects were instructed to indicate first whether that number was higher or lower than the value of the quantity, and then to estimate the value of the quantity by moving upward or downward from the given number. Different groups were given different numbers for each quantity, and these arbitrary numbers had a marked effect on estimates. For example, the median estimates of the percentage of African countries in the United Nations were 25 and 45 for groups that received 10 and 65, respectively, as starting points. Payoffs for accuracy did not reduce the anchoring effect.

Anchoring occurs not only when the starting point is given to the subject, but also when the subject bases his estimate on the result of some incomplete computation. A study of intuitive numerical estimation illustrates this effect. Two groups of high school students estimated, within 5 seconds, a numerical expression that was written on the blackboard. One group estimated the product

8 × 7 × 6 × 5 × 4 × 3 × 2 × 1

while another group estimated the product

1 × 2 × 3 × 4 × 5 × 6 × 7 × 8

To rapidly answer such questions, people may perform a few steps of computation and estimate the product by extrapolation or adjustment. Because adjustments are typically insufficient, this procedure should lead to underestimation. Furthermore, because the result of the first few steps of multiplication (performed from left to right) is higher in the descending sequence than in the ascending sequence, the former expression should be judged larger than the latter. Both predictions were confirmed. The median estimate for the ascending sequence was 512, while the median estimate for the descending sequence was 2,250. The correct answer is 40,320.

Biases in the evaluation of conjunctive and disjunctive events. In a recent
study by Bar-Hillel (9) subjects were given the opportunity to bet on one of two events. Three types of events were used: (i) simple events, such as drawing a red marble from a bag containing 50 percent red marbles and 50 percent white marbles; (ii) conjunctive events, such as drawing a red marble seven times in succession, with replacement, from a bag containing 90 percent red marbles and 10 percent white marbles; and (iii) disjunctive events, such as drawing a red marble at least once in seven successive tries, with replacement, from a bag containing 10 percent red marbles and 90 percent white marbles. In this problem, a significant majority of subjects preferred to bet on the conjunctive event (the probability of which is .48) rather than on the simple event (the probability of which is .50). Subjects also preferred to bet on the simple event rather than on the disjunctive event, which has a probability of .52. Thus, most subjects bet on the less likely event in both comparisons. This pattern of choices illustrates a general finding. Studies of choice among gambles and of judgments of probability indicate that people tend to overestimate the probability of conjunctive events (10) and to underestimate the probability of disjunctive events. These biases are readily explained as effects of anchoring. The stated probability of the elementary event (success at any one stage) provides a natural starting point for the estimation of the probabilities of both conjunctive and disjunctive events. Since adjustment from the starting point is typically insufficient, the final estimates remain too close to the probabilities of the elementary events in both cases. Note that the overall probability of a conjunctive event is lower than the probability of each elementary event, whereas the overall probability of a disjunctive event is higher than the probability of each elementary event. As a consequence of anchoring, the overall probability will be overestimated in conjunctive problems and underestimated in disjunctive problems.

Biases in the evaluation of compound events are particularly significant in the context of planning. The successful completion of an undertaking, such as the development of a new product, typically has a conjunctive character: for the undertaking to succeed, each of a series of events must occur. Even when each of these events is very likely, the overall probability of success can be quite low if the number of events is large. The general tendency to overestimate the probability of conjunctive events leads to unwarranted optimism in the evaluation of the likelihood that a plan will succeed or that a project will be completed on time. Conversely, disjunctive structures are typically encountered in the evaluation of risks. A complex system, such as a nuclear reactor or a human body, will malfunction if any of its essential components fails. Even when the likelihood of failure in each component is slight, the probability of an overall failure can be high if many components are involved. Because of anchoring, people will tend to underestimate the probabilities of failure in complex systems. Thus, the direction of the anchoring bias can sometimes be inferred from the structure of the event. The chain-like structure of conjunctions leads to overestimation, the funnel-like structure of disjunctions leads to underestimation.

Anchoring in the assessment of subjective probability distributions. In decision analysis, experts are often required to express their beliefs about a quantity, such as the value of the Dow-Jones average on a particular day, in the form of a probability distribution. Such a distribution is usually constructed by asking the person to select values of the quantity that correspond to specified percentiles of his subjective probability distribution. For example, the judge may be asked to select a number, X90, such that his subjective probability that this number will be higher than the value of the Dow-Jones average is .90. That is, he should select the value X90 so that he is just willing to accept 9 to 1 odds that the Dow-Jones average will not exceed it. A subjective probability distribution for the value of the Dow-Jones average can be constructed from several such judgments corresponding to different percentiles.

By collecting subjective probability distributions for many different quantities, it is possible to test the judge for proper calibration. A judge is properly (or externally) calibrated in a set of problems if exactly Π percent of the true values of the assessed quantities falls below his stated values of XΠ. For example, the true values should fall below X01 for 1 percent of the quantities and above X99 for 1 percent of the quantities. Thus, the true values should fall in the confidence interval between X01 and X99 on 98 percent of the problems.

Several investigators (11) have obtained probability distributions for many quantities from a large number of judges. These distributions indicated large and systematic departures from proper calibration. In most studies, the actual values of the assessed quantities are either smaller than X01 or greater than X99 for about 30 percent of the problems. That is, the subjects state overly narrow confidence intervals which reflect more certainty than is justified by their knowledge about the assessed quantities. This bias is common to naive and to sophisticated subjects, and it is not eliminated by introducing proper scoring rules, which provide incentives for external calibration. This effect is attributable, in part at least, to anchoring.

To select X90 for the value of the Dow-Jones average, for example, it is natural to begin by thinking about one's best estimate of the Dow-Jones and to adjust this value upward. If this adjustment, like most others, is insufficient, then X90 will not be sufficiently extreme. A similar anchoring effect will occur in the selection of X10, which is presumably obtained by adjusting one's best estimate downward. Consequently, the confidence interval between X10 and X90 will be too narrow, and the assessed probability distribution will be too tight. In support of this interpretation it can be shown that subjective probabilities are systematically altered by a procedure in which one's best estimate does not serve as an anchor.

Subjective probability distributions for a given quantity (the Dow-Jones average) can be obtained in two different ways: (i) by asking the subject to select values of the Dow-Jones that correspond to specified percentiles of his probability distribution and (ii) by asking the subject to assess the probabilities that the true value of the Dow-Jones will exceed some specified values. The two procedures are formally equivalent and should yield identical distributions. However, they suggest different modes of adjustment from different anchors. In procedure (i), the natural starting point is one's best estimate of the quantity. In procedure (ii), on the other hand, the subject may be anchored on the value stated in the question. Alternatively, he may be anchored on even odds, or 50-50 chances, which is a natural starting point in the estimation of likelihood. In either case, procedure (ii) should yield less extreme odds than procedure (i).

To contrast the two procedures, a set of 24 quantities (such as the air distance from New Delhi to Peking) was presented to a group of subjects who assessed either X10 or X90 for each problem. Another group of subjects received the median judgment of the first group for each of the 24 quantities. They were asked to assess the odds that each of the given values exceeded the true value of the relevant quantity. In the absence of any bias, the second group should retrieve the odds specified to the first group, that is, 9:1. However, if even odds or the stated value serve as anchors, the odds of the second group should be less extreme, that is, closer to 1:1. Indeed, the median odds stated by this group, across all problems, were 3:1. When the judgments of the two groups were tested for external calibration, it was found that subjects in the first group were too extreme, in accord with earlier studies. The events that they defined as having a probability of .10 actually obtained in 24 percent of the cases. In contrast, subjects in the second group were too conservative. Events to which they assigned an average probability of .34 actually obtained in 26 percent of the cases. These results illustrate the manner in which the degree of calibration depends on the procedure of elicitation.

Discussion

This article has been concerned with cognitive biases that stem from the reliance on judgmental heuristics. These biases are not attributable to motivational effects such as wishful thinking or the distortion of judgments by payoffs and penalties. Indeed, several of the severe errors of judgment reported earlier occurred despite the fact that subjects were encouraged to be accurate and were rewarded for the correct answers (2, 6).

The reliance on heuristics and the prevalence of biases are not restricted to laymen. Experienced researchers are also prone to the same biases when they think intuitively. For example, the tendency to predict the outcome that best represents the data, with insufficient regard for prior probability, has been observed in the intuitive judgments of individuals who have had extensive training in statistics (1, 5). Although the statistically sophisticated avoid elementary errors, such as the gambler's fallacy, their intuitive judgments are liable to similar fallacies in more intricate and less transparent problems.

It is not surprising that useful heuristics such as representativeness and availability are retained, even though they occasionally lead to errors in prediction or estimation. What is perhaps surprising is the failure of people to infer from lifelong experience such fundamental statistical rules as regression toward the mean, or the effect of sample size on sampling variability. Although everyone is exposed, in the normal course of life, to numerous examples from which these rules could have been induced, very few people discover the principles of sampling and regression on their own. Statistical principles are not learned from everyday experience because the relevant instances are not coded appropriately. For example, people do not discover that successive lines in a text differ more in average word length than do successive pages, because they simply do not attend to the average word length of individual lines or pages. Thus, people do not learn the relation between sample size and sampling variability, although the data for such learning are abundant.

The lack of an appropriate code also explains why people usually do not detect the biases in their judgments of probability. A person could conceivably learn whether his judgments are externally calibrated by keeping a tally of the proportion of events that actually occur among those to which he assigns the same probability. However, it is not natural to group events by their judged probability. In the absence of such grouping it is impossible for an individual to discover, for example, that only 50 percent of the predictions to which he has assigned a probability of .9 or higher actually came true.

The empirical analysis of cognitive biases has implications for the theoretical and applied role of judged probabilities. Modern decision theory (12, 13) regards subjective probability as the quantified opinion of an idealized person. Specifically, the subjective probability of a given event is defined by the set of bets about this event that such a person is willing to accept. An internally consistent, or coherent, subjective probability measure can be derived for an individual if his choices among bets satisfy certain principles, that is, the axioms of the theory. The derived probability is subjective in the sense that different individuals are allowed to have different probabilities for the same event. The major contribution of this approach is that it provides a rigorous subjective interpretation of probability that is applicable to unique events and is embedded in a general theory of rational decision.

It should perhaps be noted that, while subjective probabilities can sometimes be inferred from preferences among bets, they are normally not formed in this fashion. A person bets on team A rather than on team B because he believes that team A is more likely to win; he does not infer this belief from his betting preferences. Thus, in reality, subjective probabilities determine preferences among bets and are not derived from them, as in the axiomatic theory of rational decision (12).

The inherently subjective nature of probability has led many students to the belief that coherence, or internal consistency, is the only valid criterion by which judged probabilities should be evaluated. From the standpoint of the formal theory of subjective probability, any set of internally consistent probability judgments is as good as any other. This criterion is not entirely satisfactory, because an internally consistent set of subjective probabilities can be incompatible with other beliefs held by the individual. Consider a person whose subjective probabilities for all possible outcomes of a coin-tossing game reflect the gambler's fallacy. That is, his estimate of the probability of tails on a particular toss increases with the number of consecutive heads that preceded that toss. The judgments of such a person could be internally consistent and therefore acceptable as adequate subjective probabilities according to the criterion of the formal theory. These probabilities, however, are incompatible with the generally held belief that a coin has no memory and is therefore incapable of generating sequential dependencies. For judged probabilities to be considered adequate, or rational, internal consistency is not enough. The judgments must be compatible with the entire web of beliefs held by the individual. Unfortunately, there can be no simple formal procedure for assessing the compatibility of a set of probability judgments with the judge's total system of beliefs. The rational judge will nevertheless strive for compatibility, even though internal consistency is more easily achieved and assessed. In particular, he will attempt to make his probability judgments compatible with his knowledge about the subject matter, the laws of probability, and his own judgmental heuristics and biases.
