EXPERIMENTAL AND QUASI-EXPERIMENTAL DESIGNS FOR ... - CGIAR

EXPERIMENTAL AND

QUASI-EXPERIMENTAL

DESIGNS FOR GENERALIZED

CAUSAL INFERENCE

William R. Shadish

The University of Memphis

Thomas D. Cook

Northwestern University

Donald T. Campbell

HOUGHTON MIFFLIN COMPANY Boston New York

2002

Experiments and Generalized Causal Inference

Ex·per·i·ment (ĭk-spĕr′ə-mənt): [Middle English from Old French from Latin experimentum, from experiri, to try; see per- in Indo-European Roots.] n. Abbr. exp., expt. 1. a. A test under controlled conditions that is made to demonstrate a known truth, examine the validity of a hypothesis, or determine the efficacy of something previously untried. b. The process of conducting such a test; experimentation. 2. An innovative act or procedure: "Democracy is only an experiment in government" (William Ralph Inge).

Cause (kôz): [Middle English from Old French from Latin causa, reason, purpose.] n. 1. a. The producer of an effect, result, or consequence. b. The one, such as a person, an event, or a condition, that is responsible for an action or a result. v. 1. To be the cause of or reason for; result in. 2. To bring about or compel by authority or force.

To many historians and philosophers, the increased emphasis on experimentation in the 16th and 17th centuries marked the emergence of modern science from its roots in natural philosophy (Hacking, 1983). Drake (1981) cites Galileo's 1612 treatise Bodies That Stay Atop Water, or Move in It as ushering in modern experimental science, but earlier claims can be made favoring William Gilbert's 1600 study On the Loadstone and Magnetic Bodies, Leonardo da Vinci's (1452-1519) many investigations, and perhaps even the 5th-century B.C. philosopher Empedocles, who used various empirical demonstrations to argue against Parmenides (Jones, 1969a, 1969b). In the everyday sense of the term, humans have been experimenting with different ways of doing things from the earliest moments of their history. Such experimenting is as natural a part of our life as trying a new recipe or a different way of starting campfires.


However, the scientific revolution of the 17th century departed in three ways from the common use of observation in natural philosophy at that time. First, it increasingly used observation to correct errors in theory. Throughout history, natural philosophers often used observation in their theories, usually to win philosophical arguments by finding observations that supported their theories. However, they still subordinated the use of observation to the practice of deriving theories from "first principles," starting points that humans know to be true by our nature or by divine revelation (e.g., the assumed properties of the four basic elements of fire, water, earth, and air in Aristotelian natural philosophy). According to some accounts, this subordination of evidence to theory degenerated in the 17th century: "The Aristotelian principle of appealing to experience had degenerated among philosophers into dependence on reasoning supported by casual examples and the refutation of opponents by pointing to apparent exceptions not carefully examined" (Drake, 1981, p. xxi). When some 17th-century scholars then began to use observation to correct apparent errors in theoretical and religious first principles, they came into conflict with religious or philosophical authorities, as in the case of the Inquisition's demands that Galileo recant his account of the earth revolving around the sun. Given such hazards, the fact that the new experimental science tipped the balance toward observation and away from dogma is remarkable. By the time Galileo died, the role of systematic observation was firmly entrenched as a central feature of science, and it has remained so ever since (Harré, 1981).

Second, before the 17th century, appeals to experience were usually based on passive observation of ongoing systems rather than on observation of what happens after a system is deliberately changed. After the scientific revolution in the 17th century, the word experiment (terms in boldface in this book are defined in the Glossary) came to connote taking a deliberate action followed by systematic observation of what occurred afterward. As Hacking (1983) noted of Francis Bacon: "He taught that not only must we observe nature in the raw, but that we must also 'twist the lion's tail', that is, manipulate our world in order to learn its secrets" (p. 149). Although passive observation reveals much about the world, active manipulation is required to discover some of the world's regularities and possibilities (Greenwood, 1989). As a mundane example, stainless steel does not occur naturally; humans must manipulate it into existence. Experimental science came to be concerned with observing the effects of such manipulations.

Third, early experimenters realized the desirability of controlling extraneous influences that might limit or bias observation. So telescopes were carried to higher points at which the air was clearer, the glass for microscopes was ground ever more accurately, and scientists constructed laboratories in which it was possible to use walls to keep out potentially biasing ether waves and to use (eventually sterilized) test tubes to keep out dust or bacteria. At first, these controls were developed for astronomy, chemistry, and physics, the natural sciences in which interest in science first bloomed. But when scientists started to use experiments in areas such as public health or education, in which extraneous influences are harder to control (e.g., Lind, 1753), they found that the controls used in natural science in the laboratory worked poorly in these new applications. So they developed new methods of dealing with extraneous influence, such as random assignment (Fisher, 1925) or adding a nonrandomized control group (Coover & Angell, 1907). As theoretical and observational experience accumulated across these settings and topics, more sources of bias were identified and more methods were developed to cope with them (Dehue, 2000).

Today, the key feature common to all experiments is still to deliberately vary something so as to discover what happens to something else later: to discover the effects of presumed causes. As laypersons we do this, for example, to assess what happens to our blood pressure if we exercise more, to our weight if we diet less, or to our behavior if we read a self-help book. However, scientific experimentation has developed increasingly specialized substance, language, and tools, including the practice of field experimentation in the social sciences that is the primary focus of this book. This chapter begins to explore these matters by (1) discussing the nature of causation that experiments test, (2) explaining the specialized terminology (e.g., randomized experiments, quasi-experiments) that describes social experiments, (3) introducing the problem of how to generalize causal connections from individual experiments, and (4) briefly situating the experiment within a larger literature on the nature of science.

EXPERIMENTS AND CAUSATION

A sensible discussion of experiments requires both a vocabulary for talking about causation and an understanding of key concepts that underlie that vocabulary.

Defining Cause, Effect, and Causal Relationships

Most people intuitively recognize causal relationships in their daily lives. For instance, you may say that another automobile's hitting yours was a cause of the damage to your car; that the number of hours you spent studying was a cause of your test grades; or that the amount of food a friend eats was a cause of his weight. You may even point to more complicated causal relationships, noting that a low test grade was demoralizing, which reduced subsequent studying, which caused even lower grades. Here the same variable (low grade) can be both a cause and an effect, and there can be a reciprocal relationship between two variables (low grades and not studying) that cause each other.

Despite this intuitive familiarity with causal relationships, a precise definition of cause and effect has eluded philosophers for centuries.1 Indeed, the definitions of terms such as cause and effect depend partly on each other and on the causal relationship in which both are embedded. So the 17th-century philosopher John Locke said: "That which produces any simple or complex idea, we denote by the general name cause, and that which is produced, effect" (1975, p. 324) and also: "A cause is that which makes any other thing, either simple idea, substance, or mode, begin to be; and an effect is that, which had its beginning from some other thing" (p. 325). Since then, other philosophers and scientists have given us useful definitions of the three key ideas (cause, effect, and causal relationship) that are more specific and that better illuminate how experiments work. We would not defend any of these as the true or correct definition, given that the latter has eluded philosophers for millennia; but we do claim that these ideas help to clarify the scientific practice of probing causes.

1. Our analysis reflects the use of the word causation in ordinary language, not the more detailed discussions of cause by philosophers. Readers interested in such detail may consult a host of works that we reference in this chapter, including Cook and Campbell (1979).

Cause

Consider the cause of a forest fire. We know that fires start in different ways: a match tossed from a car, a lightning strike, or a smoldering campfire, for example. None of these causes is necessary because a forest fire can start even when, say, a match is not present. Also, none of them is sufficient to start the fire. After all, a match must stay "hot" long enough to start combustion; it must contact combustible material such as dry leaves; there must be oxygen for combustion to occur; and the weather must be dry enough so that the leaves are dry and the match is not doused by rain. So the match is part of a constellation of conditions without which a fire will not result, although some of these conditions can usually be taken for granted, such as the availability of oxygen. A lighted match is, therefore, what Mackie (1974) called an inus condition: "an insufficient but nonredundant part of an unnecessary but sufficient condition" (p. 62; italics in original). It is insufficient because a match cannot start a fire without the other conditions. It is nonredundant only if it adds something fire-promoting that is uniquely different from what the other factors in the constellation (e.g., oxygen, dry leaves) contribute to starting a fire; after all, it would be harder to say whether the match caused the fire if someone else simultaneously tried starting it with a cigarette lighter. It is part of a sufficient condition to start a fire in combination with the full constellation of factors. But that condition is not necessary because there are other sets of conditions that can also start fires.
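The four parts of Mackie's definition can be checked mechanically in a toy model. The sketch below is only an illustration under stated assumptions: the rule inside `fire`, the factor names, and the helper functions are all hypothetical simplifications of the forest-fire example, not anything proposed by Mackie or by the authors.

```python
from itertools import product

# Hypothetical factor names, loosely taken from the forest-fire example.
FACTORS = ("match", "lightning", "dry_leaves", "oxygen")

def fire(match, lightning, dry_leaves, oxygen):
    """Toy rule (an assumption): fire needs fuel, oxygen, and some ignition source."""
    return dry_leaves and oxygen and (match or lightning)

def is_sufficient(conditions):
    """True if setting `conditions` to True guarantees fire whatever the other factors are."""
    free = [f for f in FACTORS if f not in conditions]
    for values in product([False, True], repeat=len(free)):
        world = dict.fromkeys(conditions, True)
        world.update(zip(free, values))
        if not fire(**world):
            return False
    return True

def is_necessary(conditions):
    """True if fire never occurs unless every one of `conditions` holds."""
    for values in product([False, True], repeat=len(FACTORS)):
        world = dict(zip(FACTORS, values))
        if fire(**world) and not all(world[c] for c in conditions):
            return False
    return True

def is_inus(factor, bundle):
    """Check the four parts of the inus definition for `factor` within `bundle`."""
    rest = [c for c in bundle if c != factor]
    return (
        not is_sufficient([factor])     # insufficient on its own
        and not is_sufficient(rest)     # nonredundant: the bundle fails without it
        and is_sufficient(bundle)       # the bundle as a whole is sufficient
        and not is_necessary(bundle)    # but the bundle is unnecessary
    )

match_bundle = ["match", "dry_leaves", "oxygen"]
print(is_inus("match", match_bundle))     # match within its own constellation
print(is_inus("match", list(FACTORS)))    # match alongside a second ignition source
```

In this toy world the match qualifies as an inus condition within its own constellation, but not when lightning is also in the bundle, because the remaining factors are then sufficient without it. That parallels the text's remark about someone simultaneously using a cigarette lighter.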

A research example of an inus condition concerns a new potential treatment for cancer. In the late 1990s, a team of researchers in Boston headed by Dr. Judah Folkman reported that a new drug called Endostatin shrank tumors by limiting their blood supply (Folkman, 1996). Other respected researchers could not replicate the effect even when using drugs shipped to them from Folkman's lab. Scientists eventually replicated the results after they had traveled to Folkman's lab to learn how to properly manufacture, transport, store, and handle the drug and how to inject it in the right location at the right depth and angle. One observer labeled these contingencies the "in-our-hands" phenomenon, meaning "even we don't know which details are important, so it might take you some time to work it out" (Rowe, 1999, p. 732). Endostatin was an inus condition. It was insufficient cause by itself, and its effectiveness required it to be embedded in a larger set of conditions that were not even fully understood by the original investigators.

Most causes are more accurately called inus conditions. Many factors are usually required for an effect to occur, but we rarely know all of them and how they relate to each other. This is one reason that the causal relationships we discuss in this book are not deterministic but only increase the probability that an effect will occur (Eells, 1991; Holland, 1994). It also explains why a given causal relationship will occur under some conditions but not universally across time, space, human populations, or other kinds of treatments and outcomes that are more or less related to those studied. To different degrees, all causal relationships are context dependent, so the generalization of experimental effects is always at issue. That is why we return to such generalizations throughout this book.

Effect

We can better understand what an effect is through a counterfactual model that goes back at least to the 18th-century philosopher David Hume (Lewis, 1973, p. 556). A counterfactual is something that is contrary to fact. In an experiment, we observe what did happen when people received a treatment. The counterfactual is knowledge of what would have happened to those same people if they simultaneously had not received treatment. An effect is the difference between what did happen and what would have happened.

We cannot actually observe a counterfactual. Consider phenylketonuria (PKU), a genetically-based metabolic disease that causes mental retardation unless treated during the first few weeks of life. PKU is the absence of an enzyme that would otherwise prevent a buildup of phenylalanine, a substance toxic to the nervous system. When a restricted phenylalanine diet is begun early and maintained, retardation is prevented. In this example, the cause could be thought of as the underlying genetic defect, as the enzymatic disorder, or as the diet. Each implies a different counterfactual. For example, if we say that a restricted phenylalanine diet caused a decrease in PKU-based mental retardation in infants who are phenylketonuric at birth, the counterfactual is whatever would have happened had these same infants not received a restricted phenylalanine diet. The same logic applies to the genetic or enzymatic version of the cause. But it is impossible for these very same infants simultaneously to both have and not have the diet, the genetic disorder, or the enzyme deficiency.

So a central task for all cause-probing research is to create reasonable approximations to this physically impossible counterfactual. For instance, if it were ethical to do so, we might contrast phenylketonuric infants who were given the diet with other phenylketonuric infants who were not given the diet but who were similar in many ways to those who were (e.g., similar race, gender, age, socioeconomic status, health status). Or we might (if it were ethical) contrast infants who were not on the diet for the first 3 months of their lives with those same infants after they were put on the diet starting in the 4th month. Neither of these approximations is a true counterfactual. In the first case, the individual infants in the treatment condition are different from those in the comparison condition; in the second case, the identities are the same, but time has passed and many changes other than the treatment have occurred to the infants (including permanent damage done by phenylalanine during the first 3 months of life). So two central tasks in experimental design are creating a high-quality but necessarily imperfect source of counterfactual inference and understanding how this source differs from the treatment condition.
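A small simulation may help make the two imperfect approximations concrete. Everything in the sketch below is a made-up assumption for illustration (the outcome scale, the constant effect of 10, the `drift` term, and the rule that healthier units receive treatment); it is not data from any study. Unlike a real researcher, the simulation can see both potential outcomes for every unit, so the true effect is known by construction.

```python
import random

random.seed(0)  # arbitrary seed so the sketch is repeatable

# Hypothetical units: each carries BOTH outcomes, which no real study can observe.
units = []
for _ in range(10_000):
    health = random.gauss(0.0, 1.0)                  # made-up background trait
    y_untreated = 50.0 + 5.0 * health + random.gauss(0.0, 1.0)
    y_treated = y_untreated + 10.0                   # assumed constant effect of 10
    units.append((health, y_untreated, y_treated))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# The effect as defined in the text: what did happen minus what would have happened.
true_effect = mean(t - u for _, u, t in units)

# Approximation 1: contrast different units. Here the healthier units happen to be
# the treated ones, so the comparison group is a poor stand-in for the counterfactual.
treated = [t for h, _, t in units if h > 0]
comparison = [u for h, u, _ in units if h <= 0]
between_units = mean(treated) - mean(comparison)

# Approximation 2: contrast the same units before and after treatment. Time has
# passed, so an assumed change unrelated to treatment is mixed into the contrast.
drift = 3.0
before_after = mean((t + drift) - u for _, u, t in units)

print(f"true effect:        {true_effect:.2f}")
print(f"between-unit diff:  {between_units:.2f}")
print(f"before-after diff:  {before_after:.2f}")
```

Both contrasts overstate the effect here, each for the reason the text gives: the first because the compared units differ, the second because something other than treatment changed over time. With other assumptions the bias could just as easily run in the opposite direction.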

This counterfactual reasoning is fundamentally qualitative because causal inference, even in experiments, is fundamentally qualitative (Campbell, 1975; Shadish, 1995a; Shadish & Cook, 1999). However, some of these points have been formalized by statisticians into a special case that is sometimes called Rubin's Causal Model (Holland, 1986; Rubin, 1974, 1977, 1978, 1986). This book is not about statistics, so we do not describe that model in detail (West, Biesanz, & Pitts [2000] do so and relate it to the Campbell tradition). A primary emphasis of Rubin's model is the analysis of cause in experiments, and its basic premises are consistent with those of this book.2 Rubin's model has also been widely used to analyze causal inference in case-control studies in public health and medicine (Holland & Rubin, 1988), in path analysis in sociology (Holland, 1986), and in a paradox that Lord (1967) introduced into psychology (Holland & Rubin, 1983); and it has generated many statistical innovations that we cover later in this book. It is new enough that critiques of it are just now beginning to appear (e.g., Dawid, 2000; Pearl, 2000). What is clear, however, is that Rubin's is a very general model with obvious and subtle implications. Both it and the critiques of it are required material for advanced students and scholars of cause-probing methods.

Causal Relationship

How do we know if cause and effect are related? In a classic analysis formalized by the 19th-century philosopher John Stuart Mill, a causal relationship exists if (1) the cause preceded the effect, (2) the cause was related to the effect, and (3) we can find no plausible alternative explanation for the effect other than the cause. These three characteristics mirror what happens in experiments in which (1) we manipulate the presumed cause and observe an outcome afterward; (2) we see whether variation in the cause is related to variation in the effect; and (3) we use various methods during the experiment to reduce the plausibility of other explanations for the effect, along with ancillary methods to explore the plausibility of those we cannot rule out (most of this book is about methods for doing this).

2. However, Rubin's model is not intended to say much about the matters of causal generalization that we address in this book.


Hence experiments are well-suited to studying causal relationships. No other scientific method regularly matches the characteristics of causal relationships so well. Mill's analysis also points to the weakness of other methods. In many correlational studies, for example, it is impossible to know which of two variables came first, so defending a causal relationship between them is precarious. Understanding this logic of causal relationships and how its key terms, such as cause and effect, are defined helps researchers to critique cause-probing studies.

Causation, Correlation, and Confounds

A well-known maxim in research is: Correlation does not prove causation. This is so because we may not know which variable came first nor whether alternative explanations for the presumed effect exist. For example, suppose income and education are correlated. Do you have to have a high income before you can afford to pay for education, or do you first have to get a good education before you can get a better paying job? Each possibility may be true, and so both need investigation. But until those investigations are completed and evaluated by the scholarly community, a simple correlation does not indicate which variable came first. Correlations also do little to rule out alternative explanations for a relationship between two variables such as education and income. That relationship may not be causal at all but rather due to a third variable (often called a confound), such as intelligence or family socioeconomic status, that causes both high education and high income. For example, if high intelligence causes success in education and on the job, then intelligent people would have correlated education and incomes, not because education causes income (or vice versa) but because both would be caused by intelligence. Thus a central task in the study of experiments is identifying the different kinds of confounds that can operate in a particular research area and understanding the strengths and weaknesses associated with various ways of dealing with them.
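The confound argument can be illustrated with simulated data. In the sketch below, all variable names, sample sizes, and coefficients are assumptions chosen for illustration; by construction, education and income have no causal link to each other and are both driven by a single third variable. The `pearson` helper is a plain implementation of the usual correlation coefficient.

```python
import random

random.seed(1)  # arbitrary seed so the sketch is repeatable

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

n = 20_000
# Hypothetical confound (e.g., intelligence) that drives both variables.
confound = [random.gauss(0.0, 1.0) for _ in range(n)]
# Neither variable appears in the other's equation: no causal link between them.
education = [c + random.gauss(0.0, 1.0) for c in confound]
income = [c + random.gauss(0.0, 1.0) for c in confound]

raw_r = pearson(education, income)

# Hold the confound roughly constant by looking only at units in a narrow band.
band = [(e, i) for c, e, i in zip(confound, education, income) if abs(c) < 0.1]
within_r = pearson([e for e, _ in band], [i for _, i in band])

print(f"correlation overall:             {raw_r:.2f}")
print(f"correlation with confound fixed: {within_r:.2f}")
```

The overall correlation is substantial even though neither variable influences the other, and it largely disappears once the confound is held nearly constant. Restricting to a band works here only because the confound is measured and known, which is rarely the case in practice.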

Manipulable and Nonmanipulable Causes

In the intuitive understanding of experimentation that most people have, it makes sense to say, "Let's see what happens if we require welfare recipients to work"; but it makes no sense to say, "Let's see what happens if I change this adult male into a three-year-old girl." And so it is also in scientific experiments. Experiments explore the effects of things that can be manipulated, such as the dose of a medicine, the amount of a welfare check, the kind or amount of psychotherapy, or the number of children in a classroom. Nonmanipulable events (e.g., the explosion of a supernova) or attributes (e.g., people's ages, their raw genetic material, or their biological sex) cannot be causes in experiments because we cannot deliberately vary them to see what then happens. Consequently, most scientists and philosophers agree that it is much harder to discover the effects of nonmanipulable causes.
