Confidence Intervals Confidence interval for sample mean

Confidence Intervals

The CLT tells us: as the sample size $n$ increases, the sample mean $\bar{X}$ is approximately Normal with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. Thus, we have a standard normal variable

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

If the underlying population is Normally distributed, we don't need the CLT or a large sample size for the sample mean to be Normally distributed: normality is guaranteed.

___________________________________________________________________________________

Copyright Prof. Vanja Dukic, Applied Mathematics, CU-Boulder

STAT 4000/5000

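Confidence interval for sample mean

Because the area under the standard normal curve between $-1.96$ and $1.96$ is 0.95, we know:

$$P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$$

This is equivalent to:

$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$

which can be interpreted as follows: the probability that the interval

$$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$

includes the true mean $\mu$ is 95%.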
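
The 0.95 area quoted above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the choice of Python is an assumption made for illustration (the slides themselves refer to R's `qnorm`).

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
Z = NormalDist(mu=0.0, sigma=1.0)

# Area under the standard normal curve between -1.96 and 1.96
area = Z.cdf(1.96) - Z.cdf(-1.96)
print(f"P(-1.96 < Z < 1.96) = {area:.4f}")
```

The printed area should agree with 0.95 to about three decimal places, since 1.96 is itself a rounded critical value.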

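The interval

$$\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$

is thus called the 95% confidence interval for the mean. This interval varies from sample to sample, as the sample mean varies. So the interval itself is a random interval: its bounds are random variables.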

The CI is centered at the sample mean $\bar{X}$ and extends $1.96\,\sigma/\sqrt{n}$ to each side of the sample mean.

Thus the interval's width is $2(1.96)\,\sigma/\sqrt{n}$ and is not random; only the interval boundaries are random.
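
To make the "fixed width, random boundaries" point concrete, here is a small Python sketch. The function name `known_sigma_ci` and the numeric values are hypothetical, chosen only for illustration; they do not come from the slides.

```python
from math import sqrt

def known_sigma_ci(xbar, sigma, n, z=1.96):
    """Return (lower, upper) for xbar +/- z * sigma / sqrt(n), with sigma known."""
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical values for illustration only (not from the slides)
sigma, n = 2.0, 31
for xbar in (79.5, 80.0, 80.6):
    lower, upper = known_sigma_ci(xbar, sigma, n)
    # The center moves with xbar; the width 2(1.96)sigma/sqrt(n) does not
    print(f"xbar={xbar}: ({lower:.3f}, {upper:.3f}), width={upper - lower:.3f}")
```

Each line of output shows a different interval, but the width is the same every time because it depends only on $\sigma$, $n$, and the critical value.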

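Basic Properties of Confidence Intervals

For a given sample, the CI can be expressed either as

$$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$$

or as

$$\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}$$

A concise expression for the interval is $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$, where $-$ gives the left endpoint (lower limit) and $+$ gives the right endpoint (upper limit).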

Interpreting a Confidence Level

We started with an event (that the random interval captures the true value $\mu$) whose probability was 0.95. It is tempting to say that $\mu$ lies within this fixed interval with probability 0.95. But $\mu$ is a constant (unfortunately unknown to us). It is therefore incorrect to write the statement

$P(\mu \text{ lies in } (a, b)) = 0.95$, since $\mu$ either is in $(a, b)$ or isn't. Basically, $\mu$ is not random (it's a constant), so it can't have a probability associated with its behavior.

Instead, a correct interpretation of "95% confidence" relies on the long-run relative frequency interpretation of probability. To say that an event A has probability 0.95 is to say that if the same experiment is performed over and over again, in the long run A will occur 95% of the time. So the right interpretation is to say that in repeated sampling, 95% of the confidence intervals obtained from all samples will actually contain $\mu$. The other 5% of the intervals will not.

Example: the vertical line cuts the measurement axis at the true (but unknown) value of $\mu$.

Figure: One hundred 95% CIs (asterisks identify intervals that do not include $\mu$).

Notice that 7 of the 100 intervals shown fail to contain $\mu$. In the long run, only 5% of the intervals so constructed would fail to contain $\mu$. According to this interpretation, the confidence level is not a statement about any particular interval, e.g. (79.3, 80.7). Instead it pertains to what would happen if a very large number of like intervals were to be constructed using the same CI formula.
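
The long-run interpretation can be illustrated by simulation. The sketch below is in Python (an assumption; the slides use R) and draws repeated samples from a Normal population with known $\sigma$, then counts how often the 95% interval contains $\mu$. The population parameters and the function name `coverage` are hypothetical.

```python
import random
from math import sqrt

def coverage(mu, sigma, n, n_intervals, seed=0):
    """Fraction of 95% CIs (sigma known) that contain the true mean mu."""
    rng = random.Random(seed)
    half_width = 1.96 * sigma / sqrt(n)   # fixed: does not depend on the sample
    hits = 0
    for _ in range(n_intervals):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / n_intervals

# Hypothetical population parameters, chosen only for illustration
print(coverage(mu=80.0, sigma=2.0, n=31, n_intervals=100))    # short run: may differ from 0.95
print(coverage(mu=80.0, sigma=2.0, n=31, n_intervals=20000))  # long run: close to 0.95
```

With only 100 intervals the observed fraction can easily differ from 0.95 by a few percentage points, just as 7 of the 100 intervals in the figure miss $\mu$; with many more intervals the fraction settles near 0.95.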

Other Levels of Confidence

A probability of $1 - \alpha$ is achieved by using $z_{\alpha/2}$ in place of 1.96:

$$P(-z_{\alpha/2} \le Z < z_{\alpha/2}) = 1 - \alpha$$
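
As a rough illustration, the critical value $z_{\alpha/2}$ can be computed for several confidence levels. This Python sketch uses the standard-library `NormalDist.inv_cdf`, which plays the same role as R's `qnorm`; the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha):
    """z_{alpha/2}: the point with upper-tail area alpha/2 under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.10, 0.05, 0.01):
    print(f"{100 * (1 - alpha):.0f}% confidence: z = {z_critical(alpha):.3f}")
```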

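A $100(1-\alpha)\%$ confidence interval for the mean $\mu$ when the value of $\sigma$ is known is given by

$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$

or, equivalently, by

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

The formula for the CI can also be expressed in words as: point estimate $\pm$ (z critical value)(standard error).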

Example

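A sample of 40 units is selected and the diameter measured for each one. The sample mean diameter is 5.426 mm, and the standard deviation of measurements is 0.1 mm.

Let's calculate a confidence interval for the true average diameter using a confidence level of 90%. This requires that $100(1-\alpha) = 90$, from which $\alpha = 0.10$.

Using qnorm(0.95) (equivalently, -qnorm(0.05), since qnorm(0.05) itself is negative), $z_{\alpha/2} = z_{0.05} = 1.645$ (corresponding to a cumulative z-curve area of 0.95).

The desired interval is then

$$5.426 \pm (1.645)\frac{0.1}{\sqrt{40}} = 5.426 \pm 0.026 = (5.400,\ 5.452)$$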
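
The arithmetic for the diameter example can be reproduced in a few lines. This is a sketch in Python (an assumption; the slides use R), where `NormalDist().inv_cdf(0.95)` plays the role of R's `qnorm(0.95)`.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, sigma = 40, 5.426, 0.1   # values from the example above
alpha = 0.10                      # 90% confidence level

z = NormalDist().inv_cdf(1 - alpha / 2)   # upper alpha/2 critical value
half_width = z * sigma / sqrt(n)
lower, upper = xbar - half_width, xbar + half_width
print(f"z = {z:.3f}, interval = ({lower:.3f}, {upper:.3f}) mm")
```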

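Interval width

Since the 95% interval extends $1.96\,\sigma/\sqrt{n}$ to each side of $\bar{x}$, the width of the interval is

$$2(1.96)\frac{\sigma}{\sqrt{n}} = 3.92\frac{\sigma}{\sqrt{n}}$$

Similarly, the width of the 99% interval is (using -qnorm(0.005), which gives $z_{0.005} \approx 2.58$)

$$2(2.58)\frac{\sigma}{\sqrt{n}} = 5.16\frac{\sigma}{\sqrt{n}}$$

We have more confidence that the 99% interval includes the true value precisely because it is wider.

The higher the desired degree of confidence, the wider the resulting interval will be.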
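
The trade-off between confidence and width can be tabulated directly. The Python sketch below (an illustrative assumption, with a hypothetical helper `width_multiplier`) reports the width as a multiple of $\sigma/\sqrt{n}$. Note that it uses the unrounded critical value, so the 99% multiplier comes out slightly below the 5.16 obtained from the rounded value 2.58.

```python
from statistics import NormalDist

def width_multiplier(confidence):
    """Return 2 * z_{alpha/2}, the CI width in units of sigma / sqrt(n)."""
    alpha = 1 - confidence
    return 2 * NormalDist().inv_cdf(1 - alpha / 2)

for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%}: width = {width_multiplier(confidence):.2f} * sigma/sqrt(n)")
```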

Sample size computation

For each desired confidence level and interval width, we can determine the necessary sample size.

Example: A response time is Normally distributed with standard deviation 25 millisec. A new system has been installed, and we wish to estimate the true average response time $\mu$ for the new environment.

Assuming that response times are still normally distributed with $\sigma = 25$, what sample size is necessary to ensure that the resulting 95% CI has a width of (at most) 10?

Example (cont'd)

The sample size $n$ must satisfy

$$10 = 2(1.96)\frac{25}{\sqrt{n}}$$

Rearranging this equation gives

$$\sqrt{n} = \frac{2(1.96)(25)}{10} = 9.80$$

So

$$n = (9.80)^2 = 96.04$$

Since $n$ must be an integer, a sample size of 97 is required.
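
The same rearrangement can be written as a small function. This is a Python sketch (an assumption; the helper name `required_n` is hypothetical) that rounds up to the next integer so the width requirement is met.

```python
from math import ceil

def required_n(sigma, width, z=1.96):
    """Smallest integer n with 2 * z * sigma / sqrt(n) <= width."""
    return ceil((2 * z * sigma / width) ** 2)

# Response-time example: sigma = 25, desired 95% CI width of at most 10
print(required_n(sigma=25, width=10))
```

Rounding up rather than to the nearest integer matters here: $n = 96$ would give a width slightly larger than 10.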

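Unknown mean and variance

We know that

- a CI for the mean $\mu$ of a normal distribution
- a large-sample CI for $\mu$ for any distribution

with a confidence level of $100(1-\alpha)\%$ is:

$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

A practical difficulty is the value of $\sigma$, which will rarely be known. Instead we work with the standardized variable

$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$

where the sample standard deviation $S$ has replaced $\sigma$.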

Previously, there was randomness only in the numerator of $Z$ by virtue of $\bar{X}$, the estimator. In the new standardized variable, both $\bar{X}$ and $S$ vary in value from one sample to another.

Thus the distribution of this new variable should be wider than the Normal to reflect the extra uncertainty. This is indeed true when $n$ is small. However, for large $n$ the substitution of $S$ for $\sigma$ adds little extra variability, so this variable also has approximately a standard normal distribution.
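
A quick simulation gives a feel for the extra spread. The Python sketch below (an assumption; the population, sample sizes, and the helper `tail_fraction` are hypothetical) estimates how often the standardized variable with $S$ in the denominator falls outside $\pm 1.96$. For a standard normal variable that fraction would be about 0.05.

```python
import random
from math import sqrt

def tail_fraction(n, reps=10000, seed=1):
    """Fraction of samples with |(xbar - mu) / (s / sqrt(n))| > 1.96."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0   # hypothetical normal population
    count = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        # Sample standard deviation with the n - 1 divisor
        s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
        if abs((xbar - mu) / (s / sqrt(n))) > 1.96:
            count += 1
    return count / reps

print(tail_fraction(5))    # small n: noticeably above 0.05
print(tail_fraction(50))   # larger n: much closer to 0.05
```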

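A Large-Sample Interval for $\mu$

If $n$ is sufficiently large, the standardized variable

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

has approximately a standard normal distribution. This implies that

$$\bar{x} \pm z_{\alpha/2}\frac{s}{\sqrt{n}}$$

is a large-sample confidence interval for $\mu$ with confidence level approximately $100(1-\alpha)\%$. This formula is valid regardless of the population distribution.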

In words, the CI is: point estimate of $\mu$ $\pm$ (z critical value)(estimated standard error of the mean).

Generally speaking, $n > 40$ will be sufficient to justify the use of this interval. This is somewhat more conservative than the rule of thumb for the CLT because of the additional variability introduced by using $S$ in place of $\sigma$.
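
A minimal sketch of the large-sample interval in Python (an assumption; the slides use R). The function name `large_sample_ci` is hypothetical, and the data are generated artificially from a skewed distribution purely to exercise the function.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def large_sample_ci(data, confidence=0.95):
    """xbar +/- z_{alpha/2} * s / sqrt(n); intended for large n (rule of thumb n > 40)."""
    n = len(data)
    xbar, s = mean(data), stdev(data)   # stdev uses the n - 1 divisor
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * s / sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical skewed data (exponential), generated only for illustration
rng = random.Random(42)
data = [rng.expovariate(1.0) for _ in range(200)]
print(large_sample_ci(data))
```

Because the population here is not Normal, the stated confidence level is only approximate; the approximation relies on $n$ being large.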

Small sample intervals for the mean

- The CI for $\mu$ presented in the earlier section is valid provided that $n$ is large.
  - Rule of thumb: $n > 40$
  - The resulting interval can be used whatever the nature of the population distribution.

- The CLT cannot be invoked when $n$ is small.
  - Need to do something else when $n$ ...
