Probability and Statistics in Microsoft Excel™
[Pages:11]aguideto
ProbabilityandStatisticsin MicrosoftExcelTM
Resourcestosupportthelearningofmathematics, statisticsandORinhighereducation.
mathstore.ac.uk TheStatisticalEducationthroughProblemSolving(STEPS) glossary stats.gla.ac/steps/glossary
ProbabilityandStatisticsinMicrosoftExcelTM
Excelprovidesmorethan100functionsrelatingtoprobabilityandstatistics.Italsohasafacilityfor constructingawiderangeofchartsandgraphsfordisplayingdata.Thisleafletprovidesaquickreference guidetoassistyouinharnessingExcel'sstatisticalcapability.Exceptwhereindicated,thefeaturesincluded hereareavailableinExcelVersions4.0andabove.Almostalltheinstructionsherealsoapplytothe spreadsheetfacilityinOpenOffice();anyslightvariationsincommands shouldbeobvioustotheuser. Excelisnotdesignedforstatisticalcomputing.Ifyourequirestatisticalanalysisbeyonddatavalidationand manipulation,tabulation,presentationandcalculationofsummarystatistics,youareadvisedtousea bespokestatisticalpackagesuchasMinitaborSPSS.
ExcelhasanAnalysisToolpakoptional"add-in"facilitythatincludesmacrosforcarryingoutmany elementarystatisticalanalyses.Theinstructionsforinstallationofthisadd-invarywiththeversionofExcel --usetheHelpfacilityinExcelforfurtherinformationonthis.Thisadd-infacilityisnotusedinthisleaflet. Therearetworeasonswhythisadd-inshouldbeusedwithcare: ? Unlikeotherspreadsheetfunctionality,whichensuresthatcalculationsautomaticallyupdateinthe
lightofchangeselsewhereintheworkbook,theoutputfromtheadd-inisnotdynamicallylinkedtothe sourcedata.Henceifanyofthedatachangetheadd-inmustberunagaintoobtainupdatedoutput. ? Outputfromtheadd-incanbemisleading(see). Thereareothercommerciallyavailableadd-insthatmakeuseofExcel'sfamiliaruserinterfacebut supplementitsstatisticalfunctionality.Examplesinclude:
Analyse-it?
R-Excel
Unistat
XLSTAT
StatTools
Usingthisleaflet
Supposeyouhaveasampleofthreedata,10.4,11.2and16.4,thatyouhaveenteredintocellsA2:A4ona
worksheet.InExcelafunction,e.g.SUM,canbeappliedtothesedatainoneoffourways:
=SUM(10.4,11.2,16.4)
=SUM(A2,A3,A4)
=SUM(A2:A4)
=SUM(x)
wherexisthenameattachedtorangeA2:A4.
Inthisleaflet,forsimplicity,wehavechosentorefertonamed ranges.Tonamearange,simplyhighlighttherangeofcells,clickin theNameBoxonthefarleftoftheFormulaBar,typeintherequired name,e.g.x,thenpressEnter.InExcel2007namescanbemanaged viaFormulas>NameManager.
Ifyouprefernottousenamestheninwhatfollowssimplyreplacethe
nameoftherange,e.g.x,bytherangeaddress,e.g.A2:A4.
DescriptiveStatistics
Assumingasampleofdatainrangex
Sampletotal,x
=SUM(x)
Samplesize,n
=COUNT(x)
Samplemean,x/n
=AVERAGE(x)
Samplevariance,s2
=VAR(x)
Samplestandarddeviation,s
=STDEV(x)
Meansquareddeviation
=VARP(x)
Rootmeansquareddeviation
=STDEVP(x)
Correctedsumofsquares,Sxx
Rawsumofsquares,x2
=DEVSQ(x) =SUMSQ(x)
Minimumvalue
=MIN(x)
Maximumvalue
=MAX(x)
Range
=MAX(x)-MIN(x)
LowerQuartile,Q1*
=QUARTILE(x,1)
Median,Q2
=MEDIAN(x)
UpperQuartile,Q3*
=QUARTILE(x,3)
Interquartilerange,IQR
=QUARTILE(x,3)-QUARTILE(x,1)
KthPercentile
=PERCENTILE(x,K%) whereKisanumberbetween0and100
Mode
=MODE(x)
*Note:Thereareseveraldifferentdefinitionsfortheupperandlowerquartiles,sothevaluescalculatedby
Excelmaynotagreewithyourtextbookorotherstatisticalcalculationtools.
Boxplot
See
GroupedFrequencyData
Assumingafrequencydistributionwithclassmidpointsstoredinrangexandfrequenciesinrangef:
Samplesize,n
=SUM(f)
Sampletotal,fx
=SUMPRODUCT(f,x)
Samplemean,fx/n
=SUMPRODUCT(f,x)/SUM(f)
Correctedsumofsquares,Sxx =SUMPRODUCT(f,x,x)-SUMPRODUCT(f,x)^2/SUM(f)
Samplevariance,s2
=(SUMPRODUCT(f,x,x)-SUMPRODUCT(f,x)^2/SUM(f))/(SUM(f)-1)
Samplestandarddeviation,s =SQRT(Samplevariance)
GraphicalRepresentations
Exceloffersawiderangeofcharttypesfordisplayingdata.Manyoftheseareover-elaborate.In particular,3-Deffectscanbemisleadingandshouldbeavoided.
InExcel2007toconstructachartforyourdata: 1. Selecttherangecontainingyourdata,includinganyroworcolumnlabels. 2. Onthemainribbon,clickontheInserttab. 3. UndertheChartsgroupoficons,selectthecharttyperequired,thenthepreferredchartsubtype. 4. UnderChartToolsonthemainribbon,usetheDesign,LayoutandFormattabstocustomisethechart.
InearlierversionsofExcel,selectthedatarangeandthenInsert>CharttoinvoketheChartWizard.
PermutationsandCombinations
Numberofdifferentcombinationsofmobjectsselectedfromnobjects
nCm
=COMBIN(n,m)
Numberofdifferentpermutationsofmobjectsselectedfromnobjects
nPm
=PERMUT(n,m)
StandardProbabilityDistributions
AssumingarandomvariableXandconstantsaandb
Binomial
Bin(n,p)
P(X=a)
=BINOMDIST(a,n,p,FALSE)
P(Xa)
=BINOMDIST(a,n,p,TRUE)
Geometric Geom(p)
P(X=a)
=BINOMDIST(1,a,p,FALSE)/a
P(Xa)
=1-BINOMDIST(0,a,p,FALSE)
Poisson
Po( )
P(X=a)
=POISSON(a,lambda,FALSE)
P(Xa)
=POISSON(a,lambda,TRUE)
Pascal
Pasc(n,p)
P(X=a) =NEGBINOMDIST(a-n,n,p)
P(Xa) =BETADIST(p,n,a-n+1)/BETADIST(1,n,a-
n+1)
Normal
N(? , 2)
f(a)
=NORMDIST(a,mu,sigma,FALSE)
P(Xa)
=NORMDIST(a,mu,sigma,TRUE)
P(aXb)
=NORMDIST(b,mu,sigma,TRUE)
-NORMDIST(a,mu,sigma,TRUE)
P(Xb)
=1-NORMDIST(b,mu,sigma,TRUE)
Exponential Expon()
f(a)
=EXPONDIST(a,theta,FALSE)
P(Xa)
=EXPONDIST(a,theta,TRUE)
P(aXb)
=EXP(-a*theta)-EXP(-b*theta)
P(Xb)
=EXP(-b*theta)
Gamma
Ga( , )
f(a)
=GAMMADIST(a,alpha,beta,
FALSE)
P(Xa)
=GAMMADIST(a,alpha,beta,TRUE)
P(aXb)
=GAMMADIST(b,alpha,beta,
TRUE)
-GAMMADIST(a,alpha,beta,
TRUE)
P(Xb)
=1-GAMMADIST(b,alpha,beta,
TRUE)
TestStatisticsforPopularSignificanceTests
Onesampletestofamean Assumingasampleofdatainrangex,drawnfromapopulationwithmean?andstandarddeviation:
H0:?=?0H1:??0
Teststatistic,z
=(AVERAGE(x)-mu0)/(sigma/SQRT(COUNT(x)))
assumingknown
Teststatistic,t
=(AVERAGE(x)-mu0)/(STDEV(x)/SQRT(COUNT(x)))
assumingunknown
Onesampletestofavariance
Assumingasampleofdatainrangex,drawnfromapopulationwithmean?andstandarddeviation:
H0:2=02H1:2> 02
Teststatistic,2
=DEVSQ(x)/sigma0^2
Twosampletestofdifferencebetweenmeans Assumingtwosamplesofdatainrangesxandy,drawnfrompopulationswithmeans?1and?2andequal
variances:
H0:?1-?2=cH1:?1-?2c Estimatetheunknowncommonstandarddeviationbythepooledestimate:
s
=SQRT((DEVSQ(x)+DEVSQ(y))/(COUNT(x)+COUNT(y)-2))
Teststatistic,t =(AVERAGE(x)-AVERAGE(y)-c)/(s*SQRT(1/COUNT(x)+1/COUNT(y)))
Twosampletestofratioofvariances Assumingtwosamplesofdatainrangesxandy,drawnfrompopulationswithvariances12and22: H0:12=22H1:12>22
Teststatistic,F =VAR(x)/VAR(y)
Chi-squaredtestofassociation
Assumingatwo-waycontingencytableofobservedfrequencies.
H0:rowfactorindependentofcolumnfactor H1:someassociationbetweenrowandcolumnfactors Thesuggestedlayoutbelowfora4x2tablecaneasilybemodifiedfortablesofothersizes.
A1: A3: C1: G3: C8: C9: C10:
=SUM(C3:D6)
=SUM(C3:D3)
=SUM(C3:C6)
=$A3*C$1/$A$1
=CHITEST(C3:D6,G3:H6)
=(COUNT(A3:A6)-1)*(COUNT(C1:D1)-1)
=CHIINV(C8,C9)
copydowntoA6 copyacrosstoD1 copyintoG3:H6
CriticalValuesandP-valuesforStatisticalTests
Therearetwoapproachestoconductingsignificancetests.Someanalystsliketocomparetheteststatistic
withthecriticalvalueforagivensignificancelevel;othersprefertocalculatetheP-valuecorrespondingto
theteststatistic.Excelcanbeusedforeithermethod.
Assumingsignificancelevel,(typically=5%or0.05):
Two-tailedz-test Uppertailcriticalvalue =NORMSINV(1-alpha/2) P-valueforgivenz =2*(1-NORMSDIST(ABS(z))) Two-tailedt-testwithvdegreesoffreedom Uppertailcriticalvalue =TINV(alpha,v) P-valueforgivent =TDIST(ABS(t),v,2)
One-tailed 2-testwithvdegreesoffreedom Uppertailcriticalvalue =CHIINV(alpha,v) P-valueforgivenchisquared=CHIDIST(chisquared,v) One-tailedF-testwithv1degreesoffreedomin thenumeratorandv2inthedenominator Uppertailcriticalvalue =FINV(alpha,v1,v2) P-valueforgivenF =FDIST(F,v1,v2)
ConfidenceLimits
Assumingdegreeofconfidence100(1-)% (e.g.for95%confidence=0.05):
One-samplestatistics,withdatainrangex
For ?(known)
Lowerlimit=AVERAGE(x)-NORMSINV(1-alpha/2)*sigma/SQRT(COUNT(x))
or =AVERAGE(x)-CONFIDENCE(alpha,sigma,COUNT(x))
For ?(unknown)
Upperlimit=AVERAGE(x)+NORMSINV(1-alpha/2)*sigma/SQRT(COUNT(x)) or =AVERAGE(x)+CONFIDENCE(alpha,sigma,COUNT(x))
Lowerlimit=AVERAGE(x)-TINV(alpha, COUNT(x)-1)*STDEV(x)/SQRT(COUNT(x))
Upperlimit=AVERAGE(x)+TINV(alpha, COUNT(x)-1)*STDEV(x)/SQRT(COUNT(x))
For 2
Lowerlimit=(DEVSQ(x)/CHIINV(alpha/2,COUNT(x))-1)
Upperlimit=(DEVSQ(x)/CHIINV(1-alpha/2,COUNT(x))-1)
Two-samplestatistics,withdataforthefirstsampleinrangex,andthesecondsampleinrangey
For ?x-?y (xknown,y known)
Lowerlimit =AVERAGE(x)-AVERAGE(y)-NORMSINV(1-alpha/2)*SQRT(sigmax^2/COUNT(x)+sigmay^2/COUNT(y))
Upperlimit =AVERAGE(x)-AVERAGE(y)+NORMSINV(1-alpha/2)*SQRT(sigmax^2/COUNT(x)+sigmay^2/COUNT(y))
For ?x-?y (xandy unknownbutassumedequal)
Estimatetheunknowncommonstandarddeviationbythepooledestimate:
s
=SQRT((DEVSQ(x)+DEVSQ(y))/(COUNT(x)+COUNT(y)-2))
Lowerlimit =AVERAGE(x)-AVERAGE(y)-TINV(alpha,COUNT(x)+COUNT(y)-2)*s*SQRT(1/COUNT(x)+1/COUNT(y))
Upperlimit =AVERAGE(x)-AVERAGE(y)+TINV(alpha,COUNT(x)+COUNT(y)-2)*s*SQRT(1/COUNT(x)+1/COUNT(y))
For x2/ y2
Lowerlimit=DEVSQ(x)/DEVSQ(y)/FINV(alpha/2,COUNT(x)-1,COUNT(y)-1)
Upperlimit(DEVSQ(x)/DEVSQ(y)/FINV(1-alpha/2,COUNT(x)-1,COUNT(y)-1)
SimpleLinearRegression
InExcelVersions5andabove,aregressionline(ortrendline)canbeaddedtoascatterplotbyright-clicking ononeoftheplottedpointsandselectingAddTrendlinefromtheshortcutmenu.Bothlinearandavariety ofnon-linearmodelsmaybefittedtothedata.Theequationofthefittedmodelmaybedisplayed, togetherwiththevalueofthecoefficientofdetermination,R2.Therearealsooptionstoextrapolatethe trendlineineitherdirection,ortoforcethetrendlinetohaveaspecificintercept.
Thetrendlineapproachispurelygraphical.Tocalculatepredictions,regressionfunctionsmustbeused.
................
................
In order to avoid copyright disputes, this page is only a partial summary.
To fulfill the demand for quickly locating and searching documents.
It is intelligent file search solution for home and business.
Related download
- effect size confidence interval and statistical
- measuring evaluation results with microsoft excel
- running a t test in excel
- how to run statistical tests in excel
- statistics with ms excel ar
- calculating and displaying regression statistics in excel
- statistical power analysis with microsoft excel normal
- probability and statistics in microsoft excel
Related searches
- probability and statistics problem solver
- probability and statistics answers pdf
- probability and statistics tutorial pdf
- probability and statistics basics pdf
- probability and statistics pdf
- elementary probability and statistics pdf
- probability and statistics pdf 4th
- probability and statistics pdf download
- probability and statistics morris pdf
- degroot probability and statistics pdf
- schaum probability and statistics pdf
- probability and statistics 4th edition