Www.cheat-sheets.org
STATISTICS FOR INTRODUCTORY COURSES
STATISTICS - A set of tools for collecting, organizing, presenting, and analyzing numerical facts or observations.
1. Descriptive Statistics - procedures used to organize and present data in a convenient, usable, and communicable form.
2. Inferential Statistics - procedures employed to arrive at broader generalizations or inferences from sample data to populations.
STATISTIC - A number describing a sample characteristic. It results from the manipulation of sample data according to certain specified procedures.
DATA - Characteristics or numbers that are collected by observation.
POPULATION - A complete set of actual or potential observations.
PARAMETER - A number describing a population characteristic; typically inferred from a sample statistic.
SAMPLE - A subset of the population selected according to some scheme.
RANDOM SAMPLE - A subset selected in such a way that each member of the population has an equal opportunity to be selected. Ex. lottery numbers in a fair lottery.
VARIABLE - A phenomenon that may take on different values.
FREQUENCY DISTRIBUTION - Shows the number of times each observation occurs when the values of a variable are arranged in order according to their magnitudes.
  x    f  |   x    f  |   x    f  |   x    f
100    1  |  91    1  |  82    1  |  73    3
 99    1  |  90    2  |  81    2  |  72    2
 98    0  |  89    3  |  80    1  |  71    0
 97    0  |  88    7  |  79    2  |  70    4
 96    2  |  87    1  |  78    1  |  69    3
 95    0  |  86    0  |  77    3  |  68    1
 94    0  |  85    1  |  76    2  |  67    2
 93    1  |  84    5  |  75    4  |  66    1
 92    0  |  83    2  |  74    3  |  65    0
GROUPED FREQUENCY DISTRIBUTION - A frequency distribution in which the values of the variable have been grouped into classes.
MEAN - The point in a distribution of measurements about which the summed deviations are equal to zero. Average value of a sample or population.
POPULATION MEAN: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
SAMPLE MEAN: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Note: The mean is very sensitive to extreme measurements that are not balanced on both sides.
WEIGHTED MEAN - Sum of a set of observations multiplied by their respective weights, divided by the sum of the weights:
$\bar{x}_w = \frac{\sum_{i=1}^{G} w_i x_i}{\sum_{i=1}^{G} w_i}$
where $w_i$ = weight, $x_i$ = observation, and $G$ = number of observation groups. Calculated from a population, a sample, or groupings in a frequency distribution.
Ex. In the Frequency Distribution table, the mean is 80.3, calculated by using frequencies for the $w_i$'s. When grouped, use class midpoints for the $x_i$'s.
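The weighted-mean rule can be sketched in a few lines of Python. This is only an illustration: the class limits and frequencies are the ones from the sheet's grouped frequency table, while the names `weighted_mean`, `classes`, and `freqs` are made up for this sketch.

```python
# Class limits and frequencies as listed in the sheet's grouped frequency table.
classes = [(65, 67), (68, 70), (71, 73), (74, 76), (77, 79), (80, 82),
           (83, 85), (86, 88), (89, 91), (92, 94), (95, 97), (98, 100)]
freqs = [3, 8, 5, 9, 6, 4, 8, 8, 6, 1, 2, 2]


def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# For grouped data, the class midpoint stands in for every score in the class.
midpoints = [(lo + hi) / 2 for lo, hi in classes]
grouped_mean = weighted_mean(midpoints, freqs)
print(f"n = {sum(freqs)}, grouped mean = {grouped_mean:.2f}")
```

With midpoints the result comes out near 80.13, slightly below the 80.3 quoted for the ungrouped scores, because grouping discards the exact position of each score within its class.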
MEDIAN - Observation or potential observation in a set that divides the set so that the same number of observations lie on each side of it. For an odd number of values, it is the middle value; for an even number, it is the average of the middle two.
Ex. In the Frequency Distribution table, the median is 79.5.
MODE - Observation that occurs with the greatest frequency. Ex. In the Frequency Distribution table, the mode is 88.
GROUPING OF DATA
CUMULATIVE FREQUENCY DISTRIBUTION - A distribution which shows the total frequency through the upper real limit of each class.
CUMULATIVE PERCENTAGE DISTRIBUTION - A distribution which shows the total percentage through the upper real limit of each class.
CLASS      f   Cum f        %
65-67      3       3     4.84
68-70      8      11    17.74
71-73      5      16    25.81
74-76      9      25    40.32
77-79      6      31    50.00
80-82      4      35    56.45
83-85      8      43    69.35
86-88      8      51    82.26
89-91      6      57    91.94
92-94      1      58    93.55
95-97      2      60    96.77
98-100     2      62   100.00
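The two cumulative columns follow mechanically from the class frequencies. A minimal Python sketch, assuming the twelve frequencies listed in the table (the variable names are illustrative):

```python
from itertools import accumulate

# Class frequencies in ascending class order (65-67 up to 98-100).
freqs = [3, 8, 5, 9, 6, 4, 8, 8, 6, 1, 2, 2]

# Running total through each class gives the cumulative frequency.
cum_f = list(accumulate(freqs))
n = cum_f[-1]

# Cumulative percentage = cumulative frequency as a share of all observations.
cum_pct = [round(100 * c / n, 2) for c in cum_f]

for f, c, pct in zip(freqs, cum_f, cum_pct):
    print(f"{f:3d} {c:5d} {pct:8.2f}")
```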
SUM OF SQUARES (SS) - Deviations from the mean, squared and summed:
Population SS: $\sum (X_i - \mu)^2$ or $\sum X_i^2 - \frac{(\sum X_i)^2}{N}$
Sample SS: $\sum (x_i - \bar{x})^2$ or $\sum x_i^2 - \frac{(\sum x_i)^2}{n}$
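VARIANCE - The average of squared differences between observations and their mean.
POPULATION VARIANCE: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$
SAMPLE VARIANCE: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
VARIANCES FOR GROUPED DATA
POPULATION: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{G} f_i (m_i - \mu)^2$
SAMPLE: $s^2 = \frac{1}{n-1}\sum_{i=1}^{G} f_i (m_i - \bar{x})^2$
where $f_i$ is the frequency and $m_i$ the midpoint of class $i$.
STANDARD DEVIATION - Square root of the variance. Ex. Pop. S.D. $\sigma = \sqrt{\sigma^2}$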
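As a rough check on these definitions, the sketch below computes the sum of squares both ways and derives the variances and standard deviation. The data are the four-value sample (43, 74, 42, 65) that the sheet uses in its degrees-of-freedom example. Treating the same four values as a whole population is purely an assumption for illustration.

```python
import math
import statistics

sample = [43, 74, 42, 65]
n = len(sample)
mean = sum(sample) / n

# Definitional form: squared deviations from the mean, summed.
ss_definitional = sum((x - mean) ** 2 for x in sample)
# Computational form: sum of squares minus (sum)^2 / n.
ss_computational = sum(x * x for x in sample) - sum(sample) ** 2 / n

pop_var = ss_definitional / n           # divide by N for a population
sample_var = ss_definitional / (n - 1)  # divide by n - 1 for a sample
sample_sd = math.sqrt(sample_var)

print(ss_definitional, ss_computational, pop_var, round(sample_var, 2), round(sample_sd, 2))
```

Both forms of SS agree, and the results match `statistics.pvariance` and `statistics.variance` from the Python standard library.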
BAR GRAPH - A form of graph that uses bars to indicate the frequency of occurrence of observations.
o Histogram - a form of bar graph used with interval or ratio-scaled variables.
  - Interval Scale - a quantitative scale that permits the use of arithmetic operations. The zero point in the scale is arbitrary.
  - Ratio Scale - same as interval scale except that there is a true zero point.
FREQUENCY CURVE - A form of graph representing a frequency distribution in the form of a continuous line that traces a histogram.
o Cumulative Frequency Curve - a continuous line that traces a histogram where bars in all the lower classes are stacked up in the adjacent higher class. It cannot have a negative slope.
o Normal curve - bell-shaped curve.
o Skewed curve - departs from symmetry and tails off at one end.
[Figure: NORMAL CURVE and SKEWED CURVE (left)]
PROBABILITY - Probability of occurrence of event A: $p(A) = \frac{\text{number of outcomes favoring event } A}{\text{total number of outcomes}}$
SAMPLE SPACE - All possible outcomes of an experiment.
TYPE OF EVENTS
o Exhaustive - two or more events are said to be exhaustive if all possible outcomes are considered. Symbolically, p(A or B or ...) = 1.
o Non-Exhaustive - two or more events are said to be non-exhaustive if they do not exhaust all possible outcomes.
o Mutually Exclusive - Events that cannot occur simultaneously: p(A and B) = 0; and p(A or B) = p(A) + p(B). Ex. males, females.
o Non-Mutually Exclusive - Events that can occur simultaneously: p(A or B) = p(A) + p(B) - p(A and B). Ex. males, brown eyes.
o Independent - Events whose probability is unaffected by occurrence or nonoccurrence of each other: p(A|B) = p(A); p(B|A) = p(B); and p(A and B) = p(A) p(B). Ex. gender and eye color.
o Dependent - Events whose probability changes depending upon the occurrence or non-occurrence of each other: p(A|B) differs from p(A); p(B|A) differs from p(B); and p(A and B) = p(A) p(B|A) = p(B) p(A|B). Ex. race and eye color.
JOINT PROBABILITIES - Probability that 2 or more events occur simultaneously.
MARGINAL PROBABILITIES or Unconditional Probabilities = summation of probabilities.
CONDITIONAL PROBABILITIES - Probability of A given the existence of S, written p(A|S).
EXAMPLE - Given the numbers 1 to 9 as observations in a sample space:
o Events mutually exclusive and exhaustive. Example: p(all odd numbers); p(all even numbers).
o Events mutually exclusive but not exhaustive. Example: p(an even number); p(the numbers 7 and 5).
o Events neither mutually exclusive nor exhaustive. Example: p(an even number or a 2).
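The three cases in the example can be checked with Python sets. This sketch assumes every number from 1 to 9 is equally likely, which the sheet does not state explicitly. The helper names `p`, `mutually_exclusive`, and `exhaustive` are hypothetical.

```python
from fractions import Fraction

space = set(range(1, 10))             # sample space: the numbers 1 to 9
odd = {x for x in space if x % 2 == 1}
even = space - odd
seven_and_five = {5, 7}
two = {2}


def p(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & space), len(space))


def mutually_exclusive(a, b):
    """True when the two events share no outcome."""
    return a.isdisjoint(b)


def exhaustive(*events):
    """True when the events together cover the whole sample space."""
    return set().union(*events) == space


print(mutually_exclusive(odd, even), exhaustive(odd, even))
print(mutually_exclusive(even, seven_and_five), exhaustive(even, seven_and_five))
print(mutually_exclusive(even, two), exhaustive(even, two))

# Addition rule for non-mutually-exclusive events.
p_even_or_two = p(even) + p(two) - p(even & two)
```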
RANDOM VARIABLES
A mapping or function that assigns one and only one numerical value to each outcome in an experiment.
DISCRETE RANDOM VARIABLES - Involves rules or probability models for assigning or generating only distinct values (not fractional measurements).
BINOMIAL DISTRIBUTION - A model for the sum of a series of n independent trials where each trial results in a 0 (failure) or 1 (success). Ex. Coin toss.
$p(s) = \binom{n}{s} \pi^s (1-\pi)^{n-s}$
where p(s) is the probability of s successes in n trials with a constant probability $\pi$ per trial, and where $\binom{n}{s} = \frac{n!}{s!(n-s)!}$
Binomial mean: $\mu = n\pi$
Binomial variance: $\sigma^2 = n\pi(1-\pi)$
As n increases, the Binomial approaches the Normal distribution.
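A minimal Python sketch of the binomial formula, using a fair coin tossed 10 times as an assumed example (the values n = 10 and $\pi$ = 0.5 are chosen here, not taken from the sheet). It also confirms the mean and variance formulas numerically.

```python
from math import comb


def binomial_pmf(s, n, pi):
    """Probability of exactly s successes in n independent trials."""
    return comb(n, s) * pi**s * (1 - pi) ** (n - s)


n, pi = 10, 0.5  # assumed example: 10 tosses of a fair coin
pmf = [binomial_pmf(s, n, pi) for s in range(n + 1)]

# Mean and variance computed directly from the distribution.
mean = sum(s * prob for s, prob in enumerate(pmf))
variance = sum((s - mean) ** 2 * prob for s, prob in enumerate(pmf))

print(f"p(5 heads) = {pmf[5]:.4f}, mean = {mean}, variance = {variance}")
```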
HYPERGEOMETRIC DISTRIBUTION - A model for the sum of a series of n trials where each trial results in a 0 or 1 and is drawn from a small population with N elements split between $N_1$ successes and $N_2$ failures. Then the probability of splitting the n trials between $x_1$ successes and $x_2$ failures is:
$p(x_1 \text{ and } x_2) = \frac{\frac{N_1!}{x_1!(N_1-x_1)!} \cdot \frac{N_2!}{x_2!(N_2-x_2)!}}{\frac{N!}{n!(N-n)!}}$
Hypergeometric mean: $\mu_1 = E(x_1) = \frac{nN_1}{N}$
and variance: $\sigma^2 = \frac{N-n}{N-1}\left[\frac{nN_1}{N}\right]\left[\frac{N_2}{N}\right]$
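The hypergeometric probability is a ratio of binomial coefficients, so it can be sketched with `math.comb`. The population used below (4 successes, 6 failures, 3 draws) is a made-up example. Exact fractions are used so the mean and variance formulas can be compared without rounding.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x1, n, N1, N2):
    """Probability of x1 successes (and n - x1 failures) in n draws
    without replacement from N1 successes and N2 failures."""
    x2 = n - x1
    if x1 < 0 or x2 < 0:
        return Fraction(0)
    return Fraction(comb(N1, x1) * comb(N2, x2), comb(N1 + N2, n))


N1, N2, n = 4, 6, 3  # hypothetical small population and sample size
N = N1 + N2
pmf = {x1: hypergeom_pmf(x1, n, N1, N2) for x1 in range(n + 1)}

mean = sum(x1 * prob for x1, prob in pmf.items())
variance = sum((x1 - mean) ** 2 * prob for x1, prob in pmf.items())

print(mean, variance)
```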
POISSON DISTRIBUTION - A model for the number of occurrences of an event x = 0, 1, 2, ..., when the probability of occurrence is small, but the number of opportunities for the occurrence is large:
$p(x) = \frac{e^{-\lambda}\lambda^x}{x!}$
for x = 0, 1, 2, 3, ... and $\lambda$ > 0; otherwise p(x) = 0.
Poisson mean and variance: $\lambda$.
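A short Python sketch of the Poisson formula. The rate $\lambda$ = 2.5 is an arbitrary value chosen for illustration. Summing over the first 60 values of x is enough for the remaining tail to be negligible at this rate.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x occurrences when the mean rate is lam."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    if not isinstance(x, int) or x < 0:
        return 0.0  # p(x) = 0 outside x = 0, 1, 2, ...
    return exp(-lam) * lam**x / factorial(x)


lam = 2.5  # hypothetical rate
support = range(60)
total = sum(poisson_pmf(x, lam) for x in support)
mean = sum(x * poisson_pmf(x, lam) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in support)

print(round(total, 6), round(mean, 6), round(variance, 6))
```

The mean and the variance both come out equal to $\lambda$, as stated above.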
LEVEL OF SIGNIFICANCE - A probability value considered rare in the sampling distribution specified under the null hypothesis, where one is willing to acknowledge the operation of chance. Common significance levels are 1%, 5%, and 10%. Alpha ($\alpha$) level: the lowest level for which the null hypothesis can be rejected. The significance level determines the critical region.
NULL HYPOTHESIS ($H_0$) - A statement that specifies hypothesized value(s) for one or more of the population parameters. [Ex. $H_0$: a coin is unbiased. That is, p = 0.5.]
ALTERNATIVE HYPOTHESIS ($H_1$) - A statement that specifies that the population parameter is some value other than the one specified under the null hypothesis. [Ex. $H_1$: a coin is biased. That is, p $\neq$ 0.5.]
1. NONDIRECTIONAL HYPOTHESIS - an alternative hypothesis ($H_1$) that states only that the population parameter is different from the one specified under $H_0$. Ex. $H_1$: $\mu \neq \mu_0$. A Two-Tailed Probability Value is employed when the alternative hypothesis is nondirectional.
2. DIRECTIONAL HYPOTHESIS - an alternative hypothesis that states the direction in which the population parameter differs from the one specified under $H_0$. Ex. $H_1$: $\mu > \mu_0$ or $H_1$: $\mu < \mu_0$. A One-Tailed Probability Value is employed when the alternative hypothesis is directional.
NOTION OF INDIRECT PROOF - Strict interpretation of hypothesis testing reveals that the null hypothesis can never be proved. [Ex. If we toss a coin 200 times and tails comes up 100 times, it is no guarantee that heads will come up exactly half the time in the long run; small discrepancies might exist. A bias can exist even at a small magnitude. We can make the assertion, however, that NO BASIS EXISTS FOR REJECTING THE HYPOTHESIS THAT THE COIN IS UNBIASED. (The null hypothesis is not rejected.) When employing the 0.05 level of significance, reject the null hypothesis when a given result occurs by chance 5% of the time or less.]
TWO TYPES OF ERRORS
- Type I Error (Type $\alpha$ Error) = the rejection of $H_0$ when it is actually true. The probability of a Type I error is given by $\alpha$.
- Type II Error (Type $\beta$ Error) = the acceptance of $H_0$ when it is actually false. The probability of a Type II error is given by $\beta$.
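The coin example can be made concrete with an exact binomial test. The sketch below is one way to do it, not a method the sheet prescribes. It uses a common convention for the two-tailed probability: add up the probabilities of all outcomes that are no more likely than the observed one under $H_0$. The function name and the 120-tails case are assumptions for illustration.

```python
from math import comb


def two_tailed_binomial_p(k, n, p0=0.5):
    """Two-tailed probability of a result at least as unlikely as k
    successes in n trials when the null hypothesis says p = p0."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


alpha = 0.05
for tails in (100, 120):  # 100 is the sheet's case; 120 is hypothetical
    p_value = two_tailed_binomial_p(tails, 200)
    decision = "reject H0" if p_value <= alpha else "do not reject H0"
    print(f"{tails} tails in 200 tosses: p = {p_value:.4f} -> {decision}")
```

With 100 tails in 200 tosses the probability value is 1, so there is no basis for rejecting $H_0$. That is not the same as proving the coin is unbiased.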
SAMPLING DISTRIBUTION - A theoretical probability distribution of a statistic that would result from drawing all possible samples of a given size from some population.
THE STANDARD ERROR OF THE MEAN
A theoretical standard deviation of sample means of a given sample size, drawn from some specified population.
o When based on a very large, known population, the standard error is: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
o When estimated from a sample drawn from a very large population, the standard error is: $\hat{\sigma}_{\bar{x}} = s_{\bar{x}} = \frac{s}{\sqrt{n}}$
o The dispersion of sample means decreases as sample size is increased.
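The effect of sample size on the standard error is easy to see numerically. In this small sketch, $\sigma$ = 50 matches the value used in the sheet's later z examples, and the sample sizes are chosen arbitrarily.

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of the mean: standard deviation over root n."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sd / sqrt(n)


sigma = 50
for n in (25, 100, 400):
    print(f"n = {n:3d}: standard error = {standard_error(sigma, n)}")
```

Each fourfold increase in n halves the standard error, which is the sense in which the dispersion of sample means decreases as sample size grows.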
For continuous variables, frequencies are expressed in terms of areas under a curve.
CONTINUOUS RANDOM VARIABLES - Variable that may take on any value along an uninterrupted interval of a number line.
NORMAL DISTRIBUTION - bell curve; a distribution whose values cluster symmetrically around the mean (also median and mode).
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / 2\sigma^2}$
where f(x) = frequency at a given value
$\sigma$ = standard deviation of the distribution
$\pi$ = approximately 3.1416
e = approximately 2.7183
$\mu$ = the mean of the distribution
x = any score in the distribution
STANDARD NORMAL DISTRIBUTION - A normal random variable Z that has a mean of 0 and standard deviation of 1.
Z-VALUES - The number of standard deviations a specific observation lies from the mean: $z = \frac{x - \mu}{\sigma}$
CENTRAL LIMIT THEOREM (for sample mean $\bar{x}$)
If $x_1, x_2, x_3, \ldots, x_n$ is a simple random sample of n elements from a large (infinite) population, with mean $\mu$ and standard deviation $\sigma$, then the distribution of $\bar{x}$ takes on the bell-shaped distribution of a normal random variable as n increases, and the distribution of the ratio
$\frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
approaches the standard normal distribution as n goes to infinity. In practice, a normal approximation is acceptable for samples of 30 or larger.
Percentage Cumulative Distribution for selected Z values under a normal curve
Z-value             -3     -2     -1      0     +1     +2     +3
Percentile Score  0.13   2.28  15.87  50.00  84.13  97.72  99.87
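The percentile scores in this table can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The variable names below are illustrative.

```python
from statistics import NormalDist

z_values = [-3, -2, -1, 0, 1, 2, 3]
std_normal = NormalDist(mu=0, sigma=1)

# Cumulative area to the left of each z, expressed as a percentage.
percentiles = [round(100 * std_normal.cdf(z), 2) for z in z_values]

for z, pct in zip(z_values, percentiles):
    print(f"z = {z:+d}: {pct:6.2f}")
```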
[Figure: critical region for rejection of $H_0$ when $\alpha$ = 0.01, two-tailed test]
UNBIASEDNESS - Property of a reliable estimator of the parameter being estimated.
o Unbiased Estimate of a Parameter - an estimate that equals, on the average, the value of the parameter. Ex. The sample mean is an unbiased estimator of the population mean.
o Biased Estimate of a Parameter - an estimate that does not equal, on the average, the value of the parameter. Ex. The sample variance calculated with n is a biased estimator of the population variance; however, when calculated with n-1 it is unbiased.
STANDARD ERROR - The standard deviation of the estimator is called the standard error. Ex. The standard error for $\bar{x}$'s is $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. This has to be distinguished from the STANDARD DEVIATION OF THE SAMPLE: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
The standard error measures the variability in the $\bar{x}$'s around their expected value $E(\bar{x})$, while the standard deviation of the sample reflects the variability in the sample around the sample's mean ($\bar{x}$).
USED WHEN THE STANDARD DEVIATION IS UNKNOWN - Use of Student's t.
When $\sigma$ is not known, its value is estimated from sample data.
t-ratio - the ratio employed in the testing of hypotheses or determining the significance of a difference between means (two-sample case) involving a sample with a t-distribution. The formula is:
$t = \frac{\bar{x} - \mu}{s_{\bar{x}}}$
where $\mu$ = population mean under $H_0$ and $s_{\bar{x}} = s/\sqrt{n}$.
o Distribution - symmetrical distribution with a mean of zero and standard deviation that approaches one as degrees of freedom increase (i.e., approaches the Z distribution).
o Assumption and condition required in assuming t-distribution: Samples are drawn from a normally distributed population and $\sigma$ (population standard deviation) is unknown.
o Homogeneity of Variance - If 2 samples are being compared, the assumption in using the t-ratio is that the variances of the populations from which the samples are drawn are equal.
o Estimated $\sigma_{\bar{x}_1 - \bar{x}_2}$ (that is, $s_{\bar{x}_1 - \bar{x}_2}$) is based on the unbiased estimate of the population variance.
o Degrees of Freedom (df) - the number of values that are free to vary after placing certain restrictions on the data.
Example. The sample (43, 74, 42, 65) has n = 4. The sum is 224 and the mean = 56. Using these 4 numbers and determining deviations from the mean, we'll have 4 deviations, namely (-13, 18, -14, 9), which sum up to zero. Taking deviations from the mean is one restriction we have imposed, and the natural consequence is that the sum of these deviations should equal zero. For this to happen, we can choose any number, but our freedom to choose is limited to only 3 numbers because one is restricted by the requirement that the sum of the deviations should equal zero. We use the equality:
$(x_1 - \bar{x}) + (x_2 - \bar{x}) + (x_3 - \bar{x}) + (x_4 - \bar{x}) = 0$
So given a mean of 56, if the first 3 observations are 43, 74, and 42, the last observation has to be 65. This single restriction in this case helps us determine df. The formula is n less the number of restrictions. In this case, it is n-1 = 4-1 = 3 df.
o t-Ratio is a robust test - This means that statistical inferences are likely valid despite fairly large departures from normality in the population distribution. If normality of population distribution is in doubt, it is wise to increase the sample size.
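A one-sample t-ratio can be sketched with the standard library. The sample is the one from the degrees-of-freedom example; the hypothesized mean of 50 is invented here purely to have something to test against, and `t_ratio` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev


def t_ratio(sample, mu0):
    """Return (t, df) for a one-sample test of the mean against mu0."""
    n = len(sample)
    s_xbar = stdev(sample) / sqrt(n)  # estimated standard error, s / sqrt(n)
    return (mean(sample) - mu0) / s_xbar, n - 1


sample = [43, 74, 42, 65]
mu0 = 50  # hypothetical population mean under H0
t, df = t_ratio(sample, mu0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t would then be compared with the critical value of the t-distribution for 3 df at the chosen significance level, taken from a t table.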
USED WHEN THE STANDARD DEVIATION IS KNOWN: When $\sigma$ is known, it is possible to describe the form of the distribution of the sample mean as a Z statistic. The sample must be drawn from a normal distribution or have a sample size (n) of at least 30.
$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$
where $\mu$ = population mean (either known or hypothesized under $H_0$) and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
o Critical Region - the portion of the area under the curve which includes those values of a statistic that lead to the rejection of the null hypothesis.
  - The most often used significance levels are 0.01, 0.05, and 0.1. For a one-tailed test using the z-statistic, these correspond to z-values of 2.33, 1.65, and 1.28 respectively. For a two-tailed test, the critical region of 0.01 is split into two equal outer areas marked by z-values of |2.58|.
Example 1. Given a population with $\mu$ = 250 and $\sigma$ = 50, what is the probability of drawing a sample of n = 100 values whose mean ($\bar{x}$) is at least 255? In this case, $z = (255 - 250)/(50/\sqrt{100}) = 1.00$. Looking at Table A, the given area for z = 1.00 is 0.3413. To its right is 0.1587 (= 0.5 - 0.3413), or 15.87%.
Conclusion: there are approximately 16 chances in 100 of obtaining a sample mean of at least 255 from this population when n = 100.
Example 2. Assume we do not know the population mean. However, we suspect that the sample may have been selected from a population with $\mu$ = 250 and $\sigma$ = 50, but we are not sure. The hypothesis to be tested is whether the sample mean was selected from this population. Assume we obtained, from a sample (n) of 100, a sample mean of 263. Is it reasonable to assume that this sample was drawn from the suspected population?
1. $H_0$: $\mu$ = 250 (that the actual mean of the population from which the sample is drawn is equal to 250); $H_1$: $\mu$ not equal to 250 (the alternative hypothesis is that it is greater than or less than 250, thus a two-tailed test).
2. The z-statistic will be used because the population $\sigma$ is known.
3. Assume the significance level ($\alpha$) to be 0.01. Looking at Table A, we find that the area beyond a z of 2.58 is approximately 0.005.
To reject $H_0$ at the 0.01 level of significance, the absolute value of the obtained z must be equal to or greater than $|z_{0.01}|$, or 2.58. Here the value of z corresponding to sample mean = 263 is $z = (263 - 250)/(50/\sqrt{100}) = 2.60$.
CONCLUSION - Since this obtained z falls within the critical region, we may reject $H_0$ at the 0.01 level of significance.
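Both worked examples can be reproduced in Python. This is a sketch: `z_statistic` is a hypothetical helper, and `statistics.NormalDist` stands in for the printed Table A.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu, sigma, n):
    """z = (sample mean - population mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu) / (sigma / sqrt(n))


std_normal = NormalDist()  # mean 0, standard deviation 1

# Example 1: probability of a sample mean of at least 255.
z1 = z_statistic(255, 250, 50, 100)
upper_tail = 1 - std_normal.cdf(z1)
print(f"Example 1: z = {z1:.2f}, P(mean >= 255) = {upper_tail:.4f}")

# Example 2: two-tailed test of H0: mu = 250 at alpha = 0.01.
alpha = 0.01
z2 = z_statistic(263, 250, 50, 100)
critical = std_normal.inv_cdf(1 - alpha / 2)  # about 2.58
reject = abs(z2) >= critical
print(f"Example 2: z = {z2:.2f}, critical = {critical:.2f}, reject H0: {reject}")
```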
CONFIDENCE INTERVAL - Interval within which we may consider a hypothesis tenable. Common confidence intervals are 90%, 95%, and 99%. Confidence Limits: limits defining the confidence interval.
$(1 - \alpha)100\%$ confidence interval for $\mu$:
ii, *F t l-il.l, ................
................
In order to avoid copyright disputes, this page is only a partial summary.