
Math 145 - Elementary Statistics

Final Exam

Summary of Formulas

• Some Properties of Probability.

1. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

2. $P(A \mid B) = \dfrac{P(A \cap B)}{P(B)}$.

• Descriptive Statistics. Let $\{x_1, x_2, \ldots, x_n\}$ be a sample of size $n$, then

1. $\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$.

2. $s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \dfrac{\sum x^2 - \frac{1}{n}\left(\sum x\right)^2}{n-1} = \dfrac{SS_{xx}}{n-1}$.
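A minimal sketch of these two formulas in Python (the data values are made up for illustration); it checks that the definitional and shortcut forms of $s^2$ agree:

```python
# Made-up sample to illustrate the mean/variance formulas above.
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
n = len(data)

xbar = sum(data) / n                                      # sample mean, x-bar
s2_def = sum((x - xbar) ** 2 for x in data) / (n - 1)     # definitional form of s^2
ss_xx = sum(x * x for x in data) - (sum(data) ** 2) / n   # SS_xx (shortcut form)
s2_short = ss_xx / (n - 1)                                # shortcut form of s^2
```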

• Discrete Random Variable.

1. $\mu_X = E(X) = \sum_{i=1}^{k} x_i p_i = x_1 p_1 + x_2 p_2 + x_3 p_3 + \cdots + x_k p_k$.

2. $\sigma_X^2 = Var(X) = \sum_{i=1}^{k} (x_i - \mu)^2 p_i = (x_1 - \mu)^2 p_1 + (x_2 - \mu)^2 p_2 + \cdots + (x_k - \mu)^2 p_k$.

 

• Binomial Distribution. $X \sim bin(n, p) \Rightarrow P(X = x) = \dbinom{n}{x} p^x (1-p)^{n-x}$, for $x = 0, 1, \ldots, n$.

1. $E(X) = np$.

2. $V(X) = np(1-p)$.
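As a quick sketch (with made-up $n$ and $p$), the pmf can be computed with `math.comb`, and summing over its support recovers $E(X) = np$ and $V(X) = np(1-p)$:

```python
import math

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ bin(n, p)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3   # made-up parameters
mean = sum(x * binom_pmf(x, n, p) for x in range(n + 1))
var = sum((x - mean) ** 2 * binom_pmf(x, n, p) for x in range(n + 1))
# mean recovers n*p and var recovers n*p*(1-p)
```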

• (Continuous) Uniform Distribution. $X \sim unif[c, d] \Rightarrow f(x) = \dfrac{1}{d-c}$, for $c \le x \le d$.

1. $E(X) = \frac{1}{2}(c + d)$.

2. $V(X) = \frac{1}{12}(d - c)^2$.

• If $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.

1. The $(1-\alpha)100\%$ confidence interval for $\mu$ is $\left[\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\ \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right]$.





2. For a specified margin of error $M$, the required sample size is $n = \left(\dfrac{Z_{\alpha/2}\,\sigma}{M}\right)^2$.

3. To test $H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$, reject the null if $Z_{obs} = \dfrac{\bar{X}_{obs} - \mu_0}{\sigma/\sqrt{n}} > Z_\alpha$, or if the p-value $= P(Z \ge Z_{obs})$ is $\le \alpha$.

4. To test $H_0: \mu = \mu_0$ vs. $H_1: \mu < \mu_0$, reject the null if $Z_{obs} < -Z_\alpha$, or if the p-value $= P(Z \le Z_{obs})$ is $\le \alpha$.

5. To test $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$, reject the null if $|Z_{obs}| > Z_{\alpha/2}$, or if the p-value $= 2 \cdot P(Z \ge |Z_{obs}|)$ is $\le \alpha$.
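A minimal sketch of the $Z$-based interval and one-sided test, using only the standard library's `NormalDist` (all numbers are made up for illustration):

```python
from statistics import NormalDist

# Made-up numbers: known sigma = 8, n = 64, observed sample mean 52.1.
xbar, sigma, n, alpha = 52.1, 8.0, 64, 0.05

z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}, about 1.96
half = z_crit * sigma / n ** 0.5               # margin of error
ci = (xbar - half, xbar + half)                # 95% CI for mu

# One-sided test of H0: mu = 50 vs. H1: mu > 50.
z_obs = (xbar - 50.0) / (sigma / n ** 0.5)
p_value = 1 - NormalDist().cdf(z_obs)          # P(Z >= z_obs)
reject = p_value <= alpha
```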

• If $t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} \sim t_{(n-1)}$.

1. The $(1-\alpha)100\%$ confidence interval for $\mu$ is $\left[\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}},\ \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}\right]$.

2. To test $H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$, reject the null if $t_{obs} = \dfrac{\bar{X}_{obs} - \mu_0}{s/\sqrt{n}} > t_\alpha$.

3. To test $H_0: \mu = \mu_0$ vs. $H_1: \mu < \mu_0$, reject the null if $t_{obs} < -t_\alpha$.

4. To test $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$, reject the null if $|t_{obs}| > t_{\alpha/2}$.
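A sketch of the one-sample $t$ interval and test statistic (made-up data; since the standard library has no $t$ quantile function, the critical value 2.262 for $\alpha = 0.05$ on 9 df is taken from a $t$-table):

```python
import statistics

# Made-up sample of size n = 10.
sample = [9.8, 10.2, 10.4, 9.9, 10.0, 10.3, 9.7, 10.1, 10.2, 9.9]
n = len(sample)
xbar = statistics.mean(sample)
s = statistics.stdev(sample)       # sample standard deviation
t_crit = 2.262                     # t_{alpha/2}, alpha = 0.05, n - 1 = 9 df (table value)

half = t_crit * s / n ** 0.5
ci = (xbar - half, xbar + half)    # 95% CI for mu

t_obs = (xbar - 10.0) / (s / n ** 0.5)   # test statistic for H0: mu = 10
```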

• If $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{(n-1)}$.

1. To test $H_0: \sigma^2 = \sigma_0^2$ vs. $H_1: \sigma^2 > \sigma_0^2$, reject the null if $X^2_{obs} = \dfrac{(n-1)S^2_{obs}}{\sigma_0^2} > \chi^2_\alpha$.

2. To test $H_0: \sigma^2 = \sigma_0^2$ vs. $H_1: \sigma^2 < \sigma_0^2$, reject the null if $X^2_{obs} = \dfrac{(n-1)S^2_{obs}}{\sigma_0^2} < \chi^2_{1-\alpha}$.

3. To test $H_0: \sigma^2 = \sigma_0^2$ vs. $H_1: \sigma^2 \ne \sigma_0^2$, reject the null if $X^2_{obs} > \chi^2_{\alpha/2}$ or if $X^2_{obs} < \chi^2_{1-\alpha/2}$.

• If $Z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}} \approx N(0, 1)$.

1. The $(1-\alpha)100\%$ confidence interval for $p$ is $\left[\hat{p} - Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}},\ \hat{p} + Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}\right]$.

2. For a specified margin of error $M$, the required sample size is $n = \dfrac{Z_{\alpha/2}^2\,[p(1-p)]}{M^2} \le \dfrac{Z_{\alpha/2}^2\,(0.25)}{M^2}$.

3. To test $H_0: p = p_0$ vs. $H_1: p > p_0$, reject the null if $Z_{obs} = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} > Z_\alpha$.

4. To test $H_0: p = p_0$ vs. $H_1: p < p_0$, reject the null if $Z_{obs} < -Z_\alpha$.

5. To test $H_0: p = p_0$ vs. $H_1: p \ne p_0$, reject the null if $|Z_{obs}| > Z_{\alpha/2}$.
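A sketch of the one-proportion test and the worst-case sample-size bound (made-up counts; note the bound uses $p(1-p) \le 0.25$ when no prior guess for $p$ is available):

```python
import math
from statistics import NormalDist

# Made-up counts: 130 successes in n = 400, testing H0: p = 0.25 vs. H1: p > 0.25.
y, n, p0, alpha = 130, 400, 0.25, 0.05
p_hat = y / n
z_obs = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z_obs)

# Required n for margin of error M = 0.03 with no prior guess for p (0.25 bound):
z = NormalDist().inv_cdf(1 - alpha / 2)
n_required = math.ceil(z ** 2 * 0.25 / 0.03 ** 2)
```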

• If $Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \sim N(0, 1)$.

1. The $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is $(\bar{X}_1 - \bar{X}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$.

2. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 > d_0$, reject the null if $Z_{obs} > Z_\alpha$.

3. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 < d_0$, reject the null if $Z_{obs} < -Z_\alpha$.

4. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 \ne d_0$, reject the null if $|Z_{obs}| > Z_{\alpha/2}$.

• If $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \approx t_{(k)}$, where $k \approx \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1-1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2-1}\left(\dfrac{s_2^2}{n_2}\right)^2}$.

1. The $(1-\alpha)100\%$ confidence interval for $\mu_1 - \mu_2$ is $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$.

2. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 > d_0$, reject the null if $t_{obs} > t_\alpha$.

3. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 < d_0$, reject the null if $t_{obs} < -t_\alpha$.

4. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 \ne d_0$, reject the null if $|t_{obs}| > t_{\alpha/2}$.
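A sketch of the unpooled (Welch) statistic and the approximate degrees of freedom $k$, on made-up summary statistics:

```python
# Made-up summary statistics for two independent samples.
x1bar, s1, n1 = 23.1, 4.2, 30
x2bar, s2, n2 = 20.4, 6.1, 25

se2 = s1 ** 2 / n1 + s2 ** 2 / n2                 # s1^2/n1 + s2^2/n2
t_obs = ((x1bar - x2bar) - 0.0) / se2 ** 0.5      # H0: mu1 - mu2 = 0

# Welch-Satterthwaite approximation for the degrees of freedom k:
k = se2 ** 2 / ((s1 ** 2 / n1) ** 2 / (n1 - 1) + (s2 ** 2 / n2) ** 2 / (n2 - 1))
```

Note that $k$ is generally not an integer; in practice it is rounded down before consulting a $t$-table.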

• If $t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \sim t_{(n_1+n_2-2)}$, where $s_p = \sqrt{\dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}$.

1. The $(1-\alpha)100\%$ C.I. for $\mu_1 - \mu_2$ is $(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\, s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$.

2. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 > d_0$, reject the null if $t_{obs} > t_\alpha$.

3. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 < d_0$, reject the null if $t_{obs} < -t_\alpha$.

4. To test $H_0: \mu_1 - \mu_2 = d_0$ vs. $H_1: \mu_1 - \mu_2 \ne d_0$, reject the null if $|t_{obs}| > t_{\alpha/2}$.
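The pooled (equal-variance) version differs only in the standard error; a sketch on made-up summaries:

```python
# Made-up summaries; the pooled t assumes sigma_1 = sigma_2.
x1bar, s1, n1 = 23.1, 4.2, 30
x2bar, s2, n2 = 20.4, 4.5, 25

sp = (((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)) ** 0.5
t_obs = (x1bar - x2bar) / (sp * (1 / n1 + 1 / n2) ** 0.5)
# Compare t_obs with t_alpha on n1 + n2 - 2 = 53 degrees of freedom.
```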

• If $Z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}} \approx N(0, 1)$.

1. The $(1-\alpha)100\%$ confidence interval for $p_1 - p_2$ is $(\hat{p}_1 - \hat{p}_2) \pm Z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$.

2. To test $H_0: p_1 - p_2 = 0$ vs. $H_1: p_1 - p_2 > 0$, reject the null if $Z_{obs} = \dfrac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n_1} + \dfrac{\hat{p}(1-\hat{p})}{n_2}}} > Z_\alpha$, where $\hat{p} = \dfrac{Y_1 + Y_2}{n_1 + n_2}$, the pooled sample proportion.

3. To test $H_0: p_1 - p_2 = 0$ vs. $H_1: p_1 - p_2 < 0$, reject the null if $Z_{obs} < -Z_\alpha$.

4. To test $H_0: p_1 - p_2 = 0$ vs. $H_1: p_1 - p_2 \ne 0$, reject the null if $|Z_{obs}| > Z_{\alpha/2}$.
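A sketch of the pooled two-proportion test on made-up counts $Y_1$ and $Y_2$:

```python
# Made-up counts of successes in two independent samples.
y1, n1 = 120, 400
y2, n2 = 90, 380

p1_hat, p2_hat = y1 / n1, y2 / n2
p_pool = (y1 + y2) / (n1 + n2)             # pooled sample proportion p-hat
se = (p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) ** 0.5
z_obs = (p1_hat - p2_hat) / se             # test statistic for H0: p1 - p2 = 0
```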

• Chi-Square Statistic. The chi-square statistic is a measure of how much the observed cell counts diverge from the expected cell counts. The formula for the statistic is

1. Goodness-of-fit Test.

$$X^2 = \sum_{i=1}^{k} \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}} = \sum_{i=1}^{k} \frac{(n_i - n p_i)^2}{n p_i} \sim \chi^2_{k-1}.$$

This $X^2$ statistic follows approximately the $\chi^2$ distribution with $k - 1$ degrees of freedom.

2. Testing equality of several proportions, independence, and homogeneity.

$$X^2 = \sum_{\text{all cells}} \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}} = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(n_{ij} - r_i \hat{p}_j)^2}{r_i \hat{p}_j} = \sum_{i=1}^{r}\sum_{j=1}^{c} \frac{(n_{ij} - r_i c_j/n)^2}{r_i c_j/n} \sim \chi^2_{(r-1)(c-1)},$$

where $r_i$ and $c_j$ are the totals of the $i$th row and $j$th column, respectively. This $X^2$ statistic follows approximately the $\chi^2$ distribution with $(r-1)(c-1)$ degrees of freedom.
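A goodness-of-fit sketch on made-up die-roll counts (the critical value 11.07 for $\alpha = 0.05$ on 5 df comes from a $\chi^2$ table):

```python
# Made-up counts of 120 die rolls; H0: fair die (each p_i = 1/6).
observed = [18, 22, 16, 25, 20, 19]        # n_i, k = 6 categories
n = sum(observed)                          # 120 rolls in total
expected = [n / 6] * 6                     # n * p_i = 20 for each face

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# Here x2 = 2.5 < 11.07 (chi-square critical value on k - 1 = 5 df),
# so H0 is not rejected at alpha = 0.05.
```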

• ANOVA Table

Source | DF | Sum of Squares | Mean Square | F
Groups | $k-1$ | $SSG = \sum_{i=1}^{k} n_i(\bar{x}_i - \bar{x})^2$ | $MSG = \dfrac{SSG}{k-1}$ | $F_{obs} = \dfrac{MSG}{MSE}$
Error | $n-k$ | $SSE = \sum_{i=1}^{k} (n_i - 1)s_i^2$ | $MSE = \dfrac{SSE}{n-k}$ |
Total | $n-1$ | $SST = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x})^2$ | |

Note: $SST = SSG + SSE$. Under $H_0$, $F_{obs} \sim f_{(k-1,\,n-k)}$.
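A one-way ANOVA sketch on made-up data for $k = 3$ groups, which also checks the identity $SST = SSG + SSE$:

```python
import statistics

# Made-up data for k = 3 groups of 4 observations each.
groups = [
    [4.0, 5.1, 6.2, 5.5],
    [6.8, 7.0, 6.1, 7.4],
    [5.0, 4.2, 4.8, 5.4],
]
k = len(groups)
n = sum(len(g) for g in groups)
grand_mean = sum(x for g in groups for x in g) / n

ssg = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
sse = sum((len(g) - 1) * statistics.variance(g) for g in groups)
sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

f_obs = (ssg / (k - 1)) / (sse / (n - k))   # MSG / MSE
```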

• Equation of the Least-Squares Regression Line. Suppose we have data on an explanatory variable $x$ and a response variable $y$ for $n$ individuals. The means and standard deviations of the sample data are $\bar{x}$ and $s_x$ for $x$, and $\bar{y}$ and $s_y$ for $y$, and the correlation between $x$ and $y$ is $r$. The equation of the least-squares regression line of $y$ on $x$ is

$$\hat{y} = \hat{a} + \hat{b}x$$

with slope

$$\hat{b} = \frac{SS_{xy}}{SS_{xx}} = \frac{(\Sigma xy) - \frac{1}{n}(\Sigma x)(\Sigma y)}{(\Sigma x^2) - \frac{1}{n}(\Sigma x)^2} = r\,\frac{s_y}{s_x}$$

and intercept

$$\hat{a} = \bar{y} - \hat{b}\bar{x}.$$

• The fitted (or predicted) values $\hat{y}_i$'s are obtained by successively substituting the $x_i$'s into the estimated regression line: $\hat{y}_i = \hat{a} + \hat{b}x_i$. The residuals are the vertical deviations, $e_i = y_i - \hat{y}_i$, from the estimated line.
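A least-squares sketch on made-up $(x, y)$ pairs, using the $SS$ shortcut formulas for the slope and intercept:

```python
# Made-up paired data for the least-squares line y-hat = a-hat + b-hat * x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 3.6, 4.4, 5.2]
n = len(xs)

xbar, ybar = sum(xs) / n, sum(ys) / n
ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
ss_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n

b_hat = ss_xy / ss_xx             # slope
a_hat = ybar - b_hat * xbar       # intercept
fitted = [a_hat + b_hat * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]   # e_i = y_i - y_hat_i
```

A useful sanity check: least-squares residuals always sum to zero (up to rounding).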

• The error sum of squares (equivalently, residual sum of squares), denoted by SSE, is

$$SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2 = \sum \left[y_i - (\hat{a} + \hat{b}x_i)\right]^2 = \sum y_i^2 - \hat{a}\sum y_i - \hat{b}\sum x_i y_i$$

and the estimate of $\sigma^2$ is

$$\hat{\sigma}^2 = s^2 = \frac{SSE}{n-2} = \frac{(n-1)s_y^2(1-r^2)}{n-2}.$$

• Linear Correlation. The linear correlation coefficient $r$ measures the strength of the linear relationship between the paired quantitative $x$- and $y$-values in a sample.

$$r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$$

where

$$SS_{xx} = \Sigma x^2 - \frac{1}{n}(\Sigma x)^2 = (n-1)s_x^2$$

$$SS_{yy} = \Sigma y^2 - \frac{1}{n}(\Sigma y)^2 = (n-1)s_y^2$$

$$SS_{xy} = \Sigma xy - \frac{1}{n}(\Sigma x)(\Sigma y)$$
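A sketch computing $r$ from the three $SS$ quantities, on made-up paired data:

```python
# Made-up paired data; r via SS_xy / sqrt(SS_xx * SS_yy).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 2.9, 3.6, 4.4, 5.2]
n = len(xs)

ss_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
ss_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

r = ss_xy / (ss_xx * ss_yy) ** 0.5
r2 = r ** 2        # coefficient of determination
```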

• The coefficient of determination, denoted by $r^2$, is the amount of the variation in $y$ that is explained by the regression line.

$$r^2 = 1 - \frac{SSE}{SST} = \frac{SST - SSE}{SST} = \frac{\text{explained variation}}{\text{total variation}}, \quad \text{where } SST = SS_{yy} = \sum (y_i - \bar{y})^2.$$

• Inference for $b$.

1. Test statistic:

$$\frac{\hat{b} - b}{SE_{\hat{b}}} \sim t_{(n-2)}, \quad \text{where } SE_{\hat{b}} = \frac{s}{s_x\sqrt{n-1}} = \frac{\hat{b}\sqrt{1-r^2}}{r\sqrt{n-2}}.$$

2. Confidence Interval: $\hat{b} \pm t_{\alpha/2}\,SE_{\hat{b}}$.

• Mean Response of $Y$ at a specified value $x^*$, $(\mu_{Y|x^*})$.

1. Point Estimate. For a specific value $x^*$, the estimate of the mean value of $Y$ is given by

$$\hat{\mu}_{Y|x^*} = \hat{a} + \hat{b}x^*$$

2. Confidence Interval. For a specific value $x^*$, the $(1-\alpha)100\%$ confidence interval for $\mu_{Y|x^*}$ is given by

$$\hat{\mu}_{Y|x^*} \pm t_{\alpha/2;(n-2)}\,SE_{\hat{\mu}}, \quad \text{where } SE_{\hat{\mu}} = s\sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{(n-1)s_x^2}}, \quad \text{and } s = s_y\sqrt{\frac{(n-1)(1-r^2)}{n-2}}.$$

• Prediction of $Y$ at a specified value $x^*$.

1. Point Estimate. For a specific value $x^*$, the predicted value of $Y$ is given by

$$\hat{y} = \hat{a} + \hat{b}x^*$$

2. Prediction Interval. For a specific value $x^*$, the $(1-\alpha)100\%$ prediction interval is given by

$$\hat{y} \pm t_{\alpha/2;(n-2)}\,SE_{\hat{y}}, \quad \text{where } SE_{\hat{y}} = s\sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{(n-1)s_x^2}}, \quad \text{and } s = s_y\sqrt{\frac{(n-1)(1-r^2)}{n-2}}.$$
