Statistical methods, Formula sheet for final exam


Combinatorics

• Number of ways to choose k out of n objects:

  C(n, k) = n(n − 1)···(n − k + 1) / k! = n! / ((n − k)! k!)
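
A quick numerical check of the two expressions (Python; the values n = 5, k = 2 are arbitrary):

    from math import comb, factorial

    n, k = 5, 2                                              # arbitrary example
    print(comb(n, k))                                        # 10
    print(factorial(n) // (factorial(n - k) * factorial(k))) # same value via n!/((n - k)! k!)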

Basic probability

Always true:
• P(Aᶜ) = 1 − P(A)
• P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
• P(A ∩ B) = P(A)·P(B|A)
• Conditional probability of A given B: P(A|B) = P(A ∩ B) / P(B)

Special cases:
• A, B mutually exclusive: P(A ∪ B) = P(A) + P(B)
• A, B independent: P(A ∩ B) = P(A)·P(B)

LTP and Bayes' theorem

Law of total probability:

  P(A) = P(A|B)P(B) + P(A|Bᶜ)P(Bᶜ)

Bayes' theorem:

  P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|Bᶜ)P(Bᶜ)]
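
For instance, with hypothetical values P(B) = 0.01, P(A|B) = 0.95 and P(A|Bᶜ) = 0.05 (illustrative numbers only), both formulas can be evaluated directly:

    # Hypothetical inputs: prior P(B) and the conditional probabilities P(A|B), P(A|B^c)
    p_B, p_A_given_B, p_A_given_Bc = 0.01, 0.95, 0.05

    # Law of total probability
    p_A = p_A_given_B * p_B + p_A_given_Bc * (1 - p_B)       # 0.059

    # Bayes' theorem
    p_B_given_A = p_A_given_B * p_B / p_A                    # ~ 0.161
    print(p_A, p_B_given_A)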

Discrete random variables

• Pmf: p(x) = P(X = x)

Special distributions:
• X ~ bin(n, p): p(k) = C(n, k) p^k (1 − p)^(n−k), k = 0, 1, ..., n (# of successes),
  E[X] = np
• X ~ geom(p): p(k) = (1 − p)^(k−1) p, k = 1, 2, ... (wait for first success),
  E[X] = 1/p
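
A quick check of both pmfs and their means in Python (standard library only; n = 10, p = 0.3 are illustrative choices):

    from math import comb

    n, p = 10, 0.3                                           # illustrative parameters
    binom_pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    print(sum(binom_pmf))                                    # pmf sums to 1
    print(sum(k * pk for k, pk in enumerate(binom_pmf)), n * p)   # E[X] = np = 3.0

    geom_mean = sum(k * (1 - p)**(k - 1) * p for k in range(1, 1000))  # truncated sum
    print(geom_mean, 1 / p)                                  # E[X] = 1/p ~ 3.33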

Continuous random variables

• Pdf: f(x) = F′(x), x ∈ ℝ
• Cdf: F(x) = ∫_{−∞}^{x} f(t) dt, x ∈ ℝ

Special distributions:

• X ~ unif[a, b]: f(x) = 1/(b − a), a ≤ x ≤ b (choose "randomly"),
  E[X] = (a + b)/2
• X ~ exp(λ): f(x) = λ e^(−λx), x ≥ 0 (memoryless),
  E[X] = 1/λ
• X ~ N(0, 1): φ(x) = (1/√(2π)) e^(−x²/2), x ∈ ℝ
• X ~ N(μ, σ²): Z = (X − μ)/σ ~ N(0, 1)
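
For example, to compute P(X ≤ 1.1) when X ~ N(μ, σ²), standardize and evaluate the standard normal cdf Φ (the values μ = 1, σ = 0.2 below are illustrative):

    from statistics import NormalDist

    mu, sigma, x = 1.0, 0.2, 1.1             # illustrative values
    z = (x - mu) / sigma                     # standardize: Z = (X - mu)/sigma ~ N(0, 1)
    print(NormalDist().cdf(z))               # P(X <= 1.1) = Phi(0.5) ~ 0.691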

Expected value

• E[X] = ∑_k x_k p(x_k) if X is discrete with range {x_1, x_2, ...}
• E[X] = ∫_{−∞}^{∞} x f(x) dx if X is continuous
• E[g(X)] = ∑_k g(x_k) p_X(x_k)
• E[g(X)] = ∫_{−∞}^{∞} g(x) f_X(x) dx

Variance

• Var[X] = E[(X − μ)²] = E[X²] − (E[X])²
• Standard deviation: σ = √Var[X]
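
A small worked example of the expected value and variance formulas (a fair six-sided die, used purely as an illustration):

    # Fair die: p(x) = 1/6 for x = 1, ..., 6
    xs = range(1, 7)
    EX  = sum(x * (1/6) for x in xs)         # E[X]   = 3.5
    EX2 = sum(x**2 * (1/6) for x in xs)      # E[X^2] = 91/6
    var = EX2 - EX**2                        # Var[X] ~ 2.917
    sd  = var ** 0.5                         # sigma  ~ 1.708
    print(EX, var, sd)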

Sums of random variables

• X and Y independent, a and b constants:
  E[aX + bY] = aE[X] + bE[Y],   Var[aX + bY] = a²Var[X] + b²Var[Y]
• X_1, ..., X_n independent random variables with the same distribution, mean μ and
  variance σ², S_n = X_1 + ... + X_n:
  E[S_n] = nμ and Var[S_n] = nσ²
• Central Limit Theorem: S_n is approximately N(nμ, nσ²) and X̄ is approximately N(μ, σ²/n).
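
A minimal simulation sketch of the last two bullets (exponential summands with λ = 1 are an arbitrary choice; any distribution with finite variance behaves the same way):

    import random

    n, reps = 30, 10_000
    # S_n for exp(1) summands: mu = 1, sigma^2 = 1, so S_n is approximately N(30, 30)
    sums = [sum(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]
    mean_S = sum(sums) / reps
    var_S = sum((s - mean_S) ** 2 for s in sums) / (reps - 1)
    print(mean_S, var_S)                     # both should be close to 30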

Estimators

• Unbiased: E[θ̂] = θ
• We want Var[θ̂] to be as small as possible
• Sample mean X̄, unbiased for the mean μ, Var[X̄] = σ²/n
• Sample variance s² = (1/(n − 1)) (∑_{k=1}^{n} X_k² − n X̄²), unbiased for the variance σ²
• Standard error: √Var[θ̂]
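
Computed on a small made-up sample:

    data = [2.1, 1.8, 2.4, 2.0, 2.7]         # illustrative sample
    n = len(data)
    xbar = sum(data) / n                     # sample mean, 2.2
    s2 = (sum(x**2 for x in data) - n * xbar**2) / (n - 1)   # sample variance, 0.125
    se = (s2 / n) ** 0.5                     # estimated standard error of X-bar
    print(xbar, s2, se)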

Confidence intervals

• For μ in N(μ, σ²) where σ² is known:

  μ = X̄ ± z(q) · σ/√n

  Use the standard normal distribution, Φ(z) = (1 + q)/2.
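
A numerical illustration for q = 0.95 (the values X̄ = 2.2, σ = 0.5, n = 25 are made up):

    from statistics import NormalDist
    from math import sqrt

    q, xbar, sigma, n = 0.95, 2.2, 0.5, 25   # illustrative values
    z = NormalDist().inv_cdf((1 + q) / 2)    # Phi(z) = (1 + q)/2, so z ~ 1.96
    half = z * sigma / sqrt(n)
    print(xbar - half, xbar + half)          # ~ (2.004, 2.396)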

• For μ in N(μ, σ²) where σ² is unknown:

  μ = X̄ ± t(q) · s/√n

  Use the t distribution with n − 1 degrees of freedom and tail area α = 1 − (1 + q)/2.
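
The same computation with an estimated s (illustrative numbers; SciPy is assumed to be available only to look up the t quantile):

    from math import sqrt
    from scipy.stats import t                # assumed available for the t quantile

    q, xbar, s, n = 0.95, 2.2, 0.35, 10      # illustrative values
    alpha = 1 - (1 + q) / 2                  # tail area
    t_crit = t.ppf(1 - alpha, df=n - 1)      # ~ 2.262 for 9 degrees of freedom
    half = t_crit * s / sqrt(n)
    print(xbar - half, xbar + half)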

• For an unknown proportion p:

  p = p̂ ± z(q) · √(p̂(1 − p̂)/n)

• For two unknown proportions p_1 and p_2:

  p_1 − p_2 = p̂_1 − p̂_2 ± z(q) · √(p̂_1(1 − p̂_1)/n_1 + p̂_2(1 − p̂_2)/n_2)

In both cases, z is such that Φ(z) = (1 + q)/2.
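
For instance, with 42 successes in n = 100 trials (made-up counts) and q = 0.95:

    from statistics import NormalDist
    from math import sqrt

    q, successes, n = 0.95, 42, 100          # illustrative counts
    p_hat = successes / n
    z = NormalDist().inv_cdf((1 + q) / 2)
    half = z * sqrt(p_hat * (1 - p_hat) / n)
    print(p_hat - half, p_hat + half)        # ~ (0.323, 0.517)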

Estimation methods

1. The maximum likelihood estimator (MLE) maximizes the likelihood function:

   L(θ) = ∏_{k=1}^{n} f_θ(X_k)

   To find the maximum, (i) take the logarithm, (ii) differentiate w.r.t. θ and set the derivative equal to 0.
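
As an illustration (the exponential distribution is chosen here only as an example): for X_1, ..., X_n ~ exp(λ), log L(λ) = n log λ − λ ∑ X_k, and setting the derivative to zero gives λ̂ = n / ∑ X_k = 1/X̄. Numerically:

    data = [0.8, 1.3, 0.4, 2.1, 0.9]         # illustrative sample for an assumed exp(lambda) model
    lam_hat = len(data) / sum(data)          # MLE: lambda-hat = 1/X-bar ~ 0.909
    print(lam_hat)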

2. The r-th moment and r-th sample moment are:

   μ_r = E[X^r]   and   μ̂_r = (1/n) ∑_{k=1}^{n} X_k^r

   The method of moments estimator (MOME) expresses the parameter as a function of the moments, θ = g(μ_1, ..., μ_r), and estimates it with the same function of the sample moments, θ̂ = g(μ̂_1, ..., μ̂_r). Start with the first moment; if it is not enough, go on to the second, and so on.
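
A minimal sketch (unif[0, θ] is an arbitrary example): the first moment is μ_1 = E[X] = θ/2, so θ = 2μ_1 and the MOME is θ̂ = 2μ̂_1 = 2X̄.

    data = [1.2, 3.4, 0.7, 2.9, 3.8]         # illustrative sample from an assumed unif[0, theta]
    xbar = sum(data) / len(data)             # first sample moment
    theta_hat = 2 * xbar                     # MOME: theta-hat = 2*X-bar = 4.8
    print(theta_hat)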

Linear regression

• Model: Y = a + bx + ε, where ε ~ N(0, σ²)
• Observations: (x_1, Y_1), ..., (x_n, Y_n) where Y_k ~ N(a + b x_k, σ²)
• Notation:

  S_x = ∑_{k=1}^{n} x_k,   S_xx = ∑_{k=1}^{n} x_k²,   S_Y = ∑_{k=1}^{n} Y_k,   S_xY = ∑_{k=1}^{n} x_k Y_k

• Estimators:

  b̂ = (n S_xY − S_x S_Y) / (n S_xx − S_x²)

  â = Ȳ − b̂ x̄

  Estimated regression line: y = â + b̂ x
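
A compact numerical sketch of these estimators (the five data points are made up):

    xs = [1.0, 2.0, 3.0, 4.0, 5.0]           # illustrative data
    ys = [2.1, 2.9, 3.8, 5.2, 5.9]
    n = len(xs)

    Sx  = sum(xs)
    Sxx = sum(x * x for x in xs)
    SY  = sum(ys)
    SxY = sum(x * y for x, y in zip(xs, ys))

    b_hat = (n * SxY - Sx * SY) / (n * Sxx - Sx**2)          # 0.99
    a_hat = SY / n - b_hat * Sx / n                          # Y-bar - b-hat * x-bar = 1.01
    print(a_hat, b_hat)                      # regression line y = a-hat + b-hat * x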
