HW-Sol-5-V1 - Massachusetts Institute of Technology

6.434J/16.391J Statistics for Engineers and Scientists MIT, Spring 2006

Apr 11 Handout #13

Solution 5

Problem 1: Let a > 0 be a known constant, and let θ > 0 be a parameter. Suppose X1, X2, . . . , Xn is a sample from a population with one of the following densities.

(a) The beta, $\beta(\theta, 1)$, density: $f_X(x \mid \theta) = \theta x^{\theta-1}$, for $0 < x < 1$.

(b) The Weibull density: $f_X(x \mid \theta) = \theta a x^{a-1} e^{-\theta x^a}$, for $x > 0$.

(c) The Pareto density: $f_X(x \mid \theta) = \dfrac{\theta a^\theta}{x^{\theta+1}}$, for $x > a$.

In each case, find a real-valued sufficient statistic for $\theta$.

Solution Let $X \triangleq (X_1, X_2, \ldots, X_n)$ be a collection of i.i.d. random variables $X_i$, and let $x \triangleq (x_1, x_2, \ldots, x_n)$ be a collection of observed data.

(a) For any x, the joint pdf is

$$f_X(x \mid \theta) = \begin{cases} \theta^n (x_1 x_2 \cdots x_n)^{\theta-1}, & \text{if } \forall i,\ 0 < x_i < 1; \\ 0, & \text{otherwise}; \end{cases}$$

$$= \underbrace{\theta^n (x_1 x_2 \cdots x_n)^{\theta-1}}_{g(T(x) \mid \theta)} \cdot \underbrace{I_{(0,1)}(x_1) I_{(0,1)}(x_2) \cdots I_{(0,1)}(x_n)}_{h(x)}.$$

The factorization theorem implies that

$$T(x) \triangleq x_1 x_2 \cdots x_n$$

is a sufficient statistic for $\theta$.

(b) For any x, the joint pdf is

$$f_X(x \mid \theta) = \begin{cases} \theta^n a^n (x_1 x_2 \cdots x_n)^{a-1}\, e^{-\theta \sum_{i=1}^n x_i^a}, & \text{if } \forall i,\ x_i > 0; \\ 0, & \text{otherwise}; \end{cases}$$

$$= \underbrace{\theta^n e^{-\theta \sum_{i=1}^n x_i^a}}_{g(T(x) \mid \theta)} \cdot \underbrace{a^n (x_1 x_2 \cdots x_n)^{a-1}\, I_{(0,\infty)}(x_1) I_{(0,\infty)}(x_2) \cdots I_{(0,\infty)}(x_n)}_{h(x)}.$$

The factorization theorem implies that

$$T(x) \triangleq \sum_{i=1}^n x_i^a$$

is a sufficient statistic for $\theta$.

(c) For any x, the joint pdf is

$$f_X(x \mid \theta) = \begin{cases} \dfrac{\theta^n a^{n\theta}}{(x_1 x_2 \cdots x_n)^{\theta+1}}, & \text{if } \forall i,\ x_i > a; \\ 0, & \text{otherwise}; \end{cases}$$

$$= \underbrace{\dfrac{\theta^n a^{n\theta}}{(x_1 x_2 \cdots x_n)^{\theta+1}}}_{g(T(x) \mid \theta)} \cdot \underbrace{I_{(a,\infty)}(x_1) I_{(a,\infty)}(x_2) \cdots I_{(a,\infty)}(x_n)}_{h(x)}.$$

The factorization theorem implies that

$$T(x) \triangleq x_1 x_2 \cdots x_n$$

is a sufficient statistic for $\theta$.
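The factorization argument can be sanity-checked numerically: if two samples share the same value of T(x), their likelihood ratio should be the same constant for every value of the parameter. A minimal sketch for the beta case, with made-up sample values chosen so that both products equal 0.09 (the function name and data are illustrative, not part of the original solution):

```python
import math

def beta_loglik(xs, theta):
    # Log-likelihood of a Beta(theta, 1) sample:
    # n*log(theta) + (theta - 1) * sum(log x_i)
    return len(xs) * math.log(theta) + (theta - 1) * sum(math.log(x) for x in xs)

# Two hypothetical samples with the same product T(x) = x1*x2*x3 = 0.09.
x = [0.2, 0.5, 0.9]
y = [0.3, 0.6, 0.5]

# The log-likelihood ratio is constant in theta (here, identically zero),
# because the likelihood depends on the data only through the product.
ratios = [beta_loglik(x, t) - beta_loglik(y, t) for t in (0.5, 1.0, 2.0, 5.0)]
```

The same check, with T replaced accordingly, applies to the Weibull and Pareto cases.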

Problem 2:

a) Let X1, X2, . . . , Xn be independent random variables, each uniformly distributed on the interval $[-\theta, \theta]$, for some $\theta > 0$. Find a sufficient statistic for $\theta$.

b) Let X1, X2, . . . , Xn be a random sample of size n from a normal $N(\theta, \theta)$ distribution, for some $\theta > 0$. Find a sufficient statistic for $\theta$.

Solution

a) For any $x \triangleq (x_1, x_2, \ldots, x_n)$, the joint pdf is given by

$$f_X(x \mid \theta) = \begin{cases} \left(\frac{1}{2\theta}\right)^n, & \text{if } \forall i,\ -\theta \le x_i \le \theta; \\ 0, & \text{otherwise}; \end{cases}$$

$$= \begin{cases} \left(\frac{1}{2\theta}\right)^n, & \text{if } -\theta \le \min(x_1, \ldots, x_n) \text{ and } \max(x_1, \ldots, x_n) \le \theta; \\ 0, & \text{otherwise}; \end{cases}$$

$$= \underbrace{\left(\frac{1}{2\theta}\right)^n I_{[-\theta,\infty)}\big(\min(x_1, \ldots, x_n)\big)\, I_{(-\infty,\theta]}\big(\max(x_1, \ldots, x_n)\big)}_{g(T(x) \mid \theta)} \cdot \underbrace{1}_{h(x)}.$$

The factorization theorem implies that

$$T(x) \triangleq \big(\min(x_1, \ldots, x_n),\ \max(x_1, \ldots, x_n)\big)$$

is jointly sufficient for $\theta$.
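A quick numerical illustration of joint sufficiency (with made-up data, not part of the original solution): two samples of the same size sharing the same minimum and maximum have identical likelihoods for every θ, including values of θ for which both likelihoods vanish.

```python
def unif_lik(xs, theta):
    # Joint density of a Uniform[-theta, theta] sample: (1/(2*theta))^n
    # on the support, and 0 otherwise.
    if min(xs) >= -theta and max(xs) <= theta:
        return (1.0 / (2.0 * theta)) ** len(xs)
    return 0.0

# Hypothetical samples sharing T(x) = (min, max) = (-1.5, 2.0) and n = 3.
x = [-1.5, 0.3, 2.0]
y = [-1.5, -0.7, 2.0]

# Equal T implies equal likelihood at every theta.
same = all(unif_lik(x, t) == unif_lik(y, t) for t in (1.0, 1.9, 2.0, 3.0, 10.0))
```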

b) For any $x \triangleq (x_1, x_2, \ldots, x_n)$, the joint pdf is given by

$$f_X(x \mid \theta) = \frac{1}{(\sqrt{2\pi\theta})^n}\, e^{-\frac{1}{2\theta}\sum_{i=1}^n (x_i - \theta)^2}$$

$$= \frac{1}{(2\pi\theta)^{n/2}}\, e^{-\frac{1}{2\theta}\left(\sum_{i=1}^n x_i^2 - 2\theta\sum_{i=1}^n x_i + n\theta^2\right)}$$

$$= \frac{1}{(2\pi\theta)^{n/2}}\, e^{-\frac{1}{2\theta}\sum_{i=1}^n x_i^2 + \sum_{i=1}^n x_i - \frac{n\theta}{2}}$$

$$= \underbrace{e^{\sum_{i=1}^n x_i}}_{h(x)} \cdot \underbrace{\frac{1}{(2\pi\theta)^{n/2}}\, e^{-\frac{1}{2\theta}\sum_{i=1}^n x_i^2 - \frac{n\theta}{2}}}_{g(T(x) \mid \theta)}.$$

The factorization theorem implies that

$$T(x) \triangleq \sum_{i=1}^n x_i^2$$

is a sufficient statistic for $\theta$.
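This factorization can be checked numerically for the $N(\theta, \theta)$ family: two samples with the same sum of squares should have a log-likelihood ratio that does not depend on θ (it equals the difference of the sums, coming from the factor $h(x) = e^{\sum x_i}$). A sketch with illustrative data (both samples have sum of squares 25):

```python
import math

def n_theta_loglik(xs, theta):
    # Log-density of an i.i.d. N(theta, theta) sample (mean theta, variance theta),
    # expanded as: -n/2*log(2*pi*theta) - sum(x^2)/(2*theta) + sum(x) - n*theta/2.
    n = len(xs)
    return (-0.5 * n * math.log(2 * math.pi * theta)
            - sum(v * v for v in xs) / (2 * theta)
            + sum(xs) - 0.5 * n * theta)

x = [3.0, 4.0]   # sum of squares = 25, sum = 7
y = [0.0, 5.0]   # sum of squares = 25, sum = 5

# Equal T = sum of squares: the log-ratio is constant in theta (here, 7 - 5 = 2).
diffs = [n_theta_loglik(x, t) - n_theta_loglik(y, t) for t in (0.5, 1.0, 4.0)]
```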

Problem 3: Let X be the number of trials up to (and including) the first success in a sequence of Bernoulli trials with probability of success $\theta$, for $0 < \theta < 1$. Then, X has a geometric distribution with the parameter $\theta$:

$$P\{X = k\} = \theta (1 - \theta)^{k-1}, \quad k = 1, 2, 3, \ldots.$$

Show that the family of geometric distributions is a one-parameter exponential family with $T(x) = x$. [Hint: $x^\theta = e^{\theta \ln x}$, for $x > 0$.]

Solution Recall that the pmf of a one-parameter ($\theta$) exponential family is of the form

$$p(x \mid \theta) = h(x)\, e^{\eta(\theta) T(x) - B(\theta)},$$

where $x \in \mathcal{X}$. Rewriting the pmf of a geometric random variable yields

$$P\{X = x\} = e^{(x-1)\ln(1-\theta) + \ln\theta} = e^{x \ln(1-\theta) - (\ln(1-\theta) - \ln\theta)},$$


where $x \in \{1, 2, 3, \ldots\}$. Thus, the geometric distribution is a one-parameter exponential family with

$$h(x) \triangleq 1, \quad T(x) \triangleq x, \quad \eta(\theta) \triangleq \ln(1-\theta), \quad B(\theta) \triangleq \ln(1-\theta) - \ln\theta, \quad \mathcal{X} = \{1, 2, 3, \ldots\}.$$
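The identification can be verified numerically by comparing the geometric pmf with the exponential-family form $h(x) e^{\eta(\theta)T(x) - B(\theta)}$ over a range of x and θ (a small sketch, not part of the original solution):

```python
import math

def geom_pmf(x, theta):
    # Geometric pmf: theta * (1 - theta)^(x - 1), for x = 1, 2, 3, ...
    return theta * (1.0 - theta) ** (x - 1)

def expfam_pmf(x, theta):
    # Exponential-family form with h(x) = 1, T(x) = x,
    # eta(theta) = ln(1 - theta), B(theta) = ln(1 - theta) - ln(theta).
    eta = math.log(1.0 - theta)
    B = math.log(1.0 - theta) - math.log(theta)
    return math.exp(eta * x - B)

# The two expressions agree on every (x, theta) pair tested.
checks = [abs(geom_pmf(x, th) - expfam_pmf(x, th)) < 1e-12
          for x in range(1, 8) for th in (0.1, 0.4, 0.9)]
```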

Problem 4: Let X1, X2, . . . , Xn be a random sample of size n from the truncated Bernoulli probability mass function (pmf),

$$P\{X = x \mid p\} = \begin{cases} p, & \text{if } x = 1; \\ 1 - p, & \text{if } x = 0. \end{cases}$$

(a) Show that the joint pmf of X1, X2, . . . , Xn is a member of the exponential family of distributions.

(b) Find a minimal sufficient statistic for p.

Solution

(a) Let $X \triangleq (X_1, X_2, \ldots, X_n)$ denote the collection of i.i.d. Bernoulli random variables, and let $x \triangleq (x_1, \ldots, x_n)$ denote the observed data. The joint pmf is given by

$$P\{X = x \mid p\} = p^{x_1}(1-p)^{1-x_1}\, p^{x_2}(1-p)^{1-x_2} \cdots p^{x_n}(1-p)^{1-x_n}$$

$$= p^{\sum_{i=1}^n x_i}\,(1-p)^{\,n - \sum_{i=1}^n x_i}$$

$$= e^{(\ln p)\sum_{i=1}^n x_i + [\ln(1-p)]\left[n - \sum_{i=1}^n x_i\right]}$$

$$= e^{[\ln p - \ln(1-p)]\sum_{i=1}^n x_i + n \ln(1-p)},$$

for $x \in \{0, 1\}^n$. Therefore, the joint pmf is a member of the exponential family, with the mappings:

$$\theta = p, \quad \eta(p) = \ln p - \ln(1-p), \quad B(p) = -n \ln(1-p), \quad h(x) = 1, \quad T(x) = \sum_{i=1}^n x_i, \quad \mathcal{X} = \{0, 1\}^n.$$

(b) Let $x, y \in \{0, 1\}^n$ be given. Consider the likelihood ratio,

$$\frac{P\{X = x \mid p\}}{P\{X = y \mid p\}} = e^{[\ln p - \ln(1-p)]\left[\sum_{i=1}^n x_i - \sum_{i=1}^n y_i\right]}.$$


Define a function $k(x, y) \triangleq h(x)/h(y) = 1$, which is bounded and non-zero for any $x \in \mathcal{X}$ and $y \in \mathcal{X}$. The ratio above is independent of $p$ if and only if $\sum_{i=1}^n x_i = \sum_{i=1}^n y_i$, so $x$ and $y$ are equivalent exactly when they share the same value of $\sum_{i=1}^n x_i$; the function $k(x, y)$ satisfies the requirement of the likelihood ratio partition. Therefore, $T(x) \triangleq \sum_{i=1}^n x_i$ is a minimal sufficient statistic for $p$.
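The likelihood-ratio partition can be illustrated numerically (with made-up samples, not part of the original solution): when two samples share the same sum, the log-likelihood ratio is zero for every p; when the sums differ, the ratio varies with p, so the samples fall in different partition cells.

```python
import math

def bern_loglik(xs, p):
    # Log-likelihood of an i.i.d. Bernoulli(p) sample with s successes:
    # s*log(p) + (n - s)*log(1 - p).
    s = sum(xs)
    return s * math.log(p) + (len(xs) - s) * math.log(1.0 - p)

x = [1, 0, 1, 0]   # sum = 2
y = [0, 1, 0, 1]   # sum = 2  (same T as x)
z = [1, 1, 1, 0]   # sum = 3  (different T)

ps = (0.2, 0.5, 0.8)
# Same T: the log-likelihood ratio is identically zero in p.
ratio_xy = [bern_loglik(x, p) - bern_loglik(y, p) for p in ps]
# Different T: the ratio log((1-p)/p) changes with p.
ratio_xz = [bern_loglik(x, p) - bern_loglik(z, p) for p in ps]
```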

Problem 5: Let $X_1, X_2, \ldots, X_m$ and $Y_1, Y_2, \ldots, Y_n$ be two independent samples from $N(\mu, \sigma^2)$ and $N(\mu, \tau^2)$ populations, respectively. Here, $-\infty < \mu < \infty$, $\sigma^2 > 0$, and $\tau^2 > 0$. Find a minimal sufficient statistic for $(\mu, \sigma^2, \tau^2)$.

Solution Let $X \triangleq (X_1, X_2, \ldots, X_m)$ and $Y \triangleq (Y_1, Y_2, \ldots, Y_n)$ denote the collections of random samples. The joint pdf (of the $X_j$'s and $Y_i$'s), evaluated at $x \triangleq (x_1, x_2, \ldots, x_m)$ and $y \triangleq (y_1, y_2, \ldots, y_n)$, is given by

$$f_{X,Y}(x, y \mid \mu, \sigma^2, \tau^2) = \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^m e^{-\sum_{j=1}^m (x_j - \mu)^2 / (2\sigma^2)} \cdot \left(\frac{1}{\sqrt{2\pi\tau^2}}\right)^n e^{-\sum_{i=1}^n (y_i - \mu)^2 / (2\tau^2)}$$

$$= e^{-\frac{1}{2\sigma^2}\sum_{j=1}^m x_j^2 \,-\, \frac{1}{2\tau^2}\sum_{i=1}^n y_i^2 \,+\, \frac{\mu}{\sigma^2}\sum_{j=1}^m x_j \,+\, \frac{\mu}{\tau^2}\sum_{i=1}^n y_i \,-\, B(\mu, \sigma^2, \tau^2)},$$

where $B(\mu, \sigma^2, \tau^2) \triangleq \frac{m}{2}\ln 2\pi\sigma^2 + \frac{n}{2}\ln 2\pi\tau^2 + \frac{m\mu^2}{2\sigma^2} + \frac{n\mu^2}{2\tau^2}$.

Notice that the joint pdf belongs to the exponential family, so the minimal sufficient statistic for $(\mu, \sigma^2, \tau^2)$ is given by

$$T(X, Y) \triangleq \left(\sum_{j=1}^m X_j^2,\ \sum_{i=1}^n Y_i^2,\ \sum_{j=1}^m X_j,\ \sum_{i=1}^n Y_i\right).$$

Note: One should not be surprised that the joint pdf belongs to the exponential family. Recall that the Gaussian distribution is a member of the exponential family and that the random variables, the $X_j$'s and $Y_i$'s, are mutually independent. Thus, their joint pdf belongs to the exponential family as well.

Note: To derive the minimal sufficient statistic, one may alternatively consider the likelihood ratio partition.

The set $D_0$ is defined to be

$$D_0 \triangleq \left\{(x, y) \in \mathbb{R}^{m+n} : f_{X,Y}\big(x, y \mid \mu, \sigma^2, \tau^2\big) = 0 \text{ for all } \mu, \text{ for all } \sigma^2 > 0, \text{ for all } \tau^2 > 0\right\} = \varnothing \quad \text{(empty set)}.$$
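As with the earlier problems, minimality can be illustrated numerically: two data pairs sharing the same value of $T = (\sum x_j^2, \sum y_i^2, \sum x_j, \sum y_i)$ have identical likelihoods for every $(\mu, \sigma^2, \tau^2)$. A sketch with made-up samples (the triples [0, 3, 3] and [1, 1, 4] share sum 6 and sum of squares 18 without being permutations of one another):

```python
import math

def two_sample_loglik(x, y, mu, s2, t2):
    # Log joint density: X_j ~ N(mu, s2) i.i.d., Y_i ~ N(mu, t2) i.i.d.
    lx = sum(-0.5 * math.log(2 * math.pi * s2) - (v - mu) ** 2 / (2 * s2) for v in x)
    ly = sum(-0.5 * math.log(2 * math.pi * t2) - (v - mu) ** 2 / (2 * t2) for v in y)
    return lx + ly

# Hypothetical pairs sharing T = (sum x^2, sum y^2, sum x, sum y).
x1, y1 = [0.0, 3.0, 3.0], [0.0, 3.0, 3.0]
x2, y2 = [1.0, 1.0, 4.0], [1.0, 1.0, 4.0]

# Equal T implies equal log-likelihood at every parameter value.
diffs = [two_sample_loglik(x1, y1, mu, s2, t2) - two_sample_loglik(x2, y2, mu, s2, t2)
         for (mu, s2, t2) in ((0.0, 1.0, 2.0), (1.5, 0.5, 3.0))]
```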
