
ECE 6960: Adv. Random Processes & Applications Lecture Notes, Fall 2010

Lecture 3

Today: (1) Markov inequality, (2) Chebyshev inequality, (3) Chernoff bound

1 Probability bounds

Recall the Markov inequality: for a non-negative r.v. $X$ and any $a > 0$,

$$P[X \geq a] \leq \frac{E_X[X]}{a}. \qquad (1)$$

Pro: It does not require any knowledge of the distribution. Con: it is not at all a tight bound.

Example: Markov inequality for X, a Pareto r.v. with parameters $x_m$ and $\alpha$. Find the probability that $X > b x_m$ for some $b > 1$, both from the Markov inequality and from the distribution itself.

Solution: Exact solution: From the CCDF of the Pareto r.v.,

$$P[X > b x_m] = \frac{x_m^\alpha}{(b x_m)^\alpha} = \frac{1}{b^\alpha}.$$

Markov inequality solution:

From Table 4.1, $E_X[X] = \frac{\alpha x_m}{\alpha - 1}$ for $\alpha > 1$. From (1),

$$P[X \geq b x_m] \leq \frac{E_X[X]}{b x_m} = \frac{\alpha x_m}{(\alpha - 1)(b x_m)} = \frac{\alpha}{b(\alpha - 1)}. \qquad (2)$$

See Figure 1.
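As a short numerical sketch of the comparison in Figure 1 (assuming numpy is available; the variable names are mine, not from the notes):

import numpy as np

# Exact Pareto CCDF vs. the Markov bound of (2), for alpha > 1.
b = np.linspace(2, 10, 50)
for alpha in (1.5, 2.0):
    exact = 1.0 / b**alpha                  # P[X > b*xm] = 1/b^alpha
    markov = alpha / (b * (alpha - 1.0))    # bound from (2)
    assert np.all(markov >= exact)          # the bound always holds
    print(f"alpha={alpha}: at b=10, exact={exact[-1]:.4f}, Markov={markov[-1]:.4f}")

The ratio of bound to exact probability grows like $b^{\alpha - 1}$, which is why the Markov curve pulls away from the exact curve in Figure 1.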

1.1 Chebyshev Inequality

The Chebyshev inequality is the same as the Markov inequality, but with the r.v. $X$ replaced by $Y = (X - m)^2$, where $m = E_X[X]$, and with $a = b^2$ for $b > 0$. This is okay to do, because the Markov inequality applies to any non-negative random variable (and $Y \geq 0$ here), with any positive constant. Try this to find that:

$$P[|X - m| \geq b] \leq \frac{\mathrm{Var}_X[X]}{b^2}. \qquad (3)$$


[Figure 1: Exact solution vs. Markov inequality for a Pareto r.v. $X$: probability $P[X > b x_m]$ vs. parameter $b$, for $\alpha = 1.5$ and $\alpha = 2.0$.]

Solution: Three steps:

$$P[X \geq a] \leq \frac{E_X[X]}{a}$$

$$P[(X - m)^2 \geq b^2] \leq \frac{E_X[(X - m)^2]}{b^2}$$

$$P[|X - m| \geq b] \leq \frac{\mathrm{Var}_X[X]}{b^2}.$$

Example: Example 4.39 in Leon-Garcia, p. 182. Let $X$ have mean $m$ and variance $\sigma^2$. What is the probability of $X$ being more than $k$ standard deviations away from its mean?

$$P[|X - m| \geq k\sigma] \leq \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}.$$

But, for example, for a Gaussian r.v., we know that this probability goes down more quickly than polynomially in $k$. For $k = 2$, for example, $P[|X - m| \geq 2\sigma] = 0.0456$, while the Chebyshev inequality gives the upper bound $P[|X - m| \geq 2\sigma] \leq 0.25$. This is true, but not very tight.
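A minimal numerical check of this gap (assuming scipy is available; $Q$ is the standard normal tail, norm.sf):

from scipy.stats import norm

# Exact two-sided Gaussian tail P[|X - m| >= k*sigma] = 2*Q(k) vs. the
# Chebyshev bound 1/k^2.
for k in (1, 2, 3, 4):
    exact = 2 * norm.sf(k)
    bound = 1.0 / k**2
    print(f"k={k}: exact={exact:.4f}, Chebyshev bound={bound:.4f}")

The gap widens rapidly with $k$: at $k = 4$ the exact probability is about $6 \times 10^{-5}$, while the bound is 0.0625.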

1.2 Chernoff Bound

The Chernoff bound makes one substitution into the Markov inequality, but the use of the Chernoff bound in practice can be more difficult. First, let's see the replacement:

1. For some $s > 0$, replace $X$ with $e^{sX}$.

Doing this, we have

$$P[e^{sX} \geq a] \leq \frac{E_X[e^{sX}]}{a}.$$


[Figure 2: Graphical view of the event $\{e^{sx} \geq a\}$: since $e^{sx} \geq a$ iff $sx \geq \ln a$, the event is equivalent to $\{x \geq (\ln a)/s\}$.]

Using Figure 2 to see how the event $\{e^{sX} \geq a\}$ translates to a bound for $X$ directly, we have that

$$P\left[X \geq \frac{1}{s}\ln a\right] \leq \frac{E_X[e^{sX}]}{a}.$$

If we let $b = \frac{1}{s}\ln a$, then $a = e^{bs}$, and so

$$P[X \geq b] \leq e^{-bs} E_X[e^{sX}].$$

We will cover this in more detail in the next lecture, but for now, note that the expected value $E_X[e^{sX}]$ is called the "moment generating function" (MGF) of $X$. Also consider the transform called the "characteristic function" of $X$,

$$\Phi_X(\omega) = E_X[e^{j\omega X}].$$

If you write out the definition of $E_X[e^{j\omega X}]$, you will see that it is the Fourier transform of the pdf of $X$. The characteristic function and MGF are, in fact, the same function, just with $\omega = -js$ or $s = j\omega$:

$$E_X[e^{sX}] = E_X[e^{j(-js)X}] = \Phi_X(-js).$$

So, you can use the characteristic functions from Table 4.1 to come up with the MGF of $X$. (Note that Wikipedia also typically lists the CF and MGF for distributions which have them.) I will also denote the MGF of $X$ as $\mathrm{MGF}_X(s)$.
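As a quick sanity check (a sketch, not from the notes; assumes scipy), you can verify a closed-form MGF by numerically integrating $e^{sx}$ against the pdf. For a zero-mean Gaussian:

import numpy as np
from scipy.integrate import quad

sigma, s = 1.0, 0.7  # example values, chosen arbitrarily for illustration

# E_X[e^{sX}] by direct numerical integration of e^{sx} * pdf(x)
pdf = lambda x: np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
mgf_numeric, _ = quad(lambda x: np.exp(s * x) * pdf(x), -np.inf, np.inf)

# Closed form via the CF relation: MGF_X(s) = Phi_X(-js) = e^{s^2 sigma^2/2}
mgf_closed = np.exp(s**2 * sigma**2 / 2)
print(mgf_numeric, mgf_closed)  # agree to numerical precision (~1.2776)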

In any case, returning to the Chernoff bound,

$$P[X \geq b] \leq e^{-bs}\,\mathrm{MGF}_X(s).$$

This is the Chernoff bound. Now, we could pick any $s > 0$ for which the MGF is defined. In fact, we could choose our $s$ based on what $b$ we are given. In particular, we can often optimize the bound, selecting $s$ as a function of $b$ to find the 'best' bound. What does 'best' mean here?

$$P[X \geq b] \leq \min_{s > 0}\, e^{-bs}\,\mathrm{MGF}_X(s).$$


This is also called the Chernoff bound. Since it is the best bound available for each $b$, this optimized form is the one generally used. My advice: don't memorize this one; just remember the original bound, and that it holds for any $s > 0$.
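When no closed-form minimizer is available, the optimization over $s$ can be done numerically. A sketch (assuming scipy; I use an exponential r.v. with rate $\lambda$, whose MGF is $\lambda/(\lambda - s)$ for $s < \lambda$, purely as an illustration):

import numpy as np
from scipy.optimize import minimize_scalar

lam = 1.0  # rate of the exponential r.v. (example choice)

def chernoff_bound(b):
    # Minimize e^{-bs} * MGF(s) over 0 < s < lam; the MGF of Exp(lam)
    # is lam/(lam - s) and is only defined for s < lam.
    obj = lambda s: np.exp(-b * s) * lam / (lam - s)
    res = minimize_scalar(obj, bounds=(1e-9, lam - 1e-9), method='bounded')
    return res.fun

for b in (2.0, 5.0, 10.0):
    exact = np.exp(-lam * b)  # exact tail: P[X > b] = e^{-lam*b}
    print(f"b={b}: exact={exact:.3e}, optimized Chernoff={chernoff_bound(b):.3e}")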

Example: Chernoff bound for Gaussian r.v. tail probabilities. Let $X$ be zero-mean Gaussian with variance $\sigma^2$. For $b \geq 0$, find $P[X > b]$ using the Chernoff bound.

Solution: Analytically,

$$P[X > b] = P[X/\sigma > b/\sigma] = Q(b/\sigma).$$

The Chernoff bound uses the MGF of the zero-mean, variance-$\sigma^2$ Gaussian, $\mathrm{MGF}_X(s) = e^{s^2\sigma^2/2}$, and is thus given by

$$P[X > b] \leq e^{-bs}\,\mathrm{MGF}_X(s) = e^{-bs} e^{s^2\sigma^2/2}.$$

This bound holds for any $s > 0$. You can see that the bound decreases quickly in $b$, as $e^{-bs}$ (more quickly than the Chebyshev bound's $1/k^2$). Finding the best upper bound means taking the lowest upper bound:

$$P[X > b] \leq \min_{s > 0}\, e^{-bs} e^{s^2\sigma^2/2} = \min_{s > 0}\, e^{-bs + (\sigma^2/2)s^2}.$$

The minimum is found by minimizing the exponent, $-bs + (\sigma^2/2)s^2$. Taking the derivative,

$$0 = \frac{\partial}{\partial s}\left[-bs + (\sigma^2/2)s^2\right] = -b + \sigma^2 s,$$

which is true at $s = b/\sigma^2$. So,

$$P[X > b] \leq e^{-b^2/\sigma^2 + (1/2)b^2/\sigma^2} = \exp\left(-\frac{b^2}{2\sigma^2}\right).$$

This result is plotted in Figure 3.


[Figure 3: Chernoff bound for $P[X > b]$ vs. the exact solution $Q(b/\sigma)$, for a zero-mean Gaussian r.v. $X$, plotted against $b/\sigma$.]
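A sketch reproducing the comparison in Figure 3 (assuming scipy; norm.sf computes $Q$):

import numpy as np
from scipy.stats import norm

ratios = np.arange(0.5, 5.01, 0.5)   # values of b/sigma
exact = norm.sf(ratios)              # exact tail Q(b/sigma)
chernoff = np.exp(-ratios**2 / 2)    # bound exp(-b^2/(2 sigma^2))
for r, e, c in zip(ratios, exact, chernoff):
    print(f"b/sigma={r:.1f}: exact={e:.3e}, Chernoff bound={c:.3e}")

Both curves decay as $\exp(-b^2/(2\sigma^2))$; the Chernoff bound captures the exponential decay of the true tail, unlike the polynomially decaying Chebyshev bound.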
