Statistics 512 Notes 8: The Monte Carlo Method

The t-test

Let $X_1, \ldots, X_n$ be iid with mean $\mu$ and unknown distribution. Consider the hypotheses

$H_0: \mu = \mu_0$ vs. $H_1: \mu > \mu_0$

If the distribution of the $X_i$ is normal (with unknown variance), then a test with exact size 0.05 is to use the test statistic

$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$

and the rejection region $T \ge t_{\alpha,n-1}$ with $\alpha = 0.05$ [where $t_{\alpha,n-1}$ is the $1-\alpha$ quantile of the t-distribution with n-1 degrees of freedom, i.e., $P(t_{n-1} > t_{\alpha,n-1}) = \alpha$]. This is called the t-test.
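The test just described can be sketched in a few lines of R. This is a minimal illustration, assuming the usual one-sample statistic $\sqrt{n}(\bar{x} - \mu_0)/s$ and an upper-tail rejection region; the helper name `t_test_upper` and the seed are made up for this example.

```r
# One-sided, one-sample t-test of H0: mu = mu0 against H1: mu > mu0.
# t_test_upper is a hypothetical helper name, not part of the notes.
t_test_upper <- function(x, mu0 = 0, alpha = 0.05) {
  n <- length(x)
  tstat <- sqrt(n) * (mean(x) - mu0) / sd(x)  # (xbar - mu0) / (s / sqrt(n))
  tcrit <- qt(1 - alpha, df = n - 1)          # 1 - alpha quantile of t with n-1 df
  list(statistic = tstat, critical = tcrit, reject = tstat >= tcrit)
}

set.seed(512)       # arbitrary seed, only for reproducibility
x <- rnorm(20)      # normal data with mean 0, so H0: mu = 0 is true
res <- t_test_upper(x)
print(res)
```

The statistic should agree with the one reported by R's built-in `t.test(x, alternative = "greater")`.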

When the distribution of $X_i$ is normal, the test has exact size $\alpha$ because when $\mu = \mu_0$, $T$ has a t-distribution with n-1 degrees of freedom.

When the distribution of $X_i$ is not normal, the test does not necessarily have exact size 0.05. However, for large n, under $H_0$,

$T = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \xrightarrow{D} N(0,1)$

because of the Central Limit Theorem, so that the t-test has approximate size 0.05 for large samples for any distribution of $X_i$ with finite variance.

Note the difference between the rejection rules $T \ge z_\alpha$ and $T \ge t_{\alpha,n-1}$, where $z_\alpha$ is the $1-\alpha$ quantile of the standard normal distribution. The large sample rule $T \ge z_\alpha$ has approximate size $\alpha$, while $T \ge t_{\alpha,n-1}$ has exact size $\alpha$. Of course, for the exact size we have to assume that $X_i$ has a normal distribution. In practice, we may not be willing to assume that the population is normal. In general t-critical values are larger than z critical values (i.e., $t_{\alpha,n-1} > z_\alpha$) so the t-test is conservative relative to the large sample test. So in practice, many statisticians use the t-test even if they do not believe the data are normally distributed. Note that $t_{\alpha,n-1} \to z_\alpha$ as $n \to \infty$.
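The relationship between the two critical values can be checked numerically. The sketch below tabulates the upper 0.05 critical values of the t-distribution for a few sample sizes (chosen arbitrarily for illustration) next to the standard normal critical value.

```r
alpha <- 0.05
n <- c(5, 10, 20, 50, 100, 1000)       # illustrative sample sizes

tcrit <- qt(1 - alpha, df = n - 1)     # t critical values, n - 1 degrees of freedom
zcrit <- qnorm(1 - alpha)              # standard normal critical value

crit <- data.frame(n = n, t_critical = tcrit, z_critical = zcrit)
print(crit)
```

Every t critical value in the table exceeds the z critical value, and the gap shrinks as n grows.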

How well does the t-test work in moderate sized samples when the data is not normal, i.e., what is its true size in moderate sized samples?

Example 5.8.5: Consider the following contaminated normal distribution: 75% of the time an observation is generated by a standard normal distribution while 25% of the time it is generated by a normal distribution with mean 0 and standard deviation 25. We call this distribution contaminated normal distribution A. Suppose a random sample of size 20 is generated from contaminated normal distribution A. The mean of $X_i$ is 0 so $H_0: \mu = 0$ is true.
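One way to draw from contaminated normal distribution A is to flip a biased coin for each observation and scale a standard normal draw accordingly. The sketch below takes that approach; the function name `rcn_A`, its argument names, and the seed are assumptions made for this illustration.

```r
# Draw n observations from a contaminated normal distribution:
# with probability 1 - eps the draw is N(0, 1),
# with probability eps it is N(0, sigmac^2).
# rcn_A is a hypothetical helper name, not part of the notes.
rcn_A <- function(n, eps = 0.25, sigmac = 25) {
  contaminated <- rbinom(n, 1, eps)          # 1 marks a contaminated observation
  z <- rnorm(n)                              # standard normal draws
  z * ifelse(contaminated == 1, sigmac, 1)   # rescale the contaminated ones
}

set.seed(512)                                # arbitrary seed
x <- rcn_A(1e5)
c(mean = mean(x), variance = var(x))
```

The population variance of this mixture is 0.75(1) + 0.25(625) = 157, so the sample variance of a large draw should be close to that value.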

What is the true size of the nominal size 0.05 t-test (reject the null hypothesis when $T \ge t_{0.05,19}$, which would give size 0.05 for a normal distribution) for random samples of size 20 from contaminated normal distribution A? Let $f(x)$ denote the density of contaminated normal distribution A and let $T(x_1, \ldots, x_{20}) = \frac{\bar{x}}{s/\sqrt{20}}$ denote the t-statistic computed from a sample $x_1, \ldots, x_{20}$ (with $\mu_0 = 0$).

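The true size of the t-test for contaminated normal distribution A is

$\int \cdots \int I(x_1, \ldots, x_{20})\, f(x_1) \cdots f(x_{20})\, dx_1 \cdots dx_{20}$ (1)

where $I(x_1, \ldots, x_{20})$ = 1 if $T(x_1, \ldots, x_{20}) \ge t_{0.05,19}$ and 0 otherwise. We can write (1) as

$E[I(X_1, \ldots, X_{20})]$

where the expectation is with respect to random samples from contaminated normal distribution A.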

The Monte Carlo method:

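Consider a function $g(\mathbf{X})$ of a random vector $\mathbf{X}$ where $\mathbf{X}$ has density $f(\mathbf{x})$. Consider the expected value of $g(\mathbf{X})$:

$E[g(\mathbf{X})] = \int g(\mathbf{x}) f(\mathbf{x})\, d\mathbf{x}$.

Suppose we take an iid random sample $\mathbf{X}_1, \ldots, \mathbf{X}_m$ from the density $f(\mathbf{x})$.

Then by the law of large numbers

$\frac{1}{m} \sum_{i=1}^{m} g(\mathbf{X}_i) \xrightarrow{P} E[g(\mathbf{X})]$ as $m \to \infty$.

The Monte Carlo method is to estimate $E[g(\mathbf{X})]$ by $\bar{g}_m = \frac{1}{m} \sum_{i=1}^{m} g(\mathbf{X}_i)$.

The standard error of the estimate is $s_g/\sqrt{m}$, where $s_g^2 = \frac{1}{m-1} \sum_{i=1}^{m} (g(\mathbf{X}_i) - \bar{g}_m)^2$.

By the Central Limit Theorem, an approximate 95% confidence interval for $E[g(\mathbf{X})]$ is

$\bar{g}_m \pm 1.96\, \frac{s_g}{\sqrt{m}}$.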
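The recipe above (draw, average, attach a standard error and a confidence interval) can be written as a small generic R function. This is a sketch under stated assumptions: `mc_estimate`, its arguments `g`, `rsample`, and `m`, and the toy target $E[Z^2] = 1$ for standard normal $Z$ are all chosen for illustration and do not come from the notes.

```r
# Generic Monte Carlo estimate of E[g(X)].
#   g       : function applied to one simulated draw
#   rsample : function of no arguments returning one draw of X
#   m       : number of Monte Carlo draws
# mc_estimate is a hypothetical helper name.
mc_estimate <- function(g, rsample, m) {
  vals <- replicate(m, g(rsample()))   # g(X_1), ..., g(X_m)
  est <- mean(vals)                    # Monte Carlo estimate
  se <- sd(vals) / sqrt(m)             # standard error of the estimate
  list(estimate = est, se = se,
       ci = c(lower = est - 1.96 * se, upper = est + 1.96 * se))
}

set.seed(512)                          # arbitrary seed
out <- mc_estimate(g = function(x) x^2,
                   rsample = function() rnorm(1),
                   m = 10000)
print(out)
```

Because the standard error shrinks like $1/\sqrt{m}$, halving the width of the interval takes roughly four times as many draws.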

Example: Monte Carlo estimation of $\pi$

Define the unit square as a square centered at (0.5,0.5) with sides of length 1 and the unit circle as the circle centered at the origin with a radius of length 1. The ratio of the area of the unit circle that lies in the first quadrant to the area of the unit square is $\pi/4$.

Let $U_1$ and $U_2$ be iid uniform (0,1) random variables. Let $X$ = 1 if $(U_1, U_2)$ is in the unit circle, i.e., $U_1^2 + U_2^2 < 1$, and 0 otherwise. Then $E(X) = \pi/4$.

Monte Carlo method: Repeat the experiment of drawing iid uniform (0,1) random variables $U_1$ and $U_2$ n times, obtaining $X_1, \ldots, X_n$, and estimate $\pi$ by $4\bar{X} = \frac{4}{n} \sum_{i=1}^{n} X_i$.
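The standard error of this estimate follows from the fact that each indicator of landing inside the unit circle (call it $X_i$) is a Bernoulli random variable. A short derivation, writing $p = \pi/4$ for the success probability and $\hat{p} = \bar{X}$ for the observed proportion (notation assumed here for illustration):

```latex
\begin{align*}
\operatorname{Var}(4\bar{X}) &= 16\,\operatorname{Var}(\bar{X}) = \frac{16\,p(1-p)}{n}, \\
\widehat{\operatorname{SE}}(4\bar{X}) &= 4\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \\
\text{approximate 95\% interval for } \pi &: \quad
  4\hat{p} \pm 1.96 \cdot 4\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}.
\end{align*}
```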

In R, the command runif(n) draws n iid uniform (0,1) random variables.

Here is a function for estimating pi

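piest=function(n){
#
# Obtains the estimate of pi and its standard
# error for the simulation discussed in Example 5.8.1
#
# n is the number of simulations
#
  u1=runif(n);
  u2=runif(n);
  cnt=rep(0,n);
  chk=u1^2+u2^2-1;
  cnt[chk<0]=1;                  # 1 if (u1,u2) is inside the unit circle
  est=mean(cnt);                 # estimates pi/4 (this line and the next two are reconstructed)
  se=4*sqrt(est*(1-est)/n);      # standard error of 4*est
  list(estimate=4*est,standarderror=se);
}

The next function estimates the true size of the t-test in Example 5.8.5. Only its final lines survive in these notes; the function name and the lines through the computation of ttest are a reconstruction from the description of the example.

empalphacn=function(nsims){
# Estimates the true size of the nominal size 0.05 t-test for samples
# of size 20 from contaminated normal distribution A (Example 5.8.5)
# nsims is the number of simulations
  n=20; eps=0.25; sigmac=25; alpha=0.05;
  tc=qt(1-alpha,n-1);            # critical value
  ic=0;
  for(i in 1:nsims){
    ind=rbinom(n,1,eps);         # 1 marks a contaminated observation
    samp=rnorm(n)*(1-ind+sigmac*ind);
    ttest=sqrt(n)*mean(samp)/sd(samp);
    if(ttest>tc){
      ic=ic+1;
    }
  }
  empalp=ic/nsims;
  err=1.96*sqrt((empalp*(1-empalp))/nsims);
  list(empiricalalpha=empalp,error=err);
}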

Generating random observations with given cdf F

Theorem 5.8.1: Suppose the random variable U has a uniform (0,1) distribution. Let F be a continuous cdf. Then the random variable $X = F^{-1}(U)$ has cdf F.
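When F is strictly increasing, the theorem follows from $P(F^{-1}(U) \le x) = P(U \le F(x)) = F(x)$. As an illustration, the exponential distribution with rate $\lambda$ has cdf $F(x) = 1 - e^{-\lambda x}$, so $F^{-1}(u) = -\log(1-u)/\lambda$. The sketch below uses that inverse to generate exponential draws from uniforms; the function name `rexp_inv`, the rate of 2, and the seed are arbitrary choices for this example.

```r
# Inverse-cdf sampling from an exponential distribution with the given rate.
# rexp_inv is a hypothetical helper name, not part of the notes.
rexp_inv <- function(n, rate = 1) {
  u <- runif(n)            # U ~ uniform(0,1)
  -log(1 - u) / rate       # F^{-1}(U) for F(x) = 1 - exp(-rate * x)
}

set.seed(512)              # arbitrary seed
x <- rexp_inv(1e5, rate = 2)
c(sample_mean = mean(x), theoretical_mean = 1 / 2)
```

The sample mean should be close to the theoretical mean $1/\lambda = 0.5$, and a histogram of `x` should resemble the density of `rexp(1e5, rate = 2)`.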
