Sampling Distributions - CUHKMAA



Sampling Distributions

Introduction

Distribution of Mean

• Finite Population

Chi-Square Distribution

T Distribution

F Distribution

Order Statistics

Introduction

• Statistics concerns itself mainly with conclusions and predictions resulting from chance outcomes that occur in carefully planned experiments or investigations.

o These chance outcomes constitute a subset, or sample, of measurements or observations from a larger set of values called the population.

o In the continuous case they are usually values of independent and identically distributed random variables, whose common distribution is referred to as the population distribution, or the infinite population sampled.

• If X1, X2... Xn are independent and identically distributed random variables, we say that they constitute a random sample from the infinite population given by their common distribution.

o If $f(x_1, x_2, \ldots, x_n)$ is the value of the joint distribution of such a set of random variables at $(x_1, x_2, \ldots, x_n)$, then $f(x_1, x_2, \ldots, x_n) = f(x_1)\,f(x_2)\cdots f(x_n)$, where $f$ is their common density.

• Statistical inferences are usually based on statistics, that is, on random variables that are functions of a set of random variables X1, X2... Xn constituting a random sample. Typical of these “statistics” are:

o If X1, X2... Xn constitute a random sample, then

$\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$

is called the sample mean and

$S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$

is called the sample variance.
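As a small numerical sketch of these two definitions (the data values below are made up for illustration), computed with numpy:

```python
import numpy as np

# Hypothetical sample values, for illustration only.
sample = np.array([4.6, 5.1, 4.8, 5.3, 4.9, 5.0])
n = len(sample)

# Sample mean: X_bar = (1/n) * sum of the X_i
x_bar = sample.sum() / n

# Sample variance: S^2 = (1/(n-1)) * sum of (X_i - X_bar)^2
s_squared = ((sample - x_bar) ** 2).sum() / (n - 1)

print(x_bar)       # matches sample.mean()
print(s_squared)   # matches sample.var(ddof=1), i.e. the n - 1 divisor
```

Note the n - 1 divisor in the sample variance; it makes $S^2$ an unbiased estimator of the population variance.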

Distribution of Mean

• Since statistics are themselves random variables, their values will vary from sample to sample, and it is customary to refer to their distributions as sampling distributions.

• Sampling distribution of the mean: If X1, X2... Xn constitute a random sample from an infinite population with the mean $\mu$ and the variance $\sigma^2$, then $E(\bar{X}) = \mu$ and $\mathrm{var}(\bar{X}) = \dfrac{\sigma^2}{n}$.

o $E(\bar{X}) = E\left(\dfrac{1}{n}\sum_{i=1}^{n} X_i\right) = \dfrac{1}{n}\sum_{i=1}^{n} E(X_i) = \mu$

o $\mathrm{var}(\bar{X}) = \dfrac{1}{n^2}\sum_{i=1}^{n} \mathrm{var}(X_i) = \dfrac{\sigma^2}{n}$ (using the independence of the Xi)

o It is customary to write $E(\bar{X})$ as $\mu_{\bar{X}}$ and $\mathrm{var}(\bar{X})$ as $\sigma^2_{\bar{X}}$, and to refer to $\sigma_{\bar{X}} = \sigma/\sqrt{n}$ as the standard error of the mean.

• Central limit theorem: If X1, X2... Xn constitute a random sample from an infinite population with the mean $\mu$, the variance $\sigma^2$, and the moment-generating function $M_X(t)$, then the limiting distribution of

$Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

as $n \to \infty$, is the standard normal distribution.

o Sometimes, the central limit theorem is interpreted incorrectly as implying that the distribution of $\bar{X}$ approaches a normal distribution when $n \to \infty$. This is incorrect because $\mathrm{var}(\bar{X}) \to 0$ when $n \to \infty$.

o On the other hand, the central limit theorem does justify approximating the distribution of $\bar{X}$ with a normal distribution having the mean $\mu$ and the variance $\sigma^2/n$ when n is large (i.e. $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is approximately standard normal, regardless of the actual shape of the population sampled).

o If the population is normal, the distribution of $\bar{X}$ is a normal distribution regardless of the size of n: If $\bar{X}$ is the mean of a random sample of size n from a normal population with the mean $\mu$ and the variance $\sigma^2$, its sampling distribution is a normal distribution with the mean $\mu$ and the variance $\sigma^2/n$.
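The theorem is easy to illustrate by simulation. The sketch below (assuming numpy is available; the exponential population and the sample size are arbitrary choices) standardizes the means of many samples drawn from a deliberately non-normal population:

```python
import numpy as np

rng = np.random.default_rng(0)

# Population: exponential with mean mu = 1 and variance sigma^2 = 1,
# chosen because it is clearly non-normal (strongly skewed).
mu, sigma, n = 1.0, 1.0, 50

# 100,000 random samples of size n, one per row.
samples = rng.exponential(scale=mu, size=(100_000, n))

# Standardize each sample mean: Z = (X_bar - mu) / (sigma / sqrt(n)).
z = (samples.mean(axis=1) - mu) / (sigma / np.sqrt(n))

# By the central limit theorem, Z is approximately standard normal.
print(z.mean())  # near 0
print(z.std())   # near 1
```

A histogram of `z` would look close to the standard normal bell curve even though each individual observation is exponential.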

Distribution of Mean: Finite Population (without replacement)

• If an experiment consists of selecting one or more values from a finite set of numbers $c_1, c_2, \ldots, c_N$, this set is referred to as a finite population of size N.

• If X1 is the first value drawn from a finite population of size N, X2 is the second value drawn... Xn is the nth value drawn, and the joint probability distribution of these n random variables is given by

$f(x_1, x_2, \ldots, x_n) = \dfrac{1}{N(N-1)\cdots(N-n+1)}$

for each ordered n-tuple of values of these random variables, then X1, X2... Xn are said to constitute a random sample from the given finite population.

o The probability for each subset of n of the N elements of the finite population (regardless of the order) is

$\dfrac{n!}{N(N-1)\cdots(N-n+1)} = \dfrac{1}{\binom{N}{n}}$

This is often given as an alternative definition, or as a criterion for the selection of a random sample of size n from a finite population of size N: each of the $\binom{N}{n}$ possible samples must have the same probability.

• The marginal distribution of Xr is given by $f(x_r) = \dfrac{1}{N}$ for $x_r = c_1, c_2, \ldots, c_N$, for $r = 1, 2, \ldots, n$.

• The mean and the variance of the finite population $c_1, c_2, \ldots, c_N$ are $\mu = \dfrac{1}{N}\sum_{i=1}^{N} c_i$ and $\sigma^2 = \dfrac{1}{N}\sum_{i=1}^{N} (c_i - \mu)^2$.

• The joint marginal distribution of any two of the random variables X1, X2, ..., Xn is given by $g(x_r, x_s) = \dfrac{1}{N(N-1)}$ for each ordered pair of elements of the finite population.

• If Xr and Xs are the rth and sth random variables of a random sample of size n drawn from the finite population $c_1, c_2, \ldots, c_N$, then $\mathrm{cov}(X_r, X_s) = -\dfrac{\sigma^2}{N-1}$.

• If $\bar{X}$ is the mean of a random sample of size n from a finite population of size N with the mean $\mu$ and the variance $\sigma^2$, then $E(\bar{X}) = \mu$ and $\mathrm{var}(\bar{X}) = \dfrac{\sigma^2}{n}\cdot\dfrac{N-n}{N-1}$.

o Note that $\mathrm{var}(\bar{X})$ differs from that of an infinite population only by the finite population correction factor $\dfrac{N-n}{N-1}$. When N is large compared to n, the difference between the two formulas for $\mathrm{var}(\bar{X})$ is usually negligible.
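The correction factor can be checked by simulating sampling without replacement (a numpy sketch; the population values and sample size are arbitrary choices, and sorting random keys per row is just one way to draw distinct indices uniformly):

```python
import numpy as np

rng = np.random.default_rng(1)

# A finite population of N = 20 values (arbitrary choice).
population = np.arange(1.0, 21.0)
N, n = len(population), 5

mu = population.mean()
sigma2 = ((population - mu) ** 2).mean()  # divide by N, per the definition

# Draw 50,000 samples of size n WITHOUT replacement: sorting random
# keys in each row selects n distinct indices uniformly at random.
keys = rng.random((50_000, N))
idx = np.argsort(keys, axis=1)[:, :n]
means = population[idx].mean(axis=1)

# var(X_bar) should match (sigma^2 / n) * (N - n) / (N - 1).
theoretical = (sigma2 / n) * (N - n) / (N - 1)
print(means.var(), theoretical)  # the two should nearly agree
```

Without the correction factor the prediction would be $\sigma^2/n = 6.65$ here, visibly larger than the simulated variance.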

The Chi-Square Distribution

If X has the standard normal distribution, then $X^2$ has the chi-square distribution with $\nu = 1$ degree of freedom. Thus, the chi-square distribution plays an important role in problems of sampling from normal populations.

• A random variable X has a chi-square distribution, and it is referred to as a chi-square random variable, if and only if its probability density is given by

$f(x) = \dfrac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\, x^{\nu/2 - 1} e^{-x/2}$

for x > 0, and f(x) = 0 elsewhere; the parameter $\nu$ is the number of degrees of freedom.

o The mean and variance of the chi-square distribution are given by $\mu = \nu$ and $\sigma^2 = 2\nu$, and its moment-generating function is given by $M_X(t) = (1 - 2t)^{-\nu/2}$ for $t < \tfrac{1}{2}$.

• If X1, X2... Xn are independent random variables having standard normal distributions, then $Y = \sum_{i=1}^{n} X_i^2$ has the chi-square distribution with $\nu = n$ degrees of freedom.

• If X1, X2... Xn are independent random variables having chi-square distributions with $\nu_1, \nu_2, \ldots, \nu_n$ degrees of freedom, then $Y = \sum_{i=1}^{n} X_i$ has the chi-square distribution with $\nu_1 + \nu_2 + \cdots + \nu_n$ degrees of freedom.

• If X1 and X2 are independent random variables, X1 has a chi-square distribution with $\nu_1$ degrees of freedom, and X1 + X2 has a chi-square distribution with $\nu > \nu_1$ degrees of freedom, then X2 has a chi-square distribution with $\nu - \nu_1$ degrees of freedom.

• If $\bar{X}$ and $S^2$ are the mean and the variance of a random sample of size n from a normal population with the mean $\mu$ and the variance $\sigma^2$, then

1. $\bar{X}$ and $S^2$ are independent;

2. the random variable $\dfrac{(n-1)S^2}{\sigma^2}$ has a chi-square distribution with n - 1 degrees of freedom.

o Proof for part 2: Begin with the identity

$\sum_{i=1}^{n} (X_i - \mu)^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 + n(\bar{X} - \mu)^2$

then divide each term by $\sigma^2$ and substitute $(n-1)S^2$ for $\sum_{i=1}^{n} (X_i - \bar{X})^2$:

$\sum_{i=1}^{n} \left(\dfrac{X_i - \mu}{\sigma}\right)^2 = \dfrac{(n-1)S^2}{\sigma^2} + \left(\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}\right)^2$

The term on the left-hand side is a random variable having a chi-square distribution with n degrees of freedom, the second term on the right-hand side is a random variable having a chi-square distribution with 1 degree of freedom, and since $\bar{X}$ and S are independent (not proved here), it follows that the first term on the right-hand side is a random variable having a chi-square distribution with n - 1 degrees of freedom.

• When the number of degrees of freedom is greater than 30, the probabilities related to chi-square distributions are usually approximated with normal distributions: If X is a random variable having a chi-square distribution with ν degrees of freedom and ν is large, the distribution of $\sqrt{2X} - \sqrt{2\nu - 1}$, and alternatively that of $\dfrac{X - \nu}{\sqrt{2\nu}}$, can be approximated with the standard normal distribution.
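A quick simulation check of the result that $(n-1)S^2/\sigma^2$ is chi-square with n - 1 degrees of freedom, using the fact that a chi-square variable has mean ν and variance 2ν (a numpy sketch; the population parameters and sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Repeated samples of size n from a normal population (parameters arbitrary).
mu, sigma, n = 10.0, 2.0, 8
samples = rng.normal(mu, sigma, size=(100_000, n))

# Form (n - 1) * S^2 / sigma^2 for each sample.
s2 = samples.var(axis=1, ddof=1)
chi2_values = (n - 1) * s2 / sigma**2

# Chi-square with nu = n - 1 = 7 degrees of freedom:
# mean nu and variance 2 * nu.
print(chi2_values.mean())  # near 7
print(chi2_values.var())   # near 14
```

The same array could also be used to eyeball the normal approximations above, though ν = 7 is well below the range where they are recommended.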

The t Distribution

In realistic applications the population standard deviation $\sigma$ is unknown. This makes it necessary to replace $\sigma$ with an estimate, usually the sample standard deviation S.

• If Y and Z are independent random variables, Y has a chi-square distribution with $\nu$ degrees of freedom, and Z has the standard normal distribution, then the distribution of

$T = \dfrac{Z}{\sqrt{Y/\nu}}$

is given by

$f(t) = \dfrac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \dfrac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}$

for $-\infty < t < \infty$, and is called the t distribution with $\nu$ degrees of freedom.

• If $\bar{X}$ and $S^2$ are the mean and the variance of a random sample of size n from a normal population with the mean $\mu$ and the variance $\sigma^2$, then

$T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$

has the t distribution with n - 1 degrees of freedom.

o The random variables $Y = \dfrac{(n-1)S^2}{\sigma^2}$ and $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ have, respectively, a chi-square distribution with n - 1 degrees of freedom and the standard normal distribution. Since they are also independent, substitution into the formula for $T = \dfrac{Z}{\sqrt{Y/\nu}}$ yields the result.

• When the number of degrees of freedom is 30 or more, probabilities related to the t distribution are usually approximated with the use of normal distributions.
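Simulating the statistic $T = (\bar{X} - \mu)/(S/\sqrt{n})$ shows its t behavior directly (a numpy sketch; parameters are arbitrary choices, and the variance formula ν/(ν - 2) for the t distribution is a standard fact not stated above):

```python
import numpy as np

rng = np.random.default_rng(3)

# Repeated samples of size n from a normal population (parameters arbitrary).
mu, sigma, n = 0.0, 3.0, 6
samples = rng.normal(mu, sigma, size=(100_000, n))

# T = (X_bar - mu) / (S / sqrt(n)); note that sigma cancels out of T.
t = (samples.mean(axis=1) - mu) / (samples.std(axis=1, ddof=1) / np.sqrt(n))

# t with nu = n - 1 = 5 degrees of freedom: mean 0 and variance
# nu / (nu - 2) = 5/3, i.e. heavier tails than the standard normal.
print(t.mean())  # near 0
print(t.var())   # near 5/3
```

The variance being well above 1 at ν = 5 is exactly why the normal approximation is reserved for 30 or more degrees of freedom.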

The F Distribution

• If U and V are independent random variables having chi-square distributions with $\nu_1$ and $\nu_2$ degrees of freedom, then

$F = \dfrac{U/\nu_1}{V/\nu_2}$

is a random variable having an F distribution, namely, a random variable whose probability density is given by

$g(f) = \dfrac{\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\dfrac{\nu_1}{\nu_2}\right)^{\nu_1/2} f^{\frac{\nu_1}{2}-1} \left(1 + \dfrac{\nu_1}{\nu_2} f\right)^{-\frac{\nu_1+\nu_2}{2}}$

for $f > 0$ and $g(f) = 0$ elsewhere.

• If $S_1^2$ and $S_2^2$ are the variances of independent random samples of sizes $n_1$ and $n_2$ from normal populations with the variances $\sigma_1^2$ and $\sigma_2^2$, then

$F = \dfrac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$

is a random variable having an F distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.

o $\dfrac{(n_1-1)S_1^2}{\sigma_1^2}$ and $\dfrac{(n_2-1)S_2^2}{\sigma_2^2}$ are values of independent random variables having chi-square distributions with $n_1 - 1$ and $n_2 - 1$ degrees of freedom. Substitution of these values for U and V yields the result.

• Applications of the F distribution arise in problems in which the variances of two normal populations are compared.
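The variance-ratio statistic can be simulated directly (a numpy sketch; population parameters and sample sizes are arbitrary choices, and the mean formula ν₂/(ν₂ - 2) for the F distribution is a standard fact not stated above):

```python
import numpy as np

rng = np.random.default_rng(4)

# Independent samples from two normal populations (parameters arbitrary).
sigma1, sigma2, n1, n2 = 1.0, 2.0, 9, 13
s1 = rng.normal(0.0, sigma1, size=(100_000, n1)).var(axis=1, ddof=1)
s2 = rng.normal(0.0, sigma2, size=(100_000, n2)).var(axis=1, ddof=1)

# F = (S1^2 / sigma1^2) / (S2^2 / sigma2^2)
f = (s1 / sigma1**2) / (s2 / sigma2**2)

# F with nu1 = 8 and nu2 = 12 degrees of freedom has mean
# nu2 / (nu2 - 2) = 1.2 (valid for nu2 > 2).
print(f.mean())  # near 1.2
```

Dividing each sample variance by its own population variance is what makes the ratio pivotal, so the same F distribution applies whatever $\sigma_1^2$ and $\sigma_2^2$ happen to be.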

Order Statistics

• Consider a random sample of size n from an infinite population with a continuous density, and suppose we arrange the values of X1, X2... Xn according to size. If we look upon the smallest of the x's as a value of the random variable Y1, the next largest as a value of the random variable Y2, the next largest after that as a value of the random variable Y3, ..., and the largest as a value of the random variable Yn, we refer to these Y's as order statistics.

o In particular, Y1 is the first order statistic, Y2 is the second order statistic, and so on. (As the population is infinite and continuous, the probability that any two of the x's will be alike is zero.)

• For random samples of size n from an infinite population which has the value f(x) at x, the probability density of the rth order statistic Yr is given by

$g_r(y_r) = \dfrac{n!}{(r-1)!\,(n-r)!}\,[F(y_r)]^{\,r-1}\, f(y_r)\,[1 - F(y_r)]^{\,n-r}$

for $-\infty < y_r < \infty$, where $F$ is the distribution function of the population.
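For the uniform population on (0, 1) this density reduces to a beta density with $E(Y_r) = r/(n+1)$, a standard fact (not stated above) that a simulation can confirm (a numpy sketch; n and r are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)

# Samples of size n from the uniform distribution on (0, 1); the rth
# order statistic then has a beta distribution with E(Y_r) = r / (n + 1).
n, r = 9, 3
samples = np.sort(rng.uniform(size=(100_000, n)), axis=1)
y_r = samples[:, r - 1]  # rth smallest value in each sample

print(y_r.mean())  # near r / (n + 1) = 0.3
```

Intuitively, the n sample values split (0, 1) into n + 1 gaps of equal expected length, so the rth smallest sits at about r/(n + 1) on average.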

• For large n, the sampling distribution of the median for random samples of size 2n + 1 is approximately normal with the mean $\tilde{\mu}$ (the population median) and the variance $\dfrac{1}{8[f(\tilde{\mu})]^2\, n}$.
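This approximation can be checked against a standard normal population, whose median is 0 with density $f(0) = 1/\sqrt{2\pi}$ there (a numpy sketch; the choice of population and of n is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(6)

# Medians of samples of size 2n + 1 from a standard normal population:
# population median mu_tilde = 0 and f(mu_tilde) = 1 / sqrt(2 * pi).
n = 50
medians = np.median(rng.normal(size=(100_000, 2 * n + 1)), axis=1)

f_at_median = 1.0 / np.sqrt(2.0 * np.pi)
theoretical_var = 1.0 / (8.0 * f_at_median**2 * n)  # = pi / (4 * n)

print(medians.mean())                  # near 0
print(medians.var(), theoretical_var)  # approximately equal
```

Note that the approximate variance $\pi/(4n)$ here is larger than the variance $1/(2n+1)$ of the sample mean, reflecting the median's lower efficiency for normal populations.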
