Andrew Boutros



2.2 Normal Distributions (Bell Curve)

In many natural processes, random variation conforms to a particular probability distribution known as the normal distribution, one of the most commonly observed probability distributions. The normal curve was first used in the 1700s by French mathematicians and in the early 1800s by the German mathematician and physicist Carl Friedrich Gauss. The curve is therefore also known as the Gaussian distribution and is sometimes called a bell curve.

Normal curves

← Curves that are symmetric, single-peaked, and bell-shaped. They are used to describe normal distributions.

← The mean is at the center of the curve.

← The standard deviation controls the spread of the curve.

← The bigger the standard deviation, the wider the curve.

← There are roughly 6 widths of standard deviation in a normal curve, 3 on one side of center and 3 on the other side.

Normal curves all have the same overall shape, described by the mean (μ) and standard deviation (σ).

Empirical Rule (68/95/99.7 Rule)

68% of observations are within 1 σ of μ (approx.!!! Really .6827)

95% of observations are within 2 σ of μ

99.7% of observations are within 3 σ of μ

Questions about “area”, “percent”, and “relative frequency” are all answered with this rule.
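The 68/95/99.7 percentages are rounded values. The following is a minimal Python sketch (Python and the helper name `area_within` are assumptions for illustration; the handout itself uses Table A and a calculator) that computes the area within 1, 2, and 3 standard deviations of the mean on the standard normal curve:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sigma of mu: {area_within(k):.4f}")
```

Because every normal curve has the same shape once it is standardized, these areas apply to any N(μ, σ).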


EXAMPLE 1: The distribution of the heights of women is normal with a mean of 64.5 inches and a standard deviation of 2.5 inches. What percent of women are in the following ranges?

1) P(x < 64.5) = 2) P(x < 69.5) = 3) P(x > 62) =

4) P(x > 57) = 5) P(57 < x < 67) = 6) P(59.5 < x < …) =

Step one – state the problem in terms of x: P(x > 68) on N(64.5, 2.5). Draw and label a normal curve.

Step two – standardize x and label picture with z-score

z = (x − μ) / σ =

Step three – find the probability by using Table A, and the fact that the total area is equal to 1.


Step four – write a conclusion:

The proportion of young women that are _______________ than _____ inches is approximately ____________.

Use Table A to convert the following z-scores to probability. Draw a picture!!

1) P(z < 2.3) = 2) P(z < -1.52) = 3) P(z > -0.43) =

4) P(z > 3.1) = 5) P(-1.52 < z < 2.3) = 6) P(-3 < z < 3) =
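Table A lists the area to the left of a z-score. "Greater than" and "between" questions are handled by subtraction, since the total area is 1. The sketch below shows those three cases in Python; the function names are hypothetical, and `NormalDist().cdf` stands in for the table lookup. The z-values used are illustrative, not the exercise values.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def area_left(z):
    """P(Z < z): what Table A gives directly."""
    return STD_NORMAL.cdf(z)

def area_right(z):
    """P(Z > z): total area 1 minus the area to the left."""
    return 1.0 - area_left(z)

def area_between(z_low, z_high):
    """P(z_low < Z < z_high): larger left-area minus smaller left-area."""
    return area_left(z_high) - area_left(z_low)

print(round(area_left(1.25), 4))
print(round(area_right(1.25), 4))
print(round(area_between(-1.0, 1.0), 4))
```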

Example of whole process:

A man’s wife is pregnant and due in 100 days. The distribution of the delivery day is approximately normal with mean 100 and standard deviation 8. The man has a business trip and will return in 85 days, then has to go on another business trip in 107 days.

What is the probability that the birth will occur before his second trip?

1. Told births follow an approximately normal distribution.

2. Want:

3. Compute:

Now have: 107

Table gives:

Or

on calculator: normalcdf(–10000, 107, 100, 8) gives _____________

4. There is about an ________ chance that the baby will be born before the second business trip.
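One way to check the table and calculator work for this example is a short Python sketch. It assumes the N(100, 8) model stated in the problem; the variable names are illustrative. It standardizes x = 107, looks up the z-score rounded to two places (as Table A would), and compares that with the unrounded calculation.

```python
from statistics import NormalDist

births = NormalDist(mu=100, sigma=8)   # model stated in the problem
x = 107                                # day of the second trip

# Step 3: standardize x, then find the area to the left of z.
z = (x - births.mean) / births.stdev
p_table = NormalDist().cdf(round(z, 2))  # Table A uses z rounded to 2 places
p_exact = births.cdf(x)                  # plays the role of normalcdf(-10000, 107, 100, 8)

print(f"z = {z:.3f}")
print(f"table-style probability: {p_table:.4f}")
print(f"unrounded probability:   {p_exact:.4f}")
```

The two probabilities differ slightly only because the table forces z to two decimal places.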

EXAMPLE #3:

For 14-year-old boys, cholesterol levels are ~ N(170, 30).

a) What percent of boys have a level of 240 or more?

b) What percent of boys are between 170 and 240?

Normal Distribution Calculations

Process:

1) Normality: the table is only for normal (or at least approximately normal) distributions.

2) State the problem in terms of x and draw the curve. Label with µ, σ, x.

3) Standardize with a new graph (turn x into z).

4) Use Table A or the calculator: normalcdf(lowerbound, upperbound, µ, σ)

5) Answer the question (remember, if the distribution is only approximately normal, you have an approximate probability).
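The calculator step can be mirrored in Python. The sketch below defines a hypothetical `normalcdf` helper with the same argument order as the calculator command and applies it to the Example #3 model N(170, 30). It is an illustration under that assumption, not a replacement for showing the four steps.

```python
import math
from statistics import NormalDist

def normalcdf(lower, upper, mu, sigma):
    """Area under N(mu, sigma) between lower and upper (calculator-style argument order)."""
    dist = NormalDist(mu=mu, sigma=sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# Example #3: cholesterol levels ~ N(170, 30)
p_240_or_more = normalcdf(240, math.inf, 170, 30)
p_170_to_240 = normalcdf(170, 240, 170, 30)

print(f"P(X >= 240)      = {p_240_or_more:.4f}")
print(f"P(170 < X < 240) = {p_170_to_240:.4f}")
```

Using `math.inf` as a bound replaces the "very large number" trick (such as –10000) that the calculator needs.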

HW p 118 1 – 4, p 121 6 – 8 (not 7d)

Finding a Data Value from a z-score:

z = (x − μ) / σ        x = μ + zσ

He is able to cancel his second business trip, and his boss tells him that he can return home from his first trip so that there is a ________ chance that he will make it back for the birth. When must he return home?

1. Told that distribution of births is approximately normal

Hint: use the table backwards (on the calculator, use invNorm(area, mean, SD))

2. We are given the probability and we want the raw score (day to return). First, remember that if there is a __________ chance that he will make it on time, then there is a _______ chance that he will not (table gives only values “less than”).

Probability statement: P(X < ? ) = _________

Use the table in reverse—find a z-score that gives .01 as the probability.

z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
-2.4   .0082  .0080  .0078  .0075  .0073  .0071  .0069  .0068  .0066  .0064
-2.3   .0107  .0104  .0102  .0099  .0096  .0094  .0091  .0089  .0087  .0084
-2.2   .0139  .0136  .0132  .0129  .0125  .0122  .0119  .0116  .0113  .0110

Search for the probability value that is closest to __________ and find __________ and __________. Since __________ is closer to _________, use this value.

The corresponding z-score is -2.33. Now find the x that produces this z.

3. Have: -2.33 = (x − 100) / 8

x =

or

on calculator: invNorm( )

4. He must return from his business trip in _____ days.

Note: All four steps must still be shown. The calculator is only replacing the z-calculation.
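The "table backwards" step can also be sketched in Python. The sketch uses the left-tail probability .01 and the N(100, 8) model from the example above; `inv_cdf` plays the role of invNorm, and the variable names are assumptions for illustration.

```python
from statistics import NormalDist

births = NormalDist(mu=100, sigma=8)
p_left = 0.01                      # area to the left of the unknown day

# Table in reverse: find the z-score whose left-area is p_left.
z = NormalDist().inv_cdf(p_left)

# Unstandardize: x = mu + z * sigma.
x_from_z = births.mean + z * births.stdev

# One-step version, analogous to invNorm(area, mean, SD).
x_direct = births.inv_cdf(p_left)

print(f"z = {z:.3f}")
print(f"x from z-score: {x_from_z:.2f}")
print(f"x directly:     {x_direct:.2f}")
```

The z-score here is not rounded to two places, so the result can differ slightly from the hand calculation with z = -2.33.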

EXAMPLE #4:

SAT-V scores are ~ N(505, 110)

1) How high must a student score to be at the 30th percentile?

2) How high must a student score to get in the top 10%?

3) What scores contain the middle 50% of scores?
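Each of these questions first has to be turned into an area to the left: the 30th percentile has 0.30 to its left, the top 10% starts where 0.90 is to the left, and the middle 50% runs from 0.25 to 0.75. A minimal Python sketch of that translation, assuming the N(505, 110) model above (variable names are illustrative):

```python
from statistics import NormalDist

sat_v = NormalDist(mu=505, sigma=110)

# Convert each question to "area to the left", then invert.
score_30th = sat_v.inv_cdf(0.30)            # 30th percentile
cutoff_top10 = sat_v.inv_cdf(0.90)          # top 10% begins here
middle50 = (sat_v.inv_cdf(0.25), sat_v.inv_cdf(0.75))  # middle 50%

print(f"30th percentile: {score_30th:.1f}")
print(f"top 10% cutoff:  {cutoff_top10:.1f}")
print(f"middle 50%:      {middle50[0]:.1f} to {middle50[1]:.1f}")
```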

HW p 142 29 – 30, p 147 31 – 36 (not 32b)

Assessing Normality

The normal table (and normalcdf) _________________________________________________ because if the distribution is not approximately normal, the probabilities will be wrong. Sometimes we are told we have a normal distribution. Sometimes we are given data and can use histograms (dotplots or stemplots) to check for normality. It is often easier to use normal probability plots and look for linearity: ________________________________________________________________________________________.

Normal probability plots give a visual way to determine if a distribution is approximately normal. These plots are produced by doing the following:

1. The data are arranged from smallest to largest.

2. The percentile of each data value is determined.

3. From these percentiles, normal calculations are done to determine their corresponding z-scores.

4. Each z-score is plotted against its corresponding data value.

If the distribution is close to normal, the plotted points will lie ____________________________________.
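The four steps can be sketched in Python as follows. The percentile convention (i − 0.5)/n is an assumption; software packages use slightly different conventions. The function name is hypothetical, and the sample is the first row of the Example #5 data in this handout.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Return (z, x) pairs for a normal probability plot."""
    ordered = sorted(data)                      # step 1: smallest to largest
    n = len(ordered)
    std_normal = NormalDist()
    points = []
    for i, x in enumerate(ordered, start=1):
        percentile = (i - 0.5) / n              # step 2: assumed convention
        z = std_normal.inv_cdf(percentile)      # step 3: percentile -> z-score
        points.append((z, x))                   # step 4: pair z with its data value
    return points

sample = [550, 561, 488, 507, 526, 555, 536, 529, 558, 565]
for z, x in normal_plot_points(sample):
    print(f"z = {z:+.2f}   x = {x}")
```

Plotting these pairs (z on one axis, x on the other) gives the normal probability plot.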

Systematic deviations from a line indicate a non-normal distribution. In the first example below, candy bar weights, an approximate normal distribution is shown.

Weights of Mounds Candy Bars

Computer output of a normal probability plot shows lines as boundaries—if the data falls within the lines, it is approximately normal.

In this example, the histogram and the normal probability plot both show that this data is not approximately normal.

Assessing Normality

Method 1: 1) make a histogram or stemplot to check for big outliers, skews, gaps, etc...

2) calculate x̄ ± s (and x̄ ± 2s, x̄ ± 3s), then use the 68/95/99.7 rule to see if the data are approximately normal.

Method 2: 1) make a normal probability plot (also called a normal quantile plot)

You have a plot of z vs. x

2) if the plot is close to a line, it is close to normal.

Using the calculator: STATPLOT, bottom right graph

right skew: largest observations are above a line drawn through the body of the data.

left skew: smallest observations are below the line.

EXAMPLE #5:

Is the following data normally distributed? Use both methods to check:

550 561 488 507 526 555 536 529 558 565

557 553 562 529 544 534 579 510 527 539

542 547 563 534 546 530 575 568 585 550
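For Method 1, the numerical part (counting how many observations fall within 1, 2, and 3 standard deviations of the mean) can be sketched in Python. The helper name `fraction_within` is hypothetical. The sketch does not replace the histogram or stemplot, which is still needed to look for outliers, skew, and gaps.

```python
from statistics import mean, stdev

scores = [
    550, 561, 488, 507, 526, 555, 536, 529, 558, 565,
    557, 553, 562, 529, 544, 534, 579, 510, 527, 539,
    542, 547, 563, 534, 546, 530, 575, 568, 585, 550,
]

xbar = mean(scores)
s = stdev(scores)        # sample standard deviation

def fraction_within(data, center, spread, k):
    """Fraction of data values within k * spread of center."""
    inside = [x for x in data if abs(x - center) <= k * spread]
    return len(inside) / len(data)

print(f"mean = {xbar:.2f}, s = {s:.2f}")
for k, target in zip((1, 2, 3), (0.68, 0.95, 0.997)):
    observed = fraction_within(scores, xbar, s, k)
    print(f"within {k}s: observed {observed:.3f}, rule says about {target}")
```

How close the observed fractions must be to 68/95/99.7 is a judgment call, especially with only 30 observations.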
