
Goal 1: To help practicing teachers get a fundamental understanding of t-tests and p-values in a short amount of time and apply it to biology data they obtain, without “too much math.”

Goal 2: Learn to use the TI-84 graphing calculator to do statistical analysis, find p-values, and understand what the results mean.

Goals for Day 2:

Develop an intuitive understanding of the binomial and the normal probability distributions, de-emphasizing math formulas;

Relate normal distributions to teaching and the biology experiment;

Get comfortable with characteristics of the normal distribution, such as spread (in standard deviations) and the idea that probability is an area.

Illustrate calculator usage for finding probabilities from probability distributions.

Develop an understanding of what statistically significant means in terms of probability.

Illustration of Probability Simulator on the TI-84 using the TI-SmartView software

Question:

Is it better to use actual manipulatives (like spinners and dice) with students, or software simulations? Does the age of the student matter?

Recall:

A Probability Distribution is a list of the outcomes from an experiment along with their respective probabilities.

Experiment: toss 2 coins

Probability Distribution:

|Number of heads |Probability |
|0 |1/4 |
|1 |1/2 |
|2 |1/4 |

The only rules for a probability distribution are that each probability must be a number between 0 and 1, and that all the probabilities must add to 1 (or 100%).
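As a quick check of the two rules, here is a minimal Python sketch that rebuilds the 2-coin table by listing the four equally likely outcomes. The variable names are illustrative only, and the coins are assumed to be fair.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# List the 4 equally likely outcomes of tossing 2 fair coins: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Count how many outcomes give 0, 1, or 2 heads.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# P(k heads) = (number of outcomes with k heads) / (total number of outcomes).
coin_distribution = {
    k: Fraction(n, len(outcomes)) for k, n in sorted(heads_counts.items())
}

for k, p in coin_distribution.items():
    print(k, p)

# Rule 1: every probability is between 0 and 1.
all_between_0_and_1 = all(0 <= p <= 1 for p in coin_distribution.values())
# Rule 2: the probabilities add to 1.
total = sum(coin_distribution.values())
print(all_between_0_and_1, total)
```

Using `Fraction` keeps the probabilities as exact fractions, so they can be compared directly with the table.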

Probability distributions may result from discrete or continuous data. We will look at a discrete probability distribution - the binomial distribution. Then we will look at a continuous probability distribution - the normal distribution.

Probability Distributions

|Discrete |Continuous |
|* Binomial |* Normal |
|Geometric |Chi-Square |
|Poisson | |

What is discrete data? What is continuous data?

discrete before continuous

Discrete = counts

Continuous = measurements

Discrete or continuous?

1. heights of third-graders in your class

2. number of students in each grade

3. number of siblings for each child

4. weights of students

5. teacher’s salary schedule

spinners, coins, dice = discrete data

All of the examples of probability distributions from Day 1 were discrete.

When an experiment (with discrete data) has exactly two outcomes, the probability is constant for each trial, and there are a fixed number of trials, the experiment is called binomial.

Tossing coins is a (discrete) binomial experiment.

1st toss    2nd toss

H           H
            T

T           H
            T

Notice that binomial experiments have two branches for each trial.

Experiment: toss 2 coins

Probability Distribution:

|Number of heads |Probability |
|0 |1/4 |
|1 |1/2 |
|2 |1/4 |

Bar graph for probability distribution

[Bar graph: bars over 0, 1, and 2 heads with heights 1/4, 1/2, and 1/4]

The notion that the sum of the areas of the bars in a probability distribution is equal to 100% is a key concept in statistics!

The experiment of rolling 2 dice is NOT binomial: you can easily see that there are more than two branches in the tree diagram.

Ex: Roll 2 dice

[Tree diagram: the 1st die has six branches (1 through 6), and each of those has six branches (1 through 6) for the 2nd die]

Experiment: roll 2 dice

Probability Distribution:

|Sum |P(sum) |
|1 |0 |
|2 |1/36 |
|3 |2/36 |
|4 |3/36 |
|5 |4/36 |
|6 |5/36 |
|7 |6/36 |
|8 |5/36 |
|9 |4/36 |
|10 |3/36 |
|11 |2/36 |
|12 |1/36 |
|13 |0 |

However, it is a probability distribution, and here’s the bar graph for rolling two dice. The sum of the areas of the bars = 100%.

[Bar graph: P(sum) for the sums 2 through 12]
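The two-dice table can be reproduced by counting outcomes. The following is a minimal Python sketch, assuming two fair six-sided dice; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely (1st die, 2nd die) pairs.
rolls = list(product(range(1, 7), repeat=2))

# How many of the 36 pairs produce each sum.
sum_counts = Counter(a + b for a, b in rolls)

# P(sum) = (number of pairs with that sum) / 36, for sums 2 through 12.
dice_distribution = {
    s: Fraction(sum_counts[s], len(rolls)) for s in range(2, 13)
}

for s in dice_distribution:
    print(f"{s:2d}  {sum_counts[s]}/36")

# The bars together account for all of the probability.
total_probability = sum(dice_distribution.values())
print(total_probability)
```

Sums of 1 and 13 never appear among the 36 pairs, which is why the table lists their probability as 0.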

How do we find the probabilities for the table?

We could use tree diagrams . . . very messy . . .

[Tree diagram: items 1 through 5, each branching into R (right) and W (wrong)]

or we could use rules of probability to find probabilities . . . . very tedious . . .

Ex: Prob(exactly 1 right) = P(RWWWW) + P(WRWWW) + P(WWRWW) + P(WWWRW) + P(WWWWR) =

¼ • ¾ • ¾ • ¾ • ¾ + ¼ • ¾ • ¾ • ¾ • ¾ + ¼ • ¾ • ¾ • ¾ • ¾ + ¼ • ¾ • ¾ • ¾ • ¾ + ¼ • ¾ • ¾ • ¾ • ¾ =

And we’d have to do this for each entry in the table!!
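The tedious rules-of-probability approach can be handed to a computer. This minimal Python sketch assumes what the ¼ and ¾ factors above suggest: 5 items, each answered right with probability 1/4, independently of the others. All names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumptions read off the worked example: 5 items, P(right) = 1/4 on each.
P_RIGHT = Fraction(1, 4)
P_WRONG = 1 - P_RIGHT
N_ITEMS = 5

# Walk every branch of the tree (all 2^5 = 32 sequences of R and W),
# multiply along the branch, and add the result to the matching table row.
prob_by_number_right = {k: Fraction(0) for k in range(N_ITEMS + 1)}
for sequence in product("RW", repeat=N_ITEMS):
    prob = Fraction(1)
    for answer in sequence:
        prob *= P_RIGHT if answer == "R" else P_WRONG
    prob_by_number_right[sequence.count("R")] += prob

for k, p in prob_by_number_right.items():
    print(k, p, round(float(p), 4))
```

The row for exactly 1 right is the sum of the five products written out above.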

Note: the binomial formula is a generalization of the rules of probability:

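$$P(x) = \binom{n}{x}\, p^{x}\, (1-p)^{n-x}$$

where n is the number of trials, p is the probability of success on each trial, and x is the number of successes.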
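For readers who want to see the binomial formula in action, here is a minimal Python sketch. The function name `binomial_probability` is made up for illustration, and the inputs n = 5 and p = 1/4 are assumed from the worked example.

```python
from math import comb


def binomial_probability(n, p, x):
    """Probability of exactly x successes in n trials, each with success probability p."""
    # comb(n, x) counts the branches of the tree with exactly x successes;
    # p**x * (1 - p)**(n - x) is the probability of any one such branch.
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Assumed example: exactly 1 right out of 5 items, with P(right) = 1/4.
prob_one_right = binomial_probability(5, 0.25, 1)
print(round(prob_one_right, 4))
```

The arguments are in the order n, p, x so that a call reads the same way as the formula's symbols.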

Another, easier alternative is to let the TI-84 calculator “do the math.”

DISTR –> binompdf(n, p, x)

|# right |Prob(# right) |
|0 | |
|1 | |
|2 | |
|3 | |
|4 | |
|5 | |

And the graph of the distribution might look like this:

[Bar graph: one bar for each number right, 0 through 5]

But the point here is NOT to get bogged down in calculator distributions (although you may want to explore them further on your own), and not to get bogged down in the math, but rather to understand this:

The graphs of probability distributions show 100% of the probability, which is equivalent to the sum of the areas of the bars. The area of each bar represents the probability for the particular value of the random variable.

With this main idea in mind, let’s move on to a continuous probability distribution, the normal distribution.

NORMAL DISTRIBUTION

Characteristics:

1. Continuous data (or data treated as continuous)

2. Symmetric

3. Virtual range (about 99.7% of data) within 3 standard deviations on each side of the mean (a spread of 6 st. dev.)

4. Total area under the curve = 100%

5. Probability = area under curve between 2 data values
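The "virtual range" and "probability = area" ideas can be checked numerically. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `area_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the curve within k standard deviations on each side of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
```

The area within 3 standard deviations comes out near 99.7%, and the same proportions hold for any normal distribution, whatever its mean and standard deviation.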

Notation:

x̄ = mean

s = standard deviation

With continuous data, we can’t make a table as we did for discrete data. Instead, we draw the graph of the distribution, but instead of a bar graph, we use a frequency curve.

We are going to do two problems, just to give you a sense of how probability is related to the normal distribution.

Heights of women are normally distributed with a mean of 63.6 in. and a standard deviation of 2.5 in. (based on information from the National Health Survey). The U.S. Army requires women's heights to be between 58 in. and 80 in. Find the percentage of women meeting that height requirement. Are many women being denied the opportunity to join the Army because they are too short or too tall?

To solve the problem, we need to know the area (which is the probability) between 58 and 80 under the curve.

In calculus-based statistics, you integrate to find the area between two points under a curve. CALCULUS!! Argh!!

We are going to let the TI-84 calculator “do the math.”

DISTR –> normalcdf(lower, upper, mean, st. dev.)
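As a cross-check outside the calculator, the same area can be found with a minimal Python sketch. It uses `NormalDist` from the standard `statistics` module with the mean and standard deviation given in the problem; the variable names are illustrative only.

```python
from statistics import NormalDist

# Women's heights as stated in the problem: mean 63.6 in., st. dev. 2.5 in.
heights = NormalDist(mu=63.6, sigma=2.5)

# Area under the curve between 58 in. and 80 in.
# = (area to the left of 80) - (area to the left of 58).
prob_meets_requirement = heights.cdf(80) - heights.cdf(58)
print(f"{prob_meets_requirement:.4f}")
```

Subtracting two left-hand areas is the same idea as giving the calculator a lower and an upper bound.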

Cans of regular Coke are labeled as containing 12 oz. The contents are normally distributed with a mean of 12.19 oz. and a standard deviation of 0.11 oz. (based on information from the Coca-Cola Co.). What percentage of cans contain less than the 12 oz. printed on the label?

For this problem, the question of interest is: Are many consumers being cheated?
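A minimal Python sketch of the same calculation, using the mean and standard deviation stated in the problem (the variable names are illustrative only):

```python
from statistics import NormalDist

# Can contents as stated in the problem: mean 12.19 oz., st. dev. 0.11 oz.
can_contents = NormalDist(mu=12.19, sigma=0.11)

# "Less than 12 oz." is the area under the curve to the left of 12.
prob_under_label = can_contents.cdf(12)
print(f"{prob_under_label:.4f}")
```

Here there is no lower bound to type in, since the area to the left of 12 runs all the way down the left tail.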

The Normal Probability Distribution helps us decide what results are typical and what results are unusual. For example:

If you knew the heights of plants are normally distributed (mean 13, st. dev. 2), then how unusual would it be to get a plant that grew 22 inches tall?

Now, a little twist on our thinking . . . what if you are told the probability of winning Lotto is 1/13000000, and yet you buy a ticket and . . . YOU WIN?

Is it significant?

In Statistics, we use the word significant to describe getting a result or outcome that we didn’t expect to get (because the probability was so small).

So a plant that grows to 22 inches, when the heights of plants are normally distributed with a mean 13 and a st. dev. 2, is statistically significant because the probability of getting such a plant is low.
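To put a number on "low," here is a minimal Python sketch for the plant example, assuming the stated normal distribution (mean 13, st. dev. 2). The variable names are illustrative only.

```python
from statistics import NormalDist

# Plant heights as stated in the example: mean 13 in., st. dev. 2 in.
plant_heights = NormalDist(mu=13, sigma=2)

# How many standard deviations above the mean is a 22-inch plant?
z_score = (22 - 13) / 2

# Probability of a plant 22 inches or taller = area to the right of 22.
prob_22_or_taller = 1 - plant_heights.cdf(22)

print(z_score)
print(f"{prob_22_or_taller:.7f}")
```

A 22-inch plant sits 4.5 standard deviations above the mean, well outside the 3-standard-deviation virtual range, so the area to its right is only a few millionths.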

Connected Math – page 8

Quality ratings for natural and regular peanut butter

If you make a decision based on looking at graphs, you can’t be sure if the differences you see are typical sampling fluctuation, or a significant difference in quality ratings. Statistics helps you find the probability, which helps you make an informed decision!

Suppose two teachers compare their students' ISAT scores and see that one class has lower scores than the other. Should the teacher whose class has lower scores be concerned?

What if the two teachers were told that these results were just random sampling fluctuation, and that there was no statistically significant difference between the classes?

Or maybe a teacher notices that many of her students are heavier than students in the other classes of the same grade. Should she be worried about her class being overweight?

What if she were told that having a class with this average weight was very unusual (had a very low probability), i.e., statistically significant? Now should she worry?
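To get a feel for "random sampling fluctuation" in the two-class comparison, here is a minimal simulation sketch in Python. Every number in it is hypothetical and chosen only for illustration (class size 25, score mean 200, st. dev. 30, a 5-point gap); none of it is real ISAT data. Both simulated classes are drawn from the same population, so any gap between their averages is pure chance.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical values, for illustration only (not real ISAT data).
CLASS_SIZE = 25
POP_MEAN, POP_SD = 200, 30
GAP = 5          # a class-average difference a teacher might notice
REPEATS = 2000   # number of simulated pairs of classes


def simulated_class():
    """Scores for one class, drawn at random from the same normal population."""
    return [random.gauss(POP_MEAN, POP_SD) for _ in range(CLASS_SIZE)]


# Count how often two classes from the SAME population differ by GAP or more.
large_gaps = 0
for _ in range(REPEATS):
    difference = mean(simulated_class()) - mean(simulated_class())
    if abs(difference) >= GAP:
        large_gaps += 1

fraction_large_gap = large_gaps / REPEATS
print(fraction_large_gap)
```

Under these made-up settings, a gap of 5 points or more shows up in roughly half of the simulated pairs, so a gap of that size alone would not be statistically significant.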

Explain what it means to you now when you hear the words “statistically significant.”

OK, so the idea of knowing probabilities can help us make good decisions.

And if an outcome is statistically significant, that means you got a result which had a low probability of occurring.

And we have some understanding of the normal distribution and finding probabilities using it.

But now you might be thinking … not all data is normal. The plant data you will be collecting is not necessarily normal. So what’s the fuss about knowing the normal distribution????

- stay tuned for tomorrow’s lesson!

[Figure: course roadmap with the stages Probability Fundamentals, Continuous probability distribution, Sampling distributions & CLT, p-value approach, and 2-sample t-tests]
