Data Analysis 1016-319



Chapter 14: Inference for Tables

14.1: Chi-square Goodness-of-fit

According to M&M’s, Milk Chocolate M&M’s are manufactured in the following color percentages:

|Brown |Yellow |Red |Blue |Orange |Green |

|13% |14% |13% |24% |20% |16% |

Form groups of 3 or 4 students to work together…

Get a bag of M&M’s for your group from the instructor.

Count the number of M&M’s of each color in your bag.

• Determine the total number of M&Ms in your bag.

|Color |Brown |Yellow |Red |Blue |Orange |Green |Total |

|# M&M’s | | | | | | | |

Does your sample provide sufficient evidence to conclude that the color percentages given by M&M’s are incorrect?

STATE

▪ Determine the population type

▪ Describe (in words) the population characteristics (the p’s)

▪ State H0 and Ha (using the p’s)

▪ Set a reasonable level for α

PLAN

▪ Write the formula of the test statistic

▪ Describe the sample (fill in the “observed” row in the table below)

▪ Check that the sample meets the necessary assumptions

o Compute the expected counts (write your results in the table below)

o Verify that each of the expected counts is ≥ 5

|Color |Brown |Yellow |Red |Blue |Orange |Green |Total |

|Observed # M&M’s | | | | | | | |

|Expected # M&M’s | | | | | | | |

DO

▪ Compute the value of the test statistic using the formula from part B

▪ Compute the p-value

CONCLUDE

▪ Reject H0 OR Fail to Reject H0

▪ Make a concluding statement

Solution to Goodness-of-Fit Test

A. Categorical Population (categories = Brown, Yellow, Red, Blue, Orange, Green)

p1 = proportion of Milk Chocolate M&M’s that are Brown

p2 = proportion of Milk Chocolate M&M’s that are Yellow

p3 = proportion of Milk Chocolate M&M’s that are Red

p4 = proportion of Milk Chocolate M&M’s that are Blue

p5 = proportion of Milk Chocolate M&M’s that are Orange

p6 = proportion of Milk Chocolate M&M’s that are Green

H0: p1 = 0.13, p2 = 0.14, p3 = 0.13, p4 = 0.24, p5 = 0.20, p6 = 0.16

Ha: H0 is not true, so at least one color proportion differs from its hypothesized value

B. α = 0.05, χ² = Σ (observed − expected)² / expected

C. Cell Counts – each group will have different results…

Example results are shown below:

|Color |Brown |Yellow |Red |Blue |Orange |Green |Total |

|Observed # M&M’s |5 |11 |6 |8 |9 |14 |53 |

|Expected # M&M’s |6.89 |7.42 |6.89 |12.72 |10.60 |8.48 |53 |

All expected counts are ≥ 5, so the sample size is large enough
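The expected counts in part C can be checked with a few lines of code. The sketch below is only an illustration in Python (the activity itself is done by hand or with a calculator), and the variable names are chosen here for convenience. Each expected count is the bag total multiplied by the claimed proportion for that color.

```python
# Claimed color proportions and the example observed counts from part C.
claimed = {"Brown": 0.13, "Yellow": 0.14, "Red": 0.13,
           "Blue": 0.24, "Orange": 0.20, "Green": 0.16}
observed = {"Brown": 5, "Yellow": 11, "Red": 6,
            "Blue": 8, "Orange": 9, "Green": 14}

n = sum(observed.values())  # total number of M&M's in the bag

# Expected count for each color = sample size * claimed proportion.
expected = {color: n * p for color, p in claimed.items()}

# Large-sample condition: every expected count must be at least 5.
all_large_enough = all(e >= 5 for e in expected.values())

for color, e in expected.items():
    print(f"{color:<7} expected = {e:.2f}")
print("All expected counts >= 5:", all_large_enough)
```

Replacing the values in `observed` with the counts from your own bag gives the expected row for your group.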

D. χ² = 7.95, df = 5, p-value = 0.159

E. Is p-value ≤ α? NO, so we FAIL TO REJECT H0

The data does not provide sufficient evidence to conclude that the color percentages given by M&M’s are incorrect.
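Parts D and E can also be reproduced without a calculator. The sketch below is a minimal Python version that assumes the example counts above. The helper `chi_square_sf` is a hypothetical name introduced here, not a library function; it builds the upper-tail chi-square probability from the standard recurrence for the regularized incomplete gamma function, using only the `math` module.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with df degrees of freedom."""
    # Base cases: df = 1 uses erfc, df = 2 is a plain exponential.
    if df % 2 == 1:
        k, q = 1, math.erfc(math.sqrt(x / 2))
    else:
        k, q = 2, math.exp(-x / 2)
    # Recurrence: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    while k < df:
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

claimed = [0.13, 0.14, 0.13, 0.24, 0.20, 0.16]  # Brown, Yellow, Red, Blue, Orange, Green
observed = [5, 11, 6, 8, 9, 14]

n = sum(observed)
expected = [n * p for p in claimed]

# Test statistic: sum of (observed - expected)^2 / expected over the six colors.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # number of categories minus 1
p_value = chi_square_sf(chi_sq, df)

alpha = 0.05
print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.3f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```

With the example counts, the statistic and p-value agree with part D to the precision shown there.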

14.2: Test of Homogeneity

There are several different types of M&M’s, including Milk Chocolate, Peanut, Peanut Butter, Almond, Crispy, etc. They all come in the same six colors – Brown, Yellow, Red, Blue, Orange, and Green. But are the percentages of each color the same for different types of M&M’s?

Split the class into groups. Each group will collect data from a different M&M population.

Each group will be given a bag of M&M’s from the instructor.

Count the number of M&M’s of each color in your bag.

• Determine the total number of M&Ms in your bag.

• Write your results on the board.

|Type of M&M’s: ____________________________ |

|Color |Brown |Yellow |Red |Blue |Orange |Green |Total |

|# M&M’s | | | | | | | |

Do these samples provide sufficient evidence to conclude that the color percentages are different for the different types of M&M’s?

STATE

▪ Determine the population type

▪ Describe the different populations

▪ State H0 and Ha

▪ Set a reasonable level for α

PLAN

▪ Write the formula of the test statistic

▪ Describe the samples (copy the observed values from the board into the upper portions of the cells in the table).

▪ Check that the sample meets the necessary assumptions.

o Compute the expected counts (write your results in the lower portions of the cells in the table)

o Verify that each of the expected counts is ≥ 5

|Type of M&M/Color |Brown |Yellow |Red |Blue |Orange |Green |Row Total |

|Milk Chocolate | | | | | | | |

| | | | | | | | |

|Peanut | | | | | | | |

| | | | | | | | |

|Peanut Butter | | | | | | | |

| | | | | | | | |

|Crispy | | | | | | | |

| | | | | | | | |

|Column Total | | | | | | | |

DO

▪ Compute the value of the test statistic using the formula from part B

▪ Compute the p-value.

CONCLUDE

▪ Reject H0 OR Fail to Reject H0

▪ Make a concluding statement

Solution to Test of Homogeneity

NOTE: This test can be done with just Milk Chocolate and Peanut. The example below shows a comparison of Milk Chocolate, Peanut, and Peanut Butter M&M’s.

HINT: For the larger size M&M’s (such as Peanut and Peanut Butter), use two 1.7 ounce bags of M&M’s as your sample. (For the smaller size M&M’s, one bag is sufficient.)

A. Categorical Populations (categories = Brown, Yellow, Red, Blue, Orange, Green)

Populations = Types of M&M’s = Milk Chocolate, Peanut, and Peanut Butter

H0: the color proportions are the same for Milk Chocolate, Peanut, and Peanut Butter M&M’s

Ha: the color proportions are NOT the same for Milk Chocolate, Peanut, and Peanut Butter M&M’s

B. α = 0.05, χ² = Σ (observed − expected)² / expected

C. Cell Counts – each class will have different results…

Example results are shown below:

|Type of M&M/Color |Brown |Yellow |Red |Blue |Orange |Green |Row Total |

|Milk Chocolate |5 |11 |6 |8 |9 |14 |53 |

| |8.95 |8.26 |9.64 |7.57 |7.57 |11.01 | |

|Peanut |7 |7 |10 |8 |4 |9 |45 |

| |7.60 |7.01 |8.18 |6.43 |6.43 |9.35 | |

|Peanut Butter |14 |6 |12 |6 |9 |9 |56 |

| |9.45 |8.73 |10.18 |8.00 |8.00 |11.64 | |

|Column Total |26 |24 |28 |22 |22 |32 |154 |

All expected counts are ≥ 5, so the sample sizes are large enough
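For a two-way table, each expected count is (row total × column total) / grand total. The sketch below is an illustrative Python check of the lower entries in the example table; the variable names are assumptions made here, and the totals are computed directly from the observed counts.

```python
# Example observed counts; columns are Brown, Yellow, Red, Blue, Orange, Green.
observed = {
    "Milk Chocolate": [5, 11, 6, 8, 9, 14],
    "Peanut":         [7, 7, 10, 8, 4, 9],
    "Peanut Butter":  [14, 6, 12, 6, 9, 9],
}

row_totals = {name: sum(counts) for name, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count = (row total * column total) / grand total.
expected = {
    name: [row_totals[name] * c / grand_total for c in col_totals]
    for name in observed
}

for name, row in expected.items():
    print(name, [round(e, 2) for e in row])

# Large-sample condition: the smallest expected count must be at least 5.
smallest = min(min(row) for row in expected.values())
print("Smallest expected count:", round(smallest, 2))
```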

D. χ² = 11.48, df = 10, p-value = 0.322

E. Is p-value ≤ α? NO, so we FAIL TO REJECT H0

The data does not provide sufficient evidence to conclude that the color proportions are NOT the same for Milk Chocolate, Peanut, and Peanut Butter M&M’s.
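The test statistic, degrees of freedom, and p-value for the example table can be sketched in Python as follows. This is an illustration only, with names chosen here. The degrees of freedom are (rows − 1) × (columns − 1) = 2 × 5 = 10, and because that number is even, the upper-tail chi-square probability has a simple closed form that needs nothing beyond the `math` module.

```python
import math

observed = [
    [5, 11, 6, 8, 9, 14],   # Milk Chocolate
    [7, 7, 10, 8, 4, 9],    # Peanut
    [14, 6, 12, 6, 9, 9],   # Peanut Butter
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (observed - expected)^2 / expected over every cell of the table.
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - exp) ** 2 / exp

# Degrees of freedom = (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(col_totals) - 1)

# Closed form, valid only for even df:
# P(X >= x) = exp(-x/2) * sum over j < df/2 of (x/2)^j / j!
half = chi_sq / 2
p_value = math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

alpha = 0.05
print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.3f}")
print("Reject H0" if p_value <= alpha else "Fail to reject H0")
```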

14.2: Test of Independence

The National Highway Traffic Safety Administration published the Motor Vehicle Occupant Safety Survey in March 2000. Based on their survey of seat belt use, they categorized 7357 drivers by the type of vehicle they drive and how often they wear their seat belt. The table below is based on their findings.

|Vehicle Type/Response |All of the Time |Most of the Time |Some of the Time |Rarely |Never |

|Car |3985 |494 |198 |103 |61 |

|Van/Minivan |563 |67 |26 |6 |19 |

|Pickup |730 |189 |99 |43 |66 |

|SUV |567 |85 |28 |7 |21 |

Does the data provide sufficient evidence to conclude that there is an association between vehicle type and seat belt usage?

STATE

▪ Describe the variables that categorize the population

▪ State H0 and Ha

▪ Set a reasonable level for α

PLAN

▪ Write the formula of the test statistic

▪ Describe the sample (determine the observed counts and write them in the table).

▪ Enter the table into matrix [A] in your calculator.

o TI-83/84: Press 2nd[MATRIX], select [A] and press [Enter]. Set the row and column dimensions, then enter the observed values into the matrix as they appear in the table of counts.

o TI-89: Press APPS, find the Data/Matrix Editor, and select New. Set the type as Matrix, name the variable a, specify the row and column dimensions, then press Enter. Type the observed values into the matrix as they appear in the table of counts.

▪ Check that the sample meets the necessary assumptions – this will be done in part D.

DO

▪ Use the χ²-Test (TI-83/84 STAT) or Chi2 2-way test (TI-89 Stats/List Editor) to compute the test statistic and p-value (and matrix of expected counts).

o Specify the matrix of observed counts ([A] for TI-83/84, a for TI-89)

o Specify where to store the matrix of expected counts ([B] for TI-83/84, b for TI-89)

o Write the results below.

▪ View the matrix of expected counts and verify that each value is ≥ 5

CONCLUDE

▪ Reject H0 OR Fail to Reject H0

▪ Make a concluding statement

Solution to Test of Independence

A. Categorical Variables:

Vehicle Type (categories = Car, Van/Minivan, Pickup, SUV)

Seat Belt Usage (categories = All of the Time, Most of the Time, Some of the Time, Rarely, Never)

H0: Vehicle Type and Seat Belt Usage are independent

Ha: Vehicle Type and Seat Belt Usage are NOT independent

B. α = 0.05, χ² = Σ (observed − expected)² / expected

C. (Entering data into calculator)

D. χ² = 229.83, p-value = 2.16E–42, df = 12

Expected Counts

|Vehicle Type/Response |All of the Time |Most of the Time |Some of the Time |Rarely |Never |

|Car |3846.08 |549.44 |230.96 |104.62 |109.89 |

|Van/Minivan |541.04 |77.29 |32.49 |14.72 |15.46 |

|Pickup |895.38 |127.91 |53.77 |24.36 |25.58 |

|SUV |562.49 |80.36 |33.78 |15.30 |16.07 |

All expected counts are ≥ 5, so the sample size is large enough

E. Is p-value ≤ α? YES, so we REJECT H0

The data provides sufficient evidence that there is an association between vehicle type and seat belt usage.
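For readers without a TI calculator, the same quantities the calculator reports (test statistic, df, p-value, and matrix of expected counts) can be computed with a short script. The sketch below is a minimal Python illustration under stated assumptions: `chi_square_two_way` and `chi_square_sf_even` are hypothetical helper names introduced here, and the p-value formula is a closed form that applies only when df is even (here df = 12).

```python
import math

def chi_square_two_way(table):
    """Return (statistic, df, expected) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = (row total * column total) / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df, expected

def chi_square_sf_even(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("closed form requires an even df")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

# Observed counts; columns are All, Most, Some of the Time, Rarely, Never.
seat_belt = [
    [3985, 494, 198, 103, 61],  # Car
    [563, 67, 26, 6, 19],       # Van/Minivan
    [730, 189, 99, 43, 66],     # Pickup
    [567, 85, 28, 7, 21],       # SUV
]

stat, df, expected = chi_square_two_way(seat_belt)
p_value = chi_square_sf_even(stat, df)
smallest = min(min(row) for row in expected)

print(f"chi-square = {stat:.2f}, df = {df}, p-value = {p_value:.2e}")
print(f"smallest expected count = {smallest:.2f}")
print("Reject H0" if p_value <= 0.05 else "Fail to reject H0")
```

Keeping the table as a list of rows mirrors the way the counts are typed into matrix [A] on the calculator, so the same layout works for any two-way table.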
