Statistics



Statistics Name_______________________________ Date____________

Final Exam Study Guide 0607

The final exam for this course will be based on Chapters 1-9, 12, and 19 of the Stats textbook. It is important that you review the study guides and quizzes for all of these topics. Reviewing homework problems is also a good strategy. Remember that this study guide is intended to help you know what to study, but it should not be your only resource.

1. Enter the following two sets of data into your calculator. Using STAT/EDIT, put the “TD” data in L1 and the “Interceptions” data in L2.

a. Calculate the median for the TD data (STAT/CALC/1) ___________

b. Calculate the five number summary for the TD data (STAT/CALC/1)

__________ __________ __________ __________ __________

__________ __________ __________ __________ __________

c. What is the mode of the “TD” data _____________

d. What is the interquartile range “IQR = Q3 – Q1” _____________

e. Calculate the lower outlier test limit (Q1 – 1.5IQR) _____________

f. List any outliers on the lower end ____________

g. Calculate the upper outlier test limit (Q3 + 1.5IQR) _____________

h. List any outliers on the upper end ____________

i. On your calculator, create a modified box plot for the Interception data and sketch it below (Make sure to set the window for XMIN and XMAX)

j. List any outliers _______________________

k. Describe the shape of the distribution (symmetric, skewed left, skewed right)

_______________________

l. On your calculator, create a histogram of TD data and sketch it below

(XMIN= 50, XMAX= 500, XSCL= 50, YMIN= 0, YMAX= 7, YSCL = 1)

m. Describe the modality of the data (unimodal, bimodal, etc.) ____________

n. Where is the center of the distribution? __________________
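
For checking calculator work, the five number summary and the 1.5 IQR outlier test from parts b through h can be sketched in Python. This is only an illustration: the numbers below are made up (they are not the TD data), and the function names are hypothetical. The quartiles are taken as the medians of the lower and upper halves of the sorted data, which is assumed here to match what the calculator reports; other software may use a different quartile rule.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Quartiles are the medians of the lower and upper halves of the
    sorted data; the middle value is left out when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]

def outlier_fences(q1, q3):
    """Return the lower and upper outlier test limits."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical values for illustration only (NOT the worksheet's TD data).
sample = [110, 150, 160, 175, 190, 200, 210, 230, 240, 420]

minimum, q1, med, q3, maximum = five_number_summary(sample)
low_limit, high_limit = outlier_fences(q1, q3)
outliers = [x for x in sample if x < low_limit or x > high_limit]

print("Five number summary:", minimum, q1, med, q3, maximum)
print("Outlier limits:", low_limit, high_limit)
print("Outliers:", outliers)
```

With these made-up values, Q1 is 160 and Q3 is 230, so the IQR is 70, the limits are 55 and 335, and only 420 is flagged as an outlier.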

2. Empirical Rule: Copy the TD data into L3 (Put cursor on L3 and enter L1 [2nd 1]). Then sort the data in L3 in ascending order (STAT/2/2nd 3)

a. Calculate the mean for the TD data (STAT/CALC/1) ___________

b. Calculate the standard deviation for the TD data (STAT/CALC/1)_______

c. What percentage of the data falls within one standard deviation of the mean (use the sorted data in L3)

__________________

d. To follow the empirical rule, what percentage of the data should fall within one standard deviation of the mean?

__________________

e. What percentage of the data falls within two standard deviations of the mean (use the sorted data in L3)

__________________

f. To follow the empirical rule, what percentage of the data should fall within two standard deviations of the mean?

__________________

g. What percentage of the data falls within three standard deviations of the mean (use the sorted data in L3)

__________________

h. To follow the empirical rule, what percentage of the data should fall within three standard deviations of the mean?

__________________

i. Does this data follow the empirical rule? Why or why not?

____________________________________________________________

j. What is the z-score for Joe Montana’s TD data __________________
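
The counting in parts c, e, and g, and the z-score in part j, can be double-checked with a short Python sketch. The data below are hypothetical (not the TD data), the function names are made up for illustration, and the sample standard deviation is assumed (the Sx value, not σx, from the 1-Var Stats screen).

```python
from statistics import mean, stdev

def fraction_within(values, k):
    """Fraction of the values within k sample standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    inside = sum(1 for x in values if abs(x - m) <= k * s)
    return inside / len(values)

def z_score(x, values):
    """Number of sample standard deviations that x lies from the mean."""
    return (x - mean(values)) / stdev(values)

# Hypothetical values for illustration only (NOT the worksheet's TD data).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

for k in (1, 2, 3):
    print(f"Within {k} SD: {fraction_within(sample, k):.0%}")

print("z-score of 9:", round(z_score(9, sample), 2))
```

Comparing the printed percentages against 68%, 95%, and 99.7% is one way to judge whether a data set roughly follows the empirical rule.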

3. Scatterplots and Regression: Clear the data in L3 (STAT/EDIT/Cursor on L3/Clear). The TD data in L1 are the x-values and the Interception data in L2 are the y-values (actual). This section involves using your calculator hint sheet.

a. What is the regression equation, y = ax +b (STAT/CALC/4)___________

b. What is the slope of the regression equation _________________

c. What is the y-intercept of the regression equation _________________

d. What is the correlation coefficient, r, for this data _________________

e. What is the strength of the correlation _________________

f. What is the proportion of variability ($r^2$) for this data _______________

g. Is the regression line going to be good, fair or poor for predicting values not included in the data

_______________

h. Put the predicted y-values (interceptions) in L3. Put cursor on L3 and enter a*L1 + b using your calculated a and b values.

i. Put the residual values in L4. Put the cursor on L4 and enter L2-L3

[Remember: the Residual equals Actual-y minus Predicted-y]

j. What is the predicted number of interceptions for Joe Montana _________

k. Based on his actual number of interceptions, 216, what is the

residual value for Joe Montana ____________

l. Using the calculator hint sheet, create a scatterplot using your calculator and sketch it below. Make sure the regression line shows up on your scatterplot. Use Zoom 9 to create the window

m. Using the least squares regression line or your calculator,

predict the number of interceptions given 104 TD’s ___________

n. Using the least squares regression line or your calculator,

predict the number of TD’s given 75 interceptions ___________

o. Which quarterback has the highest residual value

in absolute value ____________________

p. In terms of regression, is this person an outlier or an influential observation?

___________________________________

q. Fran Tarkenton has the most extreme TD or x-value. Eliminate Tarkenton from the data. What is the regression line without Tarkenton (STAT/CALC/4)

_____________________________

r. Redo the scatterplot with the new regression line in Y2. With respect to regression, what is Tarkenton considered and why?

___________________________________________________________
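
The regression steps in this section (slope, intercept, r, predicted values, residuals, and predictions in both directions) can be sketched in Python as a cross-check on the calculator. The x and y values are hypothetical, not the TD and Interception data, and `least_squares` is a made-up helper name. Solving the line for x mirrors the Intersect method on the hint sheet; it uses the same y = ax + b line rather than fitting a new one.

```python
from math import sqrt

def least_squares(xs, ys):
    """Return slope a, intercept b, and correlation r for y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    r = sxy / sqrt(sxx * syy)
    return a, b, r

# Hypothetical values for illustration only (NOT the worksheet's data).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, r = least_squares(xs, ys)

# Predicted y-values (like L3) and residuals = actual - predicted (like L4).
predicted = [a * x + b for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]

# Index of the point with the largest residual in absolute value.
largest = max(range(len(xs)), key=lambda i: abs(residuals[i]))

y_at_6 = a * 6 + b      # predict y given x = 6
x_at_4 = (4 - b) / a    # find the x where the same line reaches y = 4

print(f"y = {a:.2f}x + {b:.2f}, r = {r:.3f}, r^2 = {r * r:.3f}")
print("Residuals:", [round(e, 2) for e in residuals])
print("Largest residual is at x =", xs[largest])
print("Predicted y at x = 6:", round(y_at_6, 2))
print("x where line reaches y = 4:", round(x_at_4, 2))
```

With these made-up points the fitted line is approximately y = 0.6x + 2.2 with r² of about 0.6. Dropping a point with an extreme x-value and refitting, as in part q, only requires calling `least_squares` again on the shortened lists.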

4. Normal Distributions and the Bell Curve

• normalcdf ( low, high): second/vars/2

• invNorm(percent below): second/vars/3

a. Given the following, calculate the z-score

(SAT score 580, mean 600, standard deviation 100) _________

b. What percent of a normal model is found in each region? Draw a picture to help you with the answers.

i. [pic] __________________

ii. [pic] __________________

c. In a Standard Normal Model, what value(s) of z cut(s) off the region described. Draw a picture to help.

i. The highest 20% _______________

ii. The lowest 55% _______________
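
The two calculator functions listed at the top of this section have close counterparts in Python's standard library, which can be used to check answers. The sketch below uses `statistics.NormalDist`; the wrapper names `normalcdf` and `inv_norm` are hypothetical helpers chosen to echo the calculator, and the example numbers are made up rather than taken from the questions above.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def normalcdf(low, high, dist=standard_normal):
    """Area under the normal curve between low and high."""
    return dist.cdf(high) - dist.cdf(low)

def inv_norm(area_below, dist=standard_normal):
    """Value that has the given proportion of the model below it."""
    return dist.inv_cdf(area_below)

# Hypothetical example: a score of 650 when the mean is 500 and the SD is 100.
z = (650 - 500) / 100
print("z-score:", z)

# Proportion of a normal model between z = -1 and z = 1.
print("Area between -1 and 1:", round(normalcdf(-1, 1), 4))

# z-value that cuts off the highest 10% (90% of the model lies below it).
print("Cutoff for highest 10%:", round(inv_norm(0.90), 2))
```

Note that a "highest" region has to be converted to the percent below before calling `inv_norm`, just as with invNorm on the calculator.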

5. The following table is a list of populations (in millions) of the ten largest cities in the world and the predicted populations for the year 2000. Use the data to make a stem-and-leaf plot.

|City/Year |Tokyo |Mexico City |Sao Paulo |
|Yes |15 |18 |33 |
|No |25 |12 |37 |
|Total |40 |30 |70 |

a. Whether a student read the book or not is what type of variable:

__________________

b. Calculate the proportion of the students that are going to

the prom (two decimal places): ___________

c. Calculate the proportion of the students that are not going to

the prom (two decimal places): ___________

d. Sketch a marginal distribution for this data below:

e. For only the males, calculate the proportion that are going

to the prom ____________

f. For only the males, calculate the proportion that are not going

to the prom ____________

g. For only the females, calculate the proportion that are going

to the prom ____________

h. For only the females, calculate the proportion that are not going

to the prom ____________

i. Sketch a conditional distribution for this data below:
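
The difference between a marginal and a conditional distribution can be sketched in Python. The counts and group labels below are hypothetical (they are not the table from this problem), and the function names are made up for illustration. A marginal distribution uses the grand total as the denominator; a conditional distribution uses only one group's total.

```python
def marginal_distribution(table):
    """Proportion of ALL individuals giving each response."""
    grand_total = sum(sum(row.values()) for row in table.values())
    responses = next(iter(table.values())).keys()
    return {
        resp: sum(row[resp] for row in table.values()) / grand_total
        for resp in responses
    }

def conditional_distribution(table, group):
    """Proportion giving each response WITHIN one group only."""
    row = table[group]
    group_total = sum(row.values())
    return {resp: count / group_total for resp, count in row.items()}

# Hypothetical two-way table for illustration only.
table = {
    "group A": {"yes": 30, "no": 20},
    "group B": {"yes": 10, "no": 40},
}

print("Marginal:", marginal_distribution(table))
for group in table:
    print(f"Conditional on {group}:", conditional_distribution(table, group))
```

In this made-up table, 40% of everyone answers yes, but the proportion is 60% within group A and 20% within group B, which is the kind of contrast a conditional distribution sketch is meant to show.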

6. In the 1980s, there were two major studies about women’s attitudes toward relationships, love and sex. In the first study, Shere Hite distributed questionnaires through women’s groups. Of the 4500 women who returned the questionnaires, 96% said they give more emotional support to their mates than they get. In the second study, an ABC News/Washington Post poll surveyed a random sample of 767 women, finding that 44% claimed to give more emotional support than they receive.

a. Comment on whether or not you think either one of these studies’ sampling methods may be biased. If so, why?

b. Do you think the survey sizes would have any impact on the results?

c. Which study do you think would be more representative of women’s true feelings? Why?

d. In the Hite study, what is the parameter and what is the statistic?

e. In the ABC study, what is the parameter and what is the statistic?

7. A theoretical result states that the sample proportion will equal the population proportion ($\hat{p} = p$). In a Time/CNN random telephone poll of 1049 American adults taken on March 25, 2006, 54% of the voters stated that they would vote for Senator Hillary Clinton of New York over Senator Barack Obama of Illinois.

a. In this sample what is $\hat{p}$? ________________________

b. If the theoretical result holds true in this situation, what would be the proportion of American adults in the population that would vote for Clinton?

________________________

c. What is the standard deviation for this sample? __________________

d. Using the 95% confidence interval, calculate the range of percentages of voters that would be expected to vote for Clinton. (Remember that a 95% confidence interval means that 95% of the votes would fall between $-2$ and $+2$ standard deviations.)

______________________________
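
The standard deviation and 95% interval for a sample proportion can be sketched in Python. The numbers are hypothetical (not the poll in this problem) and the function names are made up. The sketch assumes the interval is taken as two standard deviations on either side of the sample proportion, following the empirical rule; many textbooks use 1.96 instead of 2, which gives a slightly narrower interval.

```python
from math import sqrt

def proportion_sd(p_hat, n):
    """Standard deviation of a sample proportion: sqrt(p_hat * (1 - p_hat) / n)."""
    return sqrt(p_hat * (1 - p_hat) / n)

def interval_95(p_hat, n):
    """Approximate 95% interval: p_hat plus or minus 2 standard deviations."""
    sd = proportion_sd(p_hat, n)
    return p_hat - 2 * sd, p_hat + 2 * sd

# Hypothetical poll for illustration only: 60% of 600 people say yes.
p_hat, n = 0.60, 600

sd = proportion_sd(p_hat, n)
low, high = interval_95(p_hat, n)

print(f"Standard deviation: {sd:.3f}")
print(f"95% interval: {low:.1%} to {high:.1%}")
```

For this made-up poll the standard deviation is about 0.02, so the interval runs from roughly 56% to 64%.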

Statistics: Final Exam Review Topics

In addition to the review packet and these topics, you should review all the quizzes and labs we did during this course

1. Five Number Summary: Min, Q1, Median, Q3, Max

2. Mean: average

3. Range/Spread: Max - Min

4. IQR: Q3 – Q1

5. Outliers/outlier test: Q1 – 1.5IQR; Q3+1.5IQR

6. Modified box plot on calculator (window): use calculator hint sheet

7. Shape of data: mound, skewed left, skewed right

8. Empirical rule

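a. 68% of the data is between $\text{mean} - 1\,\text{SD}$ and $\text{mean} + 1\,\text{SD}$

b. 95% of the data is between $\text{mean} - 2\,\text{SD}$ and $\text{mean} + 2\,\text{SD}$

c. 99.7% of the data is between $\text{mean} - 3\,\text{SD}$ and $\text{mean} + 3\,\text{SD}$

9. z-score: $z = \dfrac{x - \text{mean}}{\text{SD}}$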

10. Side-by-side stem plot

11. Scatterplot on calculator (window)

12. Strength and direction of association

13. Correlation coefficient: r

14. Least squares regression line (prediction equation ax + b – on calculator)

a. Use calculator hint sheet

15. Proportion of variability: $r^2$

16. Doing predictions using least squares regression line

a. Use calculator hint sheet

17. Residual value = actual value – fitted value

18. Outlier: for purposes of regression, the point with the largest residual (in absolute value) is an outlier

19. Influential observation: the point with an extreme x-value that changes regression line

20. Sampling: biased vs. random

21. 95% confidence interval calculation when sampling

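a. $p$ is the population proportion, as estimated by $\hat{p}$

b. $S_x = \sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$ is the standard deviation; $n$ is the number in the sample

c. The 95% confidence interval is between $\hat{p} - 2S_x$ and $\hat{p} + 2S_x$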

22. normalcdf ( low, high): second/vars/2

23. invNorm(percent below): second/vars/3

24. Causation: a lurking variable can be the reason two variables that do not cause each other are strongly associated

Stats calculator “hint” sheet for Scatterplots

Anything in bold is a setup command. Follow these steps:

1. Turn calculator ON

2. 2nd Y = 4 PLOTSOFF (enter) (it should then say DONE)

3. Y = (clear all equations)

4. Stat – 5 SetupEditor (enter) (it should say DONE)

5. Stat – 1 Edit (clear all lists L1, L2, L3, L4 etc by moving cursor to top of each list, hit clear button and hit enter)

6. Stat – 1 Edit (Type in data from the problem into lists)

7. Window (change window so it is appropriate for data)

8. 2nd Y = 1 (Turn it ON by hitting ENTER) (highlight first scatter graph and make sure L1 and L2 are used for Xlist and Ylist)

9. Graph

10. Stat – Calc – 4 LIN REG – enter

11. Y =

12. VARS – 5 Statistics – EQ – 1RegEQ

13. Graph

To locate a “Y” value given “X”, use the following commands:

14. 2nd TRACE – 1 value – (enter in appropriate value)

Note: Window may need to be adjusted

To locate an “X” value given “Y”, use the following commands:

15. Go to Y =

on Y2 = (type in y-value)

Graph

Note: Window may need to be adjusted

2nd TRACE – 5 Intersect (Enter Enter Enter) (yes, hit enter 3 times)
