Scatterplots and Regression Models



Curve Fitting

Name _________________________________ Date ___________________ Period _________

1. The table below gives the number of drive-in movie screens in the United States for 1988 – 1999.

|Year |1988 |1989 |1990 |

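2. The table below gives the weight limit and the amount lifted, both in pounds, for nine weightlifters.

|Weight limit (lb) |Name |Country |Amount lifted (lb) |
|119 |Halil Mutlu |Turkey |633 |
|130 |Tang Ningsheng |China |678 |
|141 |Naim Sulemanoglu |Turkey |739 |
|154 |Zhan Xugang |China |787 |
|167.5 |Pablo Lara |Cuba |809 |
|183 |Pyrros Dimas |Greece |864 |
|200.5 |Alexi Petrov |Russia |886 |
|218 |Kakhi Kakhiasvili |Greece |926 |
|238 |Timur Taimazov |Russia |948 |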

Source: Sports Illustrated Almanac, 1997

a. Enter the data into your calculator. Label your x-list “weight” and your y-list “lifted”. Make a scatterplot of the data.

b. What type of function do the points seem to represent?

c. Calculate a linear regression equation.

i. Write the equation of the model (round to the nearest thousandth).

ii. What is the value of r²?

iii. The r² statistic describes how much of the variation in one variable can be accounted for by this straight-line relationship with another variable, with r² = 1 meaning 100%. How can you interpret the r² value for this data?

d. Now calculate a Quadratic Regression.

i. Write the equation of the model (round to the nearest thousandth).

ii. What is the value of r²?

iii. The r² statistic describes how much of the variation in one variable can be accounted for by this quadratic relationship with another variable, with r² = 1 meaning 100%. How can you interpret the r² value for this data?

e. Which model is a better fit for this data? Why?

f. Superimpose the best regression model onto the scatterplot. Are the points close to the line/curve? Based on what you found in part iii, is this surprising?

g. Use this equation to predict the amount lifted for a person who weighs 35 pounds and a person who weighs 400 pounds. Do you think your model is a helpful predictor outside the range of weight limits given? Why or why not?

Although the data in #2 appeared to have a linear relationship, performing a different type of regression gave a better fitting curve. It is important to always perform at least two regression tests to decide which model is best. Some common curves are found below. Notice that on small intervals, these curves can look very similar. Knowing how these curves behave can help in choosing the best regression model.
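As a rough illustration of comparing two regressions on the data from #2, the sketch below fits a line and a parabola to the weight/lifted table and computes r² for each. It assumes Python with NumPy in place of a graphing calculator; the helper `r_squared` and all variable names are illustrative and not part of the worksheet.

```python
import numpy as np

# Weight limit (x) and amount lifted (y), in pounds, from the table in #2.
weight = np.array([119, 130, 141, 154, 167.5, 183, 200.5, 218, 238])
lifted = np.array([633, 678, 739, 787, 809, 864, 886, 926, 948])


def r_squared(y, y_hat):
    """Fraction of the variation in y accounted for by the fitted values."""
    ss_res = np.sum((y - y_hat) ** 2)        # variation left unexplained
    ss_tot = np.sum((y - np.mean(y)) ** 2)   # total variation in y
    return 1 - ss_res / ss_tot


# np.polyfit returns coefficients from the highest power down.
linear = np.polyfit(weight, lifted, 1)       # [slope, intercept]
quadratic = np.polyfit(weight, lifted, 2)    # [a, b, c] for ax^2 + bx + c

r2_linear = r_squared(lifted, np.polyval(linear, weight))
r2_quadratic = r_squared(lifted, np.polyval(quadratic, weight))

print("linear:   ", np.round(linear, 3), "r^2 =", round(r2_linear, 3))
print("quadratic:", np.round(quadratic, 3), "r^2 =", round(r2_quadratic, 3))
```

The model whose r² is closer to 1 accounts for more of the variation, which is the comparison part e asks for.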

2. The table below gives the weight limit and the amount lifted, both in pounds, for nine weightlifters.

|Weight limit (lb) |Name |Country |Amount lifted (lb) |
|119 |Halil Mutlu |Turkey |633 |
|130 |Tang Ningsheng |China |678 |
|141 |Naim Sulemanoglu |Turkey |739 |
|154 |Zhan Xugang |China |787 |
|167.5 |Pablo Lara |Cuba |809 |
|183 |Pyrros Dimas |Greece |864 |
|200.5 |Alexi Petrov |Russia |886 |
|218 |Kakhi Kakhiasvili |Greece |926 |
|238 |Timur Taimazov |Russia |948 |

Source: Sports Illustrated Almanac, 1997

a. Enter the data into your calculator. Label your x-list “weight” and your y-list “lifted”. Make a scatterplot of the data.

b. What type of function do the points seem to represent? linear

c. Calculate a linear regression equation.

i. Write the equation of the model (round to the nearest thousandth). y = 2.617x + 356.848

ii. What is the value of r²? 0.952

iii. The r² statistic describes how much of the variation in one variable can be accounted for by this straight-line relationship with another variable, with r² = 1 meaning 100%. How can you interpret the r² value for this data? About 95% of the variation can be accounted for with this model.

d. Now calculate a quadratic regression.

i. Write the equation of the model (round to the nearest thousandth). y = -0.016x² + 8.412x – 132.984

ii. What is the value of r²? 0.994

iii. The r² statistic describes how much of the variation in one variable can be accounted for by this quadratic relationship with another variable, with r² = 1 meaning 100%. How can you interpret the r² value for this data? About 99% of the variation can be accounted for with this model.

e. Which model is a better fit for this data? Why? The quadratic model; its r² value is closer to 1.

f. Superimpose the best regression model onto the scatterplot. Are the points close to the line/curve? Based on what you found in part iii, is this surprising?

The points are close to the curve. This is not surprising, as the r² value is close to 1.

g. Use this equation to predict the amount lifted for a person who weighs 35 pounds and a person who weighs 400 pounds. Do you think your model is a helpful predictor outside the range of weight limits given? Why or why not?

f(35) = 141.44 pounds; f(400) = 619.86 pounds. The model is not a good predictor outside the range of weight limits given.
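To see why predictions outside the data range deserve suspicion, a sketch like the following refits the quadratic model and evaluates it at 35 and 400 pounds, well outside the 119 to 238 pound range of the table. It assumes Python with NumPy; the variable names are illustrative only.

```python
import numpy as np

# Weight limit (x) and amount lifted (y), in pounds, from the table.
weight = np.array([119, 130, 141, 154, 167.5, 183, 200.5, 218, 238])
lifted = np.array([633, 678, 739, 787, 809, 864, 886, 926, 948])

# Quadratic model fitted to the table; poly1d makes it callable like f(x).
model = np.poly1d(np.polyfit(weight, lifted, 2))

for w in (35, 400):
    inside = weight.min() <= w <= weight.max()
    label = "within" if inside else "outside"
    print(f"f({w}) = {model(w):.2f} pounds ({label} the data range)")

# If the leading coefficient a is negative, the parabola opens downward:
# it peaks at x = -b / (2a) and decreases after that, so beyond the vertex
# the model predicts that heavier lifters lift less.
a, b, c = model.coeffs
vertex = -b / (2 * a)
print(f"a = {a:.5f}, vertex at a weight of about {vertex:.0f} pounds")
```

The fitted curve follows the nine data points closely, but nothing in the data constrains its shape far from them, which is why the two extrapolated values are not trustworthy.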

3. The population per square mile in the United States has changed dramatically over a period of years. The table below shows the number of people per square mile (or population density) of the United States for several years between 1790 and 2000:

|Year |1790 |1800 |1810 |1820 |1830 |1840 |1850 |1860 |1870 |1880 |1890 |
|People per square mile |4.5 |6.1 |4.3 |5.5 |7.4 |9.8 |7.9 |10.6 |10.9 |14.2 |17.8 |

|Year |1900 |1910 |1920 |1930 |1940 |1950 |1960 |1970 |1980 |1990 |2000 |
|People per square mile |21.5 |26.0 |29.9 |34.7 |37.2 |42.6 |50.6 |57.5 |64.0 |70.3 |80.0 |

Source: Northeast-Midwest Institute

a. Enter the data into your calculator. Label your x-list “year” and your y-list “density”. Make a scatterplot of the data.

b. Calculate a regression of your choice.

i. What regression did you choose?

ii. Write the equation of this model (round to the nearest thousandth).

iii. What is the r² value? Interpret this value.

c. Calculate a different regression.

i. Which regression did you choose this time?

ii. Write the equation of this model (round to the nearest thousandth).

iii. What is the r² value? Interpret this value.

d. If you feel you need to calculate another regression, do so. Superimpose the best regression model onto the scatterplot. Write the best model below. Use this equation to predict the population density of the U.S. in 2030 and to predict the DATE when the population density of the U.S. will reach 200 people per square mile.

Using the Cubic Model:

y = (2.817 × 10⁻⁶)x³ – 0.014x² + 22.830x – 12332.887

f(2030) = 107.9 people per square mile

According to this model, the U.S. will reach a population density of 200 people per square mile around October 24, 2101.
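One way to carry out part d without a calculator is sketched below, assuming Python with NumPy. It fits a cubic to the table and uses years since 1790 as x. That rescaling is an assumption made here to keep the arithmetic well conditioned, so the coefficients differ from those of a fit on the raw year, although the predictions should agree. All variable names are illustrative.

```python
import numpy as np

years = np.arange(1790, 2001, 10)     # 1790, 1800, ..., 2000
density = np.array([4.5, 6.1, 4.3, 5.5, 7.4, 9.8, 7.9, 10.6, 10.9, 14.2, 17.8,
                    21.5, 26.0, 29.9, 34.7, 37.2, 42.6, 50.6, 57.5, 64.0, 70.3,
                    80.0])

t = years - 1790                      # years since 1790 (assumed rescaling)
cubic = np.poly1d(np.polyfit(t, density, 3))

# Prediction for 2030: evaluate the model at t = 2030 - 1790.
pred_2030 = cubic(2030 - 1790)
print(f"predicted density in 2030: {pred_2030:.1f} people per square mile")

# Solve cubic(t) = 200 by finding the roots of cubic(t) - 200 and keeping
# the earliest real root that lies after the last data point.
roots = (cubic - 200).roots
future = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > t.max()]
t_200 = min(future)
year_200 = 1790 + t_200
print(f"density reaches 200 in about {year_200:.1f}")
```

The fractional part of the solved year can be turned into an approximate calendar date. Both results are extrapolations beyond the data, so they are only as reliable as the assumption that the cubic trend continues.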

-----------------------

Answers here will vary:

Linear Regression:

r² = 0.894

y = 0.344x – 623.667

Quadratic Regression:

R² = 0.997

y = 0.002x² – 7.499x + 6798.699

Cubic Regression:

R² = 0.998

y = (2.817 × 10⁻⁶)x³ – 0.014x² + 22.830x – 12332.887

Exponential Regression:

R² = 0.992

y = (1.835 × 10⁻¹¹)(1.015)^x
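The four regressions can also be compared side by side with a short script. The sketch below assumes Python with NumPy and measures x in years since 1790, which changes the coefficients but not r². The exponential model is fitted as a straight line through (t, ln y), which is one common approach and an assumption here. Its r² is measured on the ln(y) scale, so it is not directly comparable with the polynomial values. The helper `r_squared` and the variable names are illustrative.

```python
import numpy as np

years = np.arange(1790, 2001, 10)     # 1790, 1800, ..., 2000
density = np.array([4.5, 6.1, 4.3, 5.5, 7.4, 9.8, 7.9, 10.6, 10.9, 14.2, 17.8,
                    21.5, 26.0, 29.9, 34.7, 37.2, 42.6, 50.6, 57.5, 64.0, 70.3,
                    80.0])
t = years - 1790                      # years since 1790 (assumed rescaling)


def r_squared(y, y_hat):
    """Fraction of the variation in y accounted for by the fitted values."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot


# Polynomial models of degree 1, 2, and 3.
r2 = {}
for name, degree in [("linear", 1), ("quadratic", 2), ("cubic", 3)]:
    coeffs = np.polyfit(t, density, degree)
    r2[name] = r_squared(density, np.polyval(coeffs, t))

# Exponential model y = a * b**t, fitted as a line through (t, ln y).
slope, intercept = np.polyfit(t, np.log(density), 1)
a, b = np.exp(intercept), np.exp(slope)
r2["exponential"] = r_squared(np.log(density), intercept + slope * t)

for name, value in r2.items():
    print(f"{name:12s} r^2 = {value:.3f}")
print(f"exponential model: y = {a:.3f} * {b:.4f}**t")
```

Because each polynomial model contains the lower-degree ones as special cases, r² cannot decrease as the degree goes up. A slightly higher r² for the cubic is therefore not, on its own, evidence that it is the better model.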
