4.4 - Scatter Plots, Lines of best fit, Correlation (positive, negative, none, strong, weak)

Curriculum Outcomes:

A7 demonstrate an understanding of and apply proper use of discrete and continuous number systems

C2 model real-world phenomena with linear, quadratic, exponential, and power equations, and linear inequalities

C3 gather data, plot the data using appropriate scales, and demonstrate an understanding of independent and dependent variables and of domain and range

C4 create and analyze scatter plots, using appropriate technology

C5 sketch graphs from words, tables, and collected data

C9 construct and analyze graphs and tables relating two variables

C11 write an inequality to describe a graph

C16 interpret solutions to equations based on context

C17 solve problems using graphing technology

C33 graph by constructing a table of values, by using graphing technology, and, when appropriate, by y-intercept slope method

F4 calculate various statistics, using appropriate technology, analyze and interpret displays, and describe relationships

F6 solve problems by modelling real-world phenomena

F8 determine and apply a line of best fit, using the least-squares method and median-median method, with and without technology, and describe the differences between the two methods

F9 demonstrate an intuitive understanding of correlation

F10 use interpolation, extrapolation, and equations to predict and solve problems

Scatter Plots

When real-world data is plotted, the points often follow a roughly linear pattern: the general direction of the data can be seen, but the points do not all fall on a single line. This type of graph is a scatter plot. A scatter plot is used to investigate the relationship (if one exists) between two sets of data. One quantity is plotted on the x-axis and the other on the y-axis. If a relationship exists between the two sets of data, it will be visible in the pattern of the plotted points.

Example 1: The following graph represents the relationship between the price of lobster per pound and the number of lobsters sold.

From the graph, the connection between the cost of the lobster and the number sold is obvious: as the price decreased the number of lobsters sold increased.

Example 2: The following scatter plot represents the sale of lottery tickets and the temperature.

From the graph, it is clear that there is no relationship between the number of lottery tickets sold and the temperature: the points show no general direction.

Correlation

Correlation refers to the relationship or connection between two sets of data. The correlation between two sets of data can be weak, strong, negative, or positive, or in some cases there can be no correlation. The characteristics of the correlation between two sets of data can be readily seen from the scatter plot.

Comparing the above graphs to the following graphs, where there is no correlation, will make the concept easier to understand.
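Correlation can also be expressed as a number. The sketch below is not part of the original lesson: it computes the Pearson correlation coefficient r, which lies between -1 and 1, and then attaches one of the labels used above. The data values, the function names, and the cut-offs 0.3 and 0.7 are assumptions chosen only for illustration; other texts use other cut-offs.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def describe(r, strong=0.7, weak=0.3):
    """Label r using illustrative cut-offs (assumed here, not a standard)."""
    if abs(r) < weak:
        return "none"
    strength = "strong" if abs(r) >= strong else "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

# Hypothetical data: lobster price per pound and number of lobsters sold
price = [4, 5, 6, 7, 8, 9]
sold = [95, 88, 70, 64, 50, 41]

r = pearson_r(price, sold)
print(round(r, 2), describe(r))
```

A value of r near -1 or 1 corresponds to points that lie close to a line, while a value near 0 corresponds to a plot with no general direction.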

Line of Best Fit

A line of best fit is drawn on a scatter plot so that it passes through or close to as many points as possible and shows the general direction of the data. When constructing the line of best fit, it is also important to keep approximately an equal number of points above and below the line.
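The "equal number of points above and below" check can be done by counting. The following is a minimal sketch, assuming a candidate line written as y = slope * x + intercept; the data points and both candidate lines are hypothetical.

```python
def points_above_below(points, slope, intercept):
    """Count the points above, below, and on the line y = slope*x + intercept."""
    above = below = on = 0
    for x, y in points:
        line_y = slope * x + intercept
        if y > line_y:
            above += 1
        elif y < line_y:
            below += 1
        else:
            on += 1
    return above, below, on

# Hypothetical data points
points = [(1, 2), (2, 5), (3, 5), (4, 8), (5, 9), (6, 13)]

# Candidate 1: y = 2x follows the data with a balanced split
balanced = points_above_below(points, slope=2, intercept=0)
# Candidate 2: y = 2x + 3 has the right direction but sits too high
too_high = points_above_below(points, slope=2, intercept=3)

print(balanced)  # (2, 2, 2)
print(too_high)  # (0, 6, 0)
```

The second line is rejected for the same reason given in several of the answers: too many points fall on one side of it.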

Example 3: For each of the following scatter plots, choose the line of best fit and justify your choice.

Example 4: You decide to save all the pennies you accumulate over a span of 12 weeks. Each day you place your pennies in a jar, and at the end of each week you count them and record the total. Here are the results you recorded and graphed.

|# of Weeks |1 |

|Month |Cost |
|1 |$135,890 |
|2 |$143,000 |
|3 |$131,790 |
|4 |$137,300 |
|5 |$134,130 |
|6 |$138,100 |
|7 |$136,900 |
|8 |$146,120 |
|9 |$141,890 |
|10 |$137,230 |
|11 |$144,000 |
|12 |$140,290 |

Solution:

Step 1: Open a blank spreadsheet in EXCEL.

Step 2: Enter the data as shown in the table (original) and sort it from least to greatest amount (sorted).

Step 3: Highlight the data without the headings and choose Insert / Chart from the pull down menu.

(Highlighted area in blue)

Step 4: Choose the XY Scatter (Scatter plot) option and the first Chart sub-type.

Step 5: Continue through the Chart Wizard selecting Next after every option that is defaulted, then place the chart as an object in your sheet (default). The plot should appear with your table of data when you select finish.

Step 6: Adjust your chart to the best position on your page (so you can see all of it and the data). A Chart Toolbar may also be in the way; move it to a convenient area or close it.

Step 7: Select the outside border of your chart/scatterplot to include all parts. Under the pull down menu labelled Chart, select Add Trendline.

Step 8: Choose the Linear option and click OK. A trendline should appear on the scatterplot. It is the line of best fit.

Step 9: To determine the formula for this line and predict next month’s spending costs, select the cell directly below #12 in your EXCEL data table. Add the number 13 to represent next month’s cost. Select the cell to its right which should be empty.

Step 10: Go to Insert and choose the Function option.

Step 11: Click and notice that an equal (=) sign will appear in the highlighted cell and in the formula bar at the top. Select TREND and press OK.

Step 12: Highlight the costs in your table (months 1 – 12) and that information will be placed automatically in the Known y’s box. Click on the New x’s box and type in the number 13. A formula will appear in the blank cell and in the formula bar. When you press OK, an amount will appear in the cell next to the number 13 in the table. That number is the predicted cost of running the company in month 13. Round it off to match the costs in that column.

In the Chart Wizard (Step 5 above), you can name your scatterplot, label the “x” and “y” axes, and change the look of many aspects of your chart or graph. Experiment with the changes to achieve the best presentation of your data.
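For readers working outside EXCEL, the calculation in Steps 9 to 12 can be sketched in Python. This is an illustration under stated assumptions, not EXCEL's own code: the helper name trend and its argument order are hypothetical, and it fits a least-squares line by hand. As with TREND when the Known x's box is left empty, the x-values default to 1, 2, 3, and so on.

```python
def trend(known_ys, new_x, known_xs=None):
    """Predict y at new_x from a least-squares line (hypothetical helper)."""
    if known_xs is None:
        # Assume x-values 1, 2, 3, ... when none are supplied
        known_xs = list(range(1, len(known_ys) + 1))
    n = len(known_ys)
    mean_x = sum(known_xs) / n
    mean_y = sum(known_ys) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(known_xs, known_ys))
    sxx = sum((x - mean_x) ** 2 for x in known_xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope * new_x + intercept

# Monthly costs from the example, sorted from least to greatest (Step 2)
sorted_costs = [131790, 134130, 135890, 136900, 137230, 137300,
                138100, 140290, 141890, 143000, 144000, 146120]

prediction = trend(sorted_costs, 13)
print(f"Predicted cost for month 13: ${prediction:,.0f}")
```

Because the costs were sorted before fitting, the line rises steadily and the month 13 prediction lands above the largest recorded cost. Passing the costs in their original month order gives a different line and a different prediction.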

Exercise:

1. The information in the table below shows the number of goals scored by the Toronto Maple Leafs over ten years from 1990 – 1999.

a. Determine how many goals they will probably score in the year 2000 if their goal scoring trend continues. Use an EXCEL spreadsheet.

b. Determine the trendline without sorting the data to check your answer. Does your number (answer to a.) fall on this line?

c. How does this number compare to the average (mean) number of goals scored? Explain the difference.

|YEAR |GOALS SCORED |
|1990 |254 |
|1991 |278 |
|1992 |206 |
|1993 |244 |
|1994 |232 |
|1995 |256 |
|1996 |288 |
|1997 |239 |
|1998 |202 |
|1999 |248 |
|2000 |? |

Answers:

Lines of Best Fit

1. a. No, the line drawn is not a line of best fit because too many points are below the line.

b. No, the line drawn is not a line of best fit because too many points are below the line.

c. Yes, this is the line of best fit because the same number of points are above and below the line and it shows the general direction of the data.

d. No, this is not a line of best fit because it does not show the general direction.

2. a.

b.

c.

3. a) b)

c) The fuel consumption of a car travelling at a speed of 72 km/h is approximately 10 L.

d) The speed of a car that has a fuel consumption of 12 km/L is approximately 85 km/h.

4. a) b)

c) The correlation is strong and positive.

d) The mark would be about 23.

e) The mark would be about 43.

Answers:

1. a. 236 goals

b.

Yes, it does.

c. 244.7, or about 245 goals. The mean is higher than the trendline prediction. The mean treats all ten seasons alike and ignores their order, while the trendline takes into account the order of the years and the slight downward direction of the data, so its prediction for 2000 falls below the mean.

Note: In the scatter plot the scale was changed on the “x” and “y” axes and titles were added to clarify information.
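The answers to 1 a. and 1 c. can be checked without a spreadsheet. The sketch below fits a least-squares line to the unsorted table (years as x, goals as y) and compares its prediction for 2000 with the mean; the function and variable names are illustrative.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Data from the exercise table, in year order (not sorted by goals)
years = list(range(1990, 2000))
goals = [254, 278, 206, 244, 232, 256, 288, 239, 202, 248]

slope, intercept = fit_line(years, goals)
predicted_2000 = slope * 2000 + intercept
mean_goals = sum(goals) / len(goals)

print(round(predicted_2000))  # 236
print(round(mean_goals, 1))   # 244.7
```

The slope is slightly negative (about -1.6 goals per year), which is why the prediction for 2000 sits below the ten-year mean.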

-----------------------

Figure labels and answer notes recovered from the graphs (the scatter plots show candidate lines labelled Line A to Line F):

Line A → shows the general direction of the data and has the same number of points above and below the line.

Line B → does not show a general direction of the data.

Both → These lines both show the general direction of the data and have an equal number of points above and below the line.

Neither → Line F does not show a general direction of the data. Line E has too many points below it.

|A |B |
|Month |Cost |
|1 |$131,790 |
|2 |$134,130 |
|3 |$135,890 |
|4 |$136,900 |
|5 |$137,230 |
|6 |$137,300 |
|7 |$138,100 |
|8 |$140,290 |
|9 |$141,890 |
|10 |$143,000 |
|11 |$144,000 |
|12 |$146,120 |
