Exercise 2F



4A – SCATTERPLOTS

Bivariate data result from measurements being made on each of two variables for a given set of items.

Bivariate data can be graphed on a scatterplot (or scattergraph) as shown at left. Each data point is represented by a single visible point on the graph.

When drawing a scatterplot, we need to choose the correct variable to assign to each of the axes. The convention is to place the independent variable on the x-axis and the dependent variable on the y-axis. The independent variable in an experiment or investigation is the variable that is deliberately controlled or adjusted by the investigator. The dependent variable is the variable that responds to changes in the independent variable.

The relationship between two variables is called their correlation.

Example 1
The operators of a casino keep records of the number of people playing a ‘Jackpot’ type game. The table below shows the number of players for different prize amounts.
Draw a scatterplot of the data (no calculator).

Drawing conclusions/causation
When data are graphed, we can often estimate by eye (rather than measure) the type of correlation involved. Our ability to make these qualitative judgements can be seen from the following examples, which summarise the different types of correlation that might appear in a scatterplot.

Example 2
Using the same data as in Example 1:
a) Draw a scatterplot of the data using your CAS calculator.
b) State the type of correlation that the scatterplot shows.
c) Suggest why the plot is not perfectly linear.

2F (Further Maths) Pearson’s Product-Moment Correlation Coefficient (r)
A more precise tool for measuring the correlation between two variables is Pearson’s product-moment correlation coefficient (denoted by the symbol r). It measures the strength of the linear relationship between two variables. The value of r ranges from −1 to 1; that is, −1 ≤ r ≤ 1.

Following is a gallery of scatterplots with the corresponding value of r for each.
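For readers who want to check a value of r away from the CAS calculator, the following is a minimal sketch of the usual deviation formula, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² × Σ(y − ȳ)²). The worksheet itself does not state this formula, the function name pearson_r is made up for this sketch, and the prize/player values are hypothetical rather than taken from the casino table.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's product-moment correlation coefficient r."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either variable is constant
    return sxy / sqrt(sxx * syy)


# Hypothetical data for illustration only (not the worksheet's table)
prize = [1000, 1500, 2000, 2500, 3000]   # independent variable (x)
players = [290, 325, 360, 390, 430]      # dependent variable (y)

r = pearson_r(prize, players)
print(f"r = {r:.3f}")
```

A value of r close to +1 here would indicate a strong positive linear relationship; swapping the order of one list so that it decreases should give a value close to −1.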
Exercise 2F

4C – LINEAR MODELLING

If a linear relationship exists between a pair of variables, it is useful to be able to summarise the relationship in terms of an equation. This equation can then be used to make predictions about the level of one variable given the value of the other.

The process of finding the equation is known as linear modelling. An equation can be found to represent the line which passes through any two points by using two coordinate geometry formulas.

The gradient of the line passing through (x1, y1) and (x2, y2) is given by:
m = (y2 − y1) / (x2 − x1)

The equation of a straight line with gradient m passing through (x1, y1) is given by:
y − y1 = m(x − x1), that is, y = m(x − x1) + y1

Example 1
Find the equation of the line passing through the points (2, 6) and (5, 12).

To find the equation for a scatterplot that consists of many points, we need to fit a straight line through the whole set of points.

The process of fitting a line to a set of points is often referred to as regression. The regression line or trend line (also known as the line of best fit) may be placed on a scatterplot by eye or by using the 3-median method (covered in Exercise 3B). The line of best fit is the straight line which most closely fits the data.

[Figure: Ski Resort Data]

Its equation can then be found by using the method in the previous example, choosing any two points that are on the line.

The y-intercept is the value of y when the value of x is zero, that is, where the line crosses the y-axis. The gradient (slope) of the equation represents the rate of change of variable y with changing x.

Sometimes after drawing a scatterplot it is clear that the points represent a relationship that is not linear. The relationship might be one of the non-linear types shown below.

In such cases it is not appropriate to try to model the data by attempting to fit a straight line through the points and find its equation.
It is similarly inappropriate to attempt to fit a linear model (straight line) through a scatterplot if it shows that there is no correlation between the variables.

Example 2
The following table shows the fare charged by a bus company for journeys of differing length.
a) Represent the data using a scatterplot and place in the trend line by eye.
b) Find an equation which relates fare, F, to distance travelled, d.
c) Explain in words the meaning of the y-intercept and gradient of the line.

Example 3
The table below gives the times (in hours) spent by 8 students studying for a measurement test and the marks (in %) obtained on the test.
a) Draw the scatterplot to represent the data. Use your calculator.
b) Using your calculator, find the equation of the line of best fit. Write your equation in terms of the variables: time spent studying and test mark.

4D – MAKING PREDICTIONS

The equation of the trend line may be used to make predictions about the variables by substituting a value into the equation.

Example 1
It is found that the relationship between the number of people playing a casino Jackpot game and the prize money offered is given by the equation N = 0.07p + 220, where N is the number of people playing and p is the prize money.
a) Find the number of people playing when the prize money is $2500.
b) Find the likely prize on offer if there were 500 people playing.

Using technology:

Alternatively, a prediction could be made from the graph’s trend line.

Example 2
The scatterplots below show the depth of snow and the corresponding number of skiers.
From the graph’s trend line find:
a) the number of skiers when snow depth was 3 m.
b) the depth of snow that would attract about 400 people.

Interpolation and Extrapolation
We use the term interpolation when we make predictions from a graph’s trend line from within the bounds of the original experimental data.
We use the term extrapolation when we make predictions from a graph’s trend line from outside the bounds of the original experimental data.
Data can be interpolated or extrapolated either algebraically or graphically.

Reliability of Results
Results predicted (whether algebraically or graphically) from the trend line of a scatterplot can be considered reliable only if:
- a reasonably large number of points were used to draw the scatterplot,
- a reasonably strong correlation was shown to exist between the variables (the stronger the correlation, the greater the confidence in predictions),
- the predictions were made using interpolation and not extrapolation. Extrapolated results can never be considered reliable, because extrapolation assumes that the relationship holds true for untested values.

3A INTRODUCTION TO REGRESSION (Further Maths)

The process of ‘fitting’ straight lines to bivariate data enables us to analyse relationships between the data and possibly make predictions based on the given data set.

Regression analysis is concerned with finding these straight lines using various methods so that the number of points above and below the line is ‘balanced’.

3A Method of Fitting Lines by Eye
There should be an equal number of points above and below the line.

Example 1:
Fit a straight line to the data in the figure using the equal-number-of-points method.

Exercise 3A
1.
Fit a straight line to the data in the scatterplots using the equal-number-of-points method.

3B Fitting a straight line: the 3-median method

Fitting lines by eye is useful but it is not the most accurate of methods.

We can find the line of best fit in the form of ___________________________________

One method to find the line of best fit is called the 3-median method. This method is as follows:

Step 1. Plot the points on a scatterplot.

Step 2. Divide the points into 3 groups (lower, middle and upper) using vertical divisions.
(a) If the number of points is divisible by 3, divide them into 3 equal groups.
(b) If there is 1 extra point, put the extra point in the middle group.
(c) If there are 2 extra points, put 1 extra point in each of the outer groups.

Step 3. Find the median point of each of the 3 groups and mark each median on the scatterplot (the median of the x-values and the median of the y-values in the group).
(a) The median of the lower group is denoted by (xL, yL).
(b) The median of the middle group is denoted by (xM, yM).
(c) The median of the upper group is denoted by (xU, yU).
Note: Although the x-values are already in ascending order on the scatterplot, the y-values within each group may need re-ordering before you can find the median.

Steps 4 and 5 can be completed using 2 different approaches: graphical or arithmetic.

Graphical approach
Step 4. Place your ruler so that it passes through the lower and upper medians. Move the ruler a third of the way toward the middle group median while maintaining the slope. Hold the ruler there and draw the line.
Step 5. Find the equation of the line (general form y = mx + c). There are two general methods.
(a) Method A: Choose two points which lie on the line and use these to find the gradient of the line and then the equation of the line. Substitute the coordinates of one point and m into the equation to find c.
(b) Method B: If the scale on the axes begins at zero, you can read off the y-intercept of the line and calculate the gradient of the line.
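Steps 2 to 4 above can also be carried out numerically. The following is a minimal sketch, not the worksheet's own procedure: it takes the slope from the lower and upper median points and then slides the line one third of the vertical gap toward the middle median, which mirrors the ruler movement in the graphical approach. The function name three_median_line is made up for this sketch, the data are the eight (x, y) pairs tabulated in Example 3 of section 3C and are reused purely for illustration, and points with equal x-values on a group boundary are not given any special treatment.

```python
from statistics import median


def three_median_line(points):
    """Return (m, c) for the 3-median line y = m*x + c."""
    pts = sorted(points)                      # order the points by x-value
    n = len(pts)
    if n < 3:
        raise ValueError("need at least 3 points")

    # Step 2: group sizes for the lower, middle and upper groups
    base, extra = divmod(n, 3)
    if extra == 1:
        sizes = (base, base + 1, base)        # extra point goes in the middle
    elif extra == 2:
        sizes = (base + 1, base, base + 1)    # one extra in each outer group
    else:
        sizes = (base, base, base)
    lower = pts[:sizes[0]]
    middle = pts[sizes[0]:sizes[0] + sizes[1]]
    upper = pts[sizes[0] + sizes[1]:]

    # Step 3: median point = (median of x-values, median of y-values)
    def median_point(group):
        return (median([x for x, _ in group]), median([y for _, y in group]))

    (xl, yl), (xm, ym), (xu, yu) = (median_point(g) for g in (lower, middle, upper))

    # Step 4: slope through the outer medians, then move the line
    # one third of the way toward the middle median, keeping the slope
    m = (yu - yl) / (xu - xl)
    c_outer = yl - m * xl                     # intercept of line through outer medians
    gap = ym - (m * xm + c_outer)             # vertical distance to middle median
    c = c_outer + gap / 3
    return m, c


xs = [1, 3, 4, 7, 10, 12, 14, 15]
ys = [11, 9, 10, 6, 8, 4, 3, 1]
m, c = three_median_line(list(zip(xs, ys)))
print(f"y = {m:.2f}x + {c:.2f}")
```

With 8 points there are 2 extra points, so the groups contain 3, 2 and 3 points. The result can be compared with the Median–Median line produced by a CAS calculator, although calculators may handle ties differently.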
Arithmetic approach
In the rules below, (xL, yL), (xM, yM) and (xU, yU) are the median points of the lower, middle and upper groups.
Step 4. Calculate the gradient (m) of the line. Use the rule: m = (yU − yL) / (xU − xL)
Step 5. Calculate the y-intercept (c) of the line. Use the rule: c = (1/3)[(yL + yM + yU) − m(xL + xM + xU)]
Thus, the equation of the regression line is y = mx + c.

Example 1: Find the equation of the regression line for the data in the table at right using the 3-median method. Give coefficients correct to 2 decimal places.
Sketch the scatterplot then divide it into 3 groups.
Use the graphical approach to find the equation of the line of best fit.
Use the arithmetic approach to find the equation of the line of best fit.
Find the gradient of the line.
Find the y-intercept.
Find the equation of the line.

CAS CALCULATOR
Fitting a Straight Line Using the 3-Median Method

Example 2
Find the equation of the regression line for the data in the table below using the 3-median method. Give coefficients correct to 2 decimal places.

On a Lists & Spreadsheet page, enter x-values into column A and y-values into column B. Label the columns accordingly.

To draw a scatterplot of the data, add a Data & Statistics page. Tab to each axis to select ‘Click to add variable’. Place x on the horizontal axis and y on the vertical axis.

The graph should appear as shown. If you move the pointer over any point and press Click twice, the coordinates for that point will be displayed.

To fit a regression line, press: MENU > 4: Analyse > 6: Regression > 3: Show Median–Median

Exercise 3B

3C Fitting a straight line: least-squares regression

Another method for finding the equation of a straight line which is fitted to data is known as the method of least-squares regression. It is used when data show a linear relationship and have no obvious outliers.

To understand the underlying theory behind least-squares, consider the regression line shown below.

We wish to minimise the total of the vertical lines, or ‘errors’, in some way.
For example, we could balance the errors above and below the line. This is reasonable, but for sophisticated mathematical reasons it is preferable to minimise the sum of the squares of each of these errors. This is the essential mathematics of least-squares regression.

Choosing Between 3-Median and Least-Squares Regression
The 3-median method should be used in preference to the least-squares regression method if there are clear outliers in the data.

Calculating the least-squares regression line by hand
Summary data needed:
x̄   the mean of the independent variable (x-variable)
ȳ   the mean of the dependent variable (y-variable)
sx  the standard deviation of the independent variable
sy  the standard deviation of the dependent variable
r   Pearson’s product–moment correlation coefficient

Formula to use:
The general form of the least-squares regression line is y = mx + c
where the slope of the regression line is m = r × sy / sx
and the y-intercept of the regression line is c = ȳ − m × x̄

Example 3: For the table below, find the regression line using the least-squares method.
x:  1   3   4   7   10  12  14  15
y:  11  9   10  6   8   4   3   1

To find x̄, ȳ, sx, sy and r from the two sets of data using the CAS calculator, press:
MENU > 4: Statistics > 1: Stat Calculations > 2: Two-Variable Statistics
x̄ = ____________
ȳ = ____________
sx = ___________
sy = ____________
r = ____________
Find the gradient of the regression line.
Find the y-intercept of the regression line.
Find the equation of the regression line.

Example 4:
A study to find a relationship between the height of husbands and the height of their wives revealed the following details.
Mean height of the husbands: 180 cm
Mean height of the wives: 169 cm
Standard deviation of the height of the husbands: 5.3 cm
Standard deviation of the height of the wives: 4.8 cm
Correlation coefficient, r = +0.85
The form of the least-squares regression line is to be:
Height of wife = m × height of husband + c
Which variable is the dependent variable? ______________________________
Calculate the value of m for the regression line (to 2 decimal places).
Calculate the value of c for the regression line (to 2 decimal places).
Use the equation of the regression line to predict the height of a wife whose husband is 195 cm tall (to the nearest cm).

The calculation of the equation of a least-squares regression line is simple using a CAS calculator.

Example 5:
A study shows that the more calls a teenager makes on their mobile phone, the less time they spend on each call. Find the equation of the linear regression line for the number of calls made plotted against call time in minutes using the least-squares method on a CAS calculator. Express coefficients correct to 2 decimal places.

Number of minutes (x):  1   3   4   7   10  12  14  15
Number of calls (y):    11  9   10  6   8   4   3   1

On a Lists & Spreadsheet page, enter the minutes values into column A and the number of calls values into column B. Label the columns accordingly.

To draw a scatterplot of the data in a Data & Statistics page, tab to each axis to select ‘Click to add variable’. Place minutes on the horizontal axis and calls on the vertical axis. The graph will appear as shown.

To fit a least-squares regression line, press: MENU > 4: Analyse > 6: Regression > 1: Show Linear (mx+b)

To find r and r², return to the Lists & Spreadsheet page by pressing Ctrl and then the left arrow. Summary variables are found by pressing: MENU > 4: Statistics > 1: Stat Calculations > 3: Linear Regression (mx+b)

Complete the table as shown below and press OK to display the statistical parameters. Notice that the equation is stored and labelled as function f1.

The regression information is stored in the first available column on the spreadsheet.

Exercise 3C

3E Residual analysis

There are situations where the mere fitting of a regression line to some data is not enough to convince us that the data set is truly linear.
Even if the correlation is close to +1 or −1, it still may not be convincing enough.

The next stage is to analyse the residuals, or deviations, of each data point from the straight line.

A residual is the vertical difference between each data point and the regression line.

When we plot the residual values against the original x-values and the points are randomly scattered above and below zero (the x-axis), the original data are most likely to have a linear relationship.

If the residual plot shows some sort of pattern, then the original data are probably not linear.

Residual Plot
To produce a residual plot, carry out the following steps:

Step 1. Draw up a table as follows.
x                      1   2   3   4   5   6   7   8    9    10
y                      5   6   8   15  24  47  77  112  187  309
ypred
Residual (y − ypred)

Step 2. Find the equation of the least-squares regression line y = mx + b using the graphics calculator.

Step 3. Calculate the predicted y-values (ypred) using the least-squares regression equation. The predicted y-values are the y-values on the regression line. Put these values into the table.

Step 4. Calculate the residuals.
Residual value = y − ypred, where y is the actual data value and ypred is the y-value from the regression line.
Enter these values into the table.
Note: the sum of all the residuals will always be zero (or very close to zero after rounding).

Step 5. Plot the residual values against the original x-values.
If the data points in the residual plot are randomly scattered above and below zero (the x-axis), then the original data will probably be linear.
If the residual plot shows a pattern, then the original data are probably not linear.

Example 8
Use the data below to produce a residual plot and comment on the likely linearity of the data.

Step 1.
x                      1   2   3   4   5   6   7   8    9    10
y                      5   6   8   15  24  47  77  112  187  309
ypred
Residual (y − ypred)

Step 2. Equation of the least-squares regression line:
y = mx + b

Step 3. Calculate the predicted y-values using the equation _________________________________
When x = 1, ypred = ____________ = ____________
When x = 2, ypred = ____________ = ____________
Or use the CAS calculator to get the ypred
values from the regression line by opening a Graphs & Geometry page, entering the equation of the least-squares regression line and pressing Enter. Once you have the graph, press MENU > 5: Trace > 1: Graph Trace. Type in the x-value and the corresponding y-value will appear.

Step 4. Calculate the residuals.
Residual = y − ypred
Residual = ____________ = ____________
Residual = ____________ = ____________
Calculate the rest of the residuals and enter them into the table. Add all the residuals to check that the sum equals zero.

Step 5. Plot the residual values against the original x-values.
[Figure: blank axes for the residual plot, vertical axis labelled Residual]
The residual plot shows ____________

Using a CAS calculator
Find the equation of a least-squares regression line.

Enter the data on a Lists & Spreadsheet page.

To find the values of m and b for the equation y = mx + b, press: MENU > 4: Statistics > 1: Stat Calculations > 3: Linear Regression (mx+b)

To generate the residual values in their own column, move to the shaded cell in column E and press: Ctrl + MENU > 4: Variables > 3: Link To:
Select the list stat6.resid.

Write down all of the residuals displayed in the column. Scroll down for the complete list of values.
Note: The stat number will vary depending on the calculator and previously stored data.

Example 9
Using the same data as in Example 8, plot the residuals and discuss the features of the residual plot.

Generate the list of residuals as demonstrated in Example 8.

On the Data & Statistics page, select x for the x-axis and stat.resid for the y-axis.

To identify whether a pattern exists, it is useful to join the residual points. To do this, press: MENU > 2: Plot Properties > 1: Connect Data Points

Exercise 3E
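The residual-analysis steps in section 3E can be checked away from the calculator. The following is a minimal sketch under stated assumptions: it uses the x and y values from the Example 8 table, takes the slope and intercept from the standard summary-statistic formulas m = r × sy / sx and b = ȳ − m × x̄ (with sample standard deviations), and lists the residuals instead of plotting them. The variable names are chosen for this sketch only.

```python
from statistics import mean, stdev

# Data from the Example 8 table
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [5, 6, 8, 15, 24, 47, 77, 112, 187, 309]

# Summary statistics (sample standard deviations)
x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)

# Pearson's r from the sum of products of deviations
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
r = sxy / ((len(xs) - 1) * s_x * s_y)

# Step 2: least-squares regression line y = m*x + b
m = r * s_y / s_x
b = y_bar - m * x_bar

# Step 3: predicted y-values on the regression line
y_pred = [m * x + b for x in xs]

# Step 4: residual = actual y-value minus predicted y-value
residuals = [y - yp for y, yp in zip(ys, y_pred)]

print(f"y = {m:.2f}x + ({b:.2f}), r = {r:.2f}")
for x, res in zip(xs, residuals):
    print(f"x = {x:2d}  residual = {res:8.2f}")
print("sum of residuals:", round(sum(residuals), 6))
```

For this data set the residuals are positive at the low x-values, negative through the middle and positive again at the high x-values. That is a clear pattern and not a random scatter, so a straight line is unlikely to be a suitable model even though r is fairly high.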