Re-Expressing Data (Introduction)

We now address what to do when the analysis of the residuals establishes that the regression used is not a good one.

Example 7. Some 9th-grade students are studying free-falling objects. They have collected the data in Table 8. Find:

a) the regression line, and graph it with the scatter plot; b) an estimate of the distance traveled by the object in 1 sec; c) how good is this model? (justify your answer); d) the residual plot. What does this plot suggest?

Table 8. Time/distances in experimental free fall of an object

Time (sec)   Distance (cm)   Time (sec)   Distance (cm)
0.16         12              0.57         150.3
0.24         29.7            0.61         182.2
0.25         32.8            0.61         190
0.30         42.9            0.68         220
0.31         44.5            0.71         254.3
0.32         55              0.72         260.9
0.36         63.9            0.83         334.5
0.36         65.2            0.88         376
0.51         125.6           0.89         338.7
0.51         129.5           1            ?

Figure

Solution. The regression line is $y = 502.01x - 106.45$. The following is a graph of this line superimposed on the scatter plot. In 1 sec the object would travel $y(1) \approx 395.6$ cm. The data points seem to be fairly close to the line, so it does look like a decent fit. But how good is it?

Figure

The figure shows the residual plot for these data. We can see that there is a definite pattern. Thus we know that our original fitted line is not a correct model. We must now re-express the data to find the best-fitting curve.
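The line fit and its residuals can be reproduced with a short script. This is a minimal sketch in plain Python, assuming the data of Table 8 with the unknown (1, ?) row left out; the helper name `least_squares` is illustrative and not part of the original notes. The fitted coefficients should come out close to the regression line quoted in the solution.

```python
# Free-fall data from Table 8 (time in s, distance in cm); the (1, ?) row is omitted.
times = [0.16, 0.24, 0.25, 0.30, 0.31, 0.32, 0.36, 0.36, 0.51, 0.51,
         0.57, 0.61, 0.61, 0.68, 0.71, 0.72, 0.83, 0.88, 0.89]
dists = [12, 29.7, 32.8, 42.9, 44.5, 55, 63.9, 65.2, 125.6, 129.5,
         150.3, 182.2, 190, 220, 254.3, 260.9, 334.5, 376, 338.7]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares(times, dists)

# Residual = observed distance minus the distance predicted by the line.
residuals = [y - (slope * x + intercept) for x, y in zip(times, dists)]

print(f"y = {slope:.2f}x + ({intercept:.2f})")
print(f"estimate at 1 sec: {slope * 1 + intercept:.1f} cm")
for x, r in zip(times, residuals):
    print(f"{x:.2f}  {r:+8.2f}")
```

In the printed list the residuals are positive at both ends of the time range and negative in the middle, which is the curved pattern that rules out the straight line.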

This can be done by taking the square root of each $y$ value, transforming the data to the form $(x, \sqrt{y})$.

Table 9. Re-expressed distances in experimental free fall of an object

Time (sec)   √Distance (√cm)   Time (sec)   √Distance (√cm)
0.16         3.4641            0.57         12.26
0.24         5.4498            0.61         13.498
0.25         5.7271            0.61         13.784
0.30         6.5498            0.68         14.832
0.31         6.6708            0.71         15.947
0.32         7.4162            0.72         16.152
0.36         7.9937            0.83         18.289
0.36         8.0747            0.88         19.391
0.51         11.207            0.89         18.404
0.51         11.38

The new line of fit is thus $\sqrt{y} = 21.53x + 0.27$.

Find the scatter plot and the residual plot of the transformed data.

The residuals of the transformed data are scattered randomly about the horizontal axis, which indicates a good fit.

When the relationship is quadratic, it is often the case that the size of the residuals seems to increase as the independent variable increases.

To undo the transformation so that we can determine an equation for a free-falling object, we have to square both sides of the line of fit of the re-expressed data.

Therefore, we obtain the equation:

$(\sqrt{y})^2 = (21.53x + 0.27)^2 \Rightarrow y = 463.54x^2 + 11.63x + 0.07$
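The re-expression and its undoing can be checked numerically. The following is a sketch under the same assumptions as the data in Table 8 (the (1, ?) row is left out); `least_squares` is an illustrative helper, not something defined in the original notes. It fits a line to $(x, \sqrt{y})$ and then squares that line term by term, using $(ax + b)^2 = a^2x^2 + 2abx + b^2$.

```python
import math

# Free-fall data from Table 8 (time in s, distance in cm); the (1, ?) row is omitted.
times = [0.16, 0.24, 0.25, 0.30, 0.31, 0.32, 0.36, 0.36, 0.51, 0.51,
         0.57, 0.61, 0.61, 0.68, 0.71, 0.72, 0.83, 0.88, 0.89]
dists = [12, 29.7, 32.8, 42.9, 44.5, 55, 63.9, 65.2, 125.6, 129.5,
         150.3, 182.2, 190, 220, 254.3, 260.9, 334.5, 376, 338.7]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Re-express: replace each distance by its square root, then fit a line.
roots = [math.sqrt(d) for d in dists]
a, b = least_squares(times, roots)

# Undo: square both sides of sqrt(y) = a*x + b.
quad, lin, const = a * a, 2 * a * b, b * b

# Residuals of the transformed data, for the residual plot.
root_residuals = [r - (a * x + b) for x, r in zip(times, roots)]

print(f"sqrt(y) = {a:.2f}x + ({b:.2f})")
print(f"y = {quad:.2f}x^2 + ({lin:.2f})x + ({const:.2f})")
```

Plotting `root_residuals` against `times` gives the residual plot of the transformed data asked for above.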

Re-Expressing Data (Theory)

First, we transform the data in such a way that it becomes linear. Then we fit a least squares or a median-median line to the transformed data and, if satisfied with the fit, we undo the transformation and change the variables back to their original state.

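Regression Models

Linear: $y = mx + b$, the least squares line for the pairs $(x, y)$.

Logarithmic:
  Re-expression: $(\ln x, y) = (t, y)$
  Reg. line: $y = at + b$
  Undo: $y = a \ln x + b$

Exponential (horizontal asymptote $y = k$):
  Re-expression: $(x, \ln(y - k)) = (x, w)$
  Reg. line: $w = ax + b$
  Undo: $\ln(y - k) = ax + b \Rightarrow y = k + e^{ax + b} \Rightarrow y = k + e^{b}(e^{a})^{x} \Rightarrow y = k + c\,d^{x}$

Power:
  Re-expression: $(\ln x, \ln y) = (t, w)$
  Reg. line: $w = at + b$
  Undo: $\ln y = a \ln x + b \Rightarrow \ln \frac{y}{x^{a}} = b \Rightarrow y = e^{b}x^{a} = c\,x^{a}$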
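The exponential and power recipes above can be sketched in a few lines of Python. The function names `fit_exponential` and `fit_power`, and the noise-free sample data, are hypothetical and chosen only for illustration; the asymptote $k$ is assumed to be known in advance, since the re-expression $\ln(y - k)$ needs it.

```python
import math


def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_exponential(xs, ys, k=0.0):
    """Fit y = k + c * d**x by regressing w = ln(y - k) on x (requires y > k)."""
    a, b = least_squares(xs, [math.log(y - k) for y in ys])
    return math.exp(b), math.exp(a)  # c = e^b, d = e^a


def fit_power(xs, ys):
    """Fit y = c * x**a by regressing w = ln y on t = ln x (requires x, y > 0)."""
    a, b = least_squares([math.log(x) for x in xs], [math.log(y) for y in ys])
    return math.exp(b), a  # c = e^b, exponent a


# Hypothetical noise-free data, so the fits should recover the generating values.
xs = [1, 2, 3, 4, 5]
exp_ys = [2 + 3 * 1.5 ** x for x in xs]  # k = 2, c = 3, d = 1.5
pow_ys = [2 * x ** 1.5 for x in xs]      # c = 2, a = 1.5

c_exp, d_exp = fit_exponential(xs, exp_ys, k=2)
c_pow, a_pow = fit_power(xs, pow_ys)

print(f"exponential: y = 2 + {c_exp:.3f} * {d_exp:.3f}^x")
print(f"power:       y = {c_pow:.3f} * x^{a_pow:.3f}")
```

With real, noisy data the recovered constants are least squares estimates in the transformed scale, so they need not minimize the squared error in the original scale.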

Example 8. (Population Growth) The following table shows how the population per square mile in the United States has changed over a period of years since 1800. What was the population density in 1887? What will the population density be in 2025?

Table 10. Population density in the USA.

Year    People/mi^2
1790    4.5
1800    6.1
1810    4.3
1820    5.5
1830    7.4
1840    9.8
1850    7.9
1860    10.6
1870    10.9
1880    14.2
1890    17.8
1900    21.5
1910    26
1920    29.9
1930    34.7
1940    37.2
1950    42.6
1960    50.6
1970    57.5
1980    64

Solution. Population growth is exponential! Hence we have some idea of how to re-express and transform the data to a linear model.

The ordered pairs $(x, y)$ lie on an exponential curve if and only if the ordered pairs $(x, \ln y)$ lie on a straight line.

Let $x' = x - 1780$ and $w = \ln y$.

The equation of the least squares line is

$w = 0.015x' + 1.24$

or $\ln y = 0.015x' + 1.24$.

To find a model for the original data $(x, y)$ we exponentiate and simplify as follows:

$y = e^{0.015x' + 1.24} \Rightarrow y = e^{0.015x'} e^{1.24} \Rightarrow y = 3.46e^{0.015x'}$

where $x'$ represents the number of years since 1780, or equivalently

$y = 3.46e^{0.015(x - 1780)}$.

Hence, $y = 3.46e^{0.015(1887 - 1780)} \approx 17.22$ people per square mile.

Similarly, $y = 3.46e^{0.015(2025 - 1780)} \approx 136.49$ people per square mile.
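The population fit can be reproduced as follows. This is a sketch in plain Python using the values of Table 10; the names `least_squares`, `predict`, and `predict_rounded` are illustrative and not from the original notes. Because the exponent magnifies small changes in the coefficients, the unrounded fit gives somewhat different predictions from the rounded model $y = 3.46e^{0.015(x - 1780)}$, so both are printed.

```python
import math

# Table 10: census years 1790..1980 and people per square mile.
years = list(range(1790, 1990, 10))
density = [4.5, 6.1, 4.3, 5.5, 7.4, 9.8, 7.9, 10.6, 10.9, 14.2,
           17.8, 21.5, 26, 29.9, 34.7, 37.2, 42.6, 50.6, 57.5, 64]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Re-express: x' = year - 1780 and w = ln(density), then fit w = a*x' + b.
shifted = [year - 1780 for year in years]
logs = [math.log(d) for d in density]
a, b = least_squares(shifted, logs)


def predict(year):
    """Undo the transformation with the unrounded coefficients."""
    return math.exp(b) * math.exp(a * (year - 1780))


def predict_rounded(year):
    """The model from the text, with coefficients rounded as printed there."""
    return 3.46 * math.exp(0.015 * (year - 1780))


print(f"ln y = {a:.4f}x' + {b:.4f}")
for year in (1887, 2025):
    print(year, f"unrounded: {predict(year):.2f}", f"rounded: {predict_rounded(year):.2f}")
```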

Let $L_1$ and $L_2$ be the two lists of collected data, with the independent variable $x_o$ data in $L_1$ and the dependent variable $y_o$ data in $L_2$. Assume that we have a quadratic relationship, i.e., $y_o = x_o^2$. In order to linearize the data we have two choices:

1. Let $x_n = x_o$ and $y_n = \sqrt{y_o}$, and find LSL$(L_1, \sqrt{L_2})$, obtaining $y_n = ax_n + b$. Then replacing $x_n$ and $y_n$ we get $\sqrt{y_o} = ax_o + b \Rightarrow y_o = (ax_o + b)^2 = cx_o^2 + dx_o + e$.

2. Let $x_n = x_o^2$ and $y_n = y_o$, and find LSL$(L_1^2, L_2)$, obtaining $y_n = ax_n + b$. Then replacing $x_n$ and $y_n$ we get $y_o = ax_o^2 + b$.

We see that the first approach produces a full quadratic with an $x$-term, which in general will be more precise.
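The two choices can be tried side by side on the free-fall data of Table 8. This is only a sketch: the helper names are illustrative, the (1, ?) row is left out, and comparing the sum of squared residuals in the original units is one reasonable way to judge the fits, not a criterion taken from the notes.

```python
import math

# Free-fall data from Table 8 (time in s, distance in cm); the (1, ?) row is omitted.
times = [0.16, 0.24, 0.25, 0.30, 0.31, 0.32, 0.36, 0.36, 0.51, 0.51,
         0.57, 0.61, 0.61, 0.68, 0.71, 0.72, 0.83, 0.88, 0.89]
dists = [12, 29.7, 32.8, 42.9, 44.5, 55, 63.9, 65.2, 125.6, 129.5,
         150.3, 182.2, 190, 220, 254.3, 260.9, 334.5, 376, 338.7]


def least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line through (xs, ys)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Choice 1: keep x, take sqrt(y); undo by squaring the fitted line.
a1, b1 = least_squares(times, [math.sqrt(y) for y in dists])


def model1(x):
    return (a1 * x + b1) ** 2  # full quadratic: a1^2 x^2 + 2 a1 b1 x + b1^2


# Choice 2: square x, keep y; the fitted line is already y = a2*x^2 + b2.
a2, b2 = least_squares([x * x for x in times], dists)


def model2(x):
    return a2 * x * x + b2  # quadratic with no x-term


def sse(model):
    """Sum of squared residuals in the original units (cm^2)."""
    return sum((y - model(x)) ** 2 for x, y in zip(times, dists))


print(f"choice 1: y = ({a1:.2f}x + {b1:.2f})^2, SSE = {sse(model1):.1f}")
print(f"choice 2: y = {a2:.2f}x^2 + {b2:.2f},   SSE = {sse(model2):.1f}")
```

Which choice gives the smaller error depends on the data set, so it is worth running both before settling on a model.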
