Excel Lab 3: Linear Regression

Excel Lab 3: Linear Regression

When working with data sets, the term ¡°regression¡± (also called ¡°least squares fitting¡±)

refers to determining the formula of a function that relates the data quantities in a way

that is consistent with the given data. In this assignment, you will learn how to use some

of Excel¡¯s built-in commands to perform regression for a linear model. Open a new Excel

workbook, and perform the steps in this tutorial as we go.

Example:

(Fitting a linear function model to a data set.)

Here is a data set resulting from the following experiment: We start with six trees of

different heights. We measure and record the height of each tree, then measure and

record the heights again after one year.

Tree

1

2

3

4

5

6

Initial heigh

(feet)

x

23.1

18.7

20.6

16.0

32.5

19.8

Final height

(feet)

y

24.1

19.8

21.5

17.0

33.6

20.6

We begin by entering this data set into an Excel spreadsheet, and producing a scatter plot

of the data. (NOTE: In this table, the first column is NOT data ¨C it is just a bookkeeping

index.) Here is what we get:

The points look fairly linear, so we try to find a regression line, i.e. a function of the

form

y=mx+b,

where we have to figure out some ¡°good¡± values of m and b. As noted in class, Excel

has two commands, logically called SLOPE and INTERCEPT, that we can use to

compute values for the slope, m, and the intercept, b, for our data set.

Using the SLOPE command:

To compute the slope m of the regression line, we do the following in Excel:

1. Click on an empty cell, where you want the slope value to be, and label an

adjacent cell with the word ¡°slope¡±. We have this:

2. In the selected cell, type

=SLOPE(

3. At this point, you should see a little ¡°hint¡± window (just below our selected cell in

the image below).

This ¡°hint¡± window tells us that the SLOPE command takes two arguments. The first

is the list of dependent variable values, and the second is the list of independent

variable values. (Note that this is backwards from the ¡°usual¡± thinking, where

independent comes first, and dependent comes second.)

4. The easiest way to finish inputting the SLOPE command is to first use the cursor

and the left mouse button to select the column of dep. var. values. When you

make the selection this way, the formula fills in automatically, like this:

Notice how the blue ¡°E5:E11¡± (which is how Excel describes a range of values, using

a colon) has been added to the SLOPE formula.

5. Next, type a comma after the blue ¡°E5:E11¡±. Then, repeat Step 4, this time with

the column of indep. var. values. This gives us this:

Again, this automatically enters the correct range of values (¡°D5:D11¡±, in this case) into

the SLOPE formula.

6. The last thing to do is type a closing parenthesis after the green ¡°D5:D11¡±, then

press the ¡°Enter¡± key. You should now see the value of the regression line in the

box where the formula was entered.

Using the INTERCEPT command:

To compute the intercept b of the regression line, we do exactly the same steps that we

did for SLOPE, only substitute INTERCEPT for SLOPE in the formula. When finished,

the formula will read

=INTERCEPT(E5:E11,D5:D11).

(NOTE: You can always view the formula for a particular cell (assuming it has one) by

just double-clicking on the cell.)

After doing this, our Excel spreadsheet looks like this:

So, for this data set, the regression line (or ¡°best fit linear model¡± or ¡°least squares line¡±)

is given by the formula

y=1.01x + 0.81

Comparing values of the regression line model to the data:

We now add a new column to our data set, so we can list the values predicted by the

regression line model. After labeling the column, go to the cell next to the first data point,

and enter, in Excel cell-address form, the above formula for the regression line. For the

first data row (Row 6 in this case), the x-value is in cell D6, the slope-value is in cell

C15, and the intercept-value is in cell D15. Thus, the formula for the regression line is

written

=C$15*D6+D$15

So, this is what you enter in the cell. Then, use the ¡°select and drag¡± method to extend the

calculation to the other data points. Here is what you should see:

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download