ED 793 LAB #1



ED 795 LAB #

Outline

1. Dummy Coding

2. Interactions

3. Plots in Excel

Including dummy variables in a model

If your nominal variable has been recoded into n dummy variables, include only (n-1) of the dummy variables in your model. The dummy variable that is left out is called the “reference group”. The coefficient for each dummy variable included in the model is always interpreted as the difference between that group and the “reference group”.

For example: If I recode a nominal political orientation variable into 3 dummy variables (liberal, middle of the road, and conservative), I would include only 2 of the 3 in my model. If I leave out conservatives, then the liberal coefficient is interpreted as the difference between the liberals and the conservatives. In other words, you are looking at the difference between 2 predicted means, holding all other variables constant.
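If it helps to see the (n-1) rule outside of SPSS, here is a minimal Python sketch. It assumes the political view item is scored 1 to 5, with 1-2 = conservative, 3 = middle of the road, and 4-5 = liberal (the grouping used in this lab's recodes); the six scores and the helper name dummy_code are made up for illustration.

```python
# Hypothetical scores on the 5-point political view item (not real data).
polivw86 = [1, 2, 3, 4, 5, 3]

def dummy_code(value):
    """Return (conserv, middle, liberal) 0/1 indicators for one score."""
    conserv = 1 if value in (1, 2) else 0
    middle = 1 if value == 3 else 0
    liberal = 1 if value in (4, 5) else 0
    return conserv, middle, liberal

rows = [dummy_code(v) for v in polivw86]

# Every respondent falls in exactly one group, so the three dummies always
# sum to 1. That redundancy is why only n-1 of them can enter the model.
assert all(sum(r) == 1 for r in rows)

# Keep liberal and middle; conservatives (both predictors 0) are the reference group.
predictors = [(liberal, middle) for conserv, middle, liberal in rows]
print(predictors)
```

A respondent with (0, 0) on the two retained dummies is a conservative, which is why each coefficient is read as a difference from that group.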

Example Syntax:

Recode polivw86 (1,2=1) (else=0) into conserv.

Recode polivw86 (3=1) (else=0) into middle.

Recode polivw86 (4,5=1) (else=0) into liberal.

compute promrac=goal8617.

recode race (1=0) (else=1) into racen.

frequencies var= liberal middle conserv.

**Include only two of the three (liberal, middle, conserv) in the model.

regression var=sex86 racen income liberal middle promrac

/dependent promrac

/method enter sex86 racen income

/method enter liberal middle.

Example Output:

Multiple R .27621

R Square .07629

Adjusted R Square .07512

Standard Error .82205

Analysis of Variance

               DF   Sum of Squares   Mean Square
Regression      5        219.69045      43.93809
Residual     3936       2659.80930        .67576

F = 65.01982 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable            B      SE B      Beta        T   Sig T
SEX86         .074555   .026529   .043616    2.810   .0050
RACEN         .477042   .034157   .215660   13.966   .0000
INCOME       -.002520   .004154  -.009396    -.606   .5442
LIBERAL       .324970   .038077   .162793    8.535   .0000
MIDDLE        .036793   .032845   .021505    1.120   .2627
(Constant)   1.853578   .059049             31.390   .0000
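The summary statistics at the top of this output all come from the Analysis of Variance table. As a rough check, the Python sketch below recomputes them from the sums of squares and degrees of freedom printed above, using the standard formulas; the variable names are mine, and the results should agree with the listing only up to rounding.

```python
import math

# Values copied from the Analysis of Variance table in the output above.
ss_reg, df_reg = 219.69045, 5
ss_res, df_res = 2659.80930, 3936

ss_total = ss_reg + ss_res
df_total = df_reg + df_res            # n - 1

r_square = ss_reg / ss_total
adj_r_square = 1 - (1 - r_square) * df_total / df_res
std_error = math.sqrt(ss_res / df_res)          # square root of the residual mean square
f_stat = (ss_reg / df_reg) / (ss_res / df_res)  # regression MS over residual MS

print(f"R Square          {r_square:.5f}")
print(f"Adjusted R Square {adj_r_square:.5f}")
print(f"Standard Error    {std_error:.5f}")
print(f"F                 {f_stat:.3f}")
```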

Interpretation:

Holding all other variables constant, the mean predicted value for promote racial understanding is .32 higher for liberals than for conservatives (B = .325, p < .001). The mean predicted value for middles is not significantly different from that for conservatives (B = .037, p = .26).

NOTE: The difference between middle and liberal is just the difference between the two coefficients: $\beta_{lib} - \beta_{mid} = (\text{Ypred}_{lib} - \text{Ypred}_{cons}) - (\text{Ypred}_{mid} - \text{Ypred}_{cons}) = \text{Ypred}_{lib} - \text{Ypred}_{mid}$.

You cannot, however, test the significance of this difference without rerunning the analysis with a different reference group (for example, leaving out middle instead of conserv), unless you want to compute the test statistic by hand.
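To see the point of the NOTE with the numbers from the output, here is a small Python sketch. The coefficients are copied from the first regression output; the covariate profile (a respondent with sex86 = 1, racen = 0, income = 10) is hypothetical and only there to show that the gap does not depend on it. It gives the size of the liberal versus middle difference, not its significance.

```python
# Coefficients copied from the first regression output
# (conservatives are the reference group).
coef = {
    "const": 1.853578, "sex86": 0.074555, "racen": 0.477042,
    "income": -0.002520, "liberal": 0.324970, "middle": 0.036793,
}

def ypred(sex86, racen, income, liberal, middle):
    """Predicted promrac score for one combination of predictor values."""
    return (coef["const"] + coef["sex86"] * sex86 + coef["racen"] * racen
            + coef["income"] * income + coef["liberal"] * liberal
            + coef["middle"] * middle)

# Difference between the two dummy coefficients.
diff = coef["liberal"] - coef["middle"]

# Hypothetical covariate profile; any other values give the same gap.
profile = dict(sex86=1, racen=0, income=10)
gap = ypred(liberal=1, middle=0, **profile) - ypred(liberal=0, middle=1, **profile)

print(round(diff, 6), round(gap, 6))
```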

Interactions

Example Syntax:

compute sexrace=sex86*racen.

compute racelib=racen*liberal.

compute sexlib=sex86*liberal.

regression var=sex86 racen income liberal middle promrac sexrace racelib sexlib

/dependent promrac

/method enter sex86 racen income

/method enter liberal middle

/method enter sexrace racelib sexlib.

Example Output:

Multiple R .28274

R Square .07994

Adjusted R Square .07807

Standard Error .82074

Analysis of Variance

               DF   Sum of Squares   Mean Square
Regression      8        230.19018      28.77377
Residual     3933       2649.30957        .67361

F = 42.71575 Signif F = .0000

------------------ Variables in the Equation ------------------

Variable            B      SE B      Beta        T   Sig T
SEX86         .055029   .032713   .032193    1.682   .0926
RACEN         .708939   .110463   .320495    6.418   .0000
INCOME       -.002501   .004148  -.009328    -.603   .5466
LIBERAL       .025126   .098925   .012587     .254   .7995
MIDDLE        .042901   .032921   .025075    1.303   .1926
SEXRACE      -.160566   .068067  -.118567   -2.359   .0184
RACELIB       .037197   .075994   .009730     .489   .6245
SEXLIB        .197655   .061505   .160431    3.214   .0013
(Constant)   1.880404   .065075             28.896   .0000

Interpretation:

The interaction between race and liberal is not significant, meaning the effect of race on promoting racial understanding DOES NOT depend on whether you hold a liberal political view, and vice versa: the effect of a liberal political view on promoting racial understanding DOES NOT depend on your race.

HOWEVER, the interaction between sex and race is significant, meaning the effect of sex on promoting racial understanding DOES depend on your race, and vice versa: the effect of race on promoting racial understanding DEPENDS on your sex. Likewise, the interaction between sex and liberal is significant, so the effect of sex on promoting racial understanding DOES depend on whether you hold a liberal political view, and vice versa.
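The t tests above look at one interaction at a time. If you also want to know whether the three interaction terms improve the model as a block, you can compare the two Analysis of Variance tables by hand. The Python sketch below does this with the standard F-change formula for nested models; it assumes both regressions were run on the same cases (the total sums of squares match), and the variable names are mine.

```python
# Regression sums of squares and df copied from the two outputs in this lab.
ss_reg_main, df_main = 219.69045, 5          # main effects only
ss_reg_full, df_full = 230.19018, 8          # main effects plus interactions
ss_res_full, df_res_full = 2649.30957, 3933  # residual from the full model

df_change = df_full - df_main                # 3 interaction terms added
ss_change = ss_reg_full - ss_reg_main

# F change: added regression sum of squares per added term,
# divided by the residual mean square of the full model.
f_change = (ss_change / df_change) / (ss_res_full / df_res_full)

# Increase in R Square from adding the interaction block.
r2_change = ss_change / (ss_reg_full + ss_res_full)

print(f"F change = {f_change:.2f} on {df_change} and {df_res_full} df")
print(f"R Square change = {r2_change:.5f}")
```

Compare the F change with a critical value from an F table (roughly 2.6 at the .05 level for 3 and several thousand degrees of freedom).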

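Now let’s figure this all out. Note that to simplify the calculations, I included only the significant predictors and the main effects of the variables whose interactions were significant. In the calculations below, sex86 is coded 1 = male and 2 = female, and racen is coded 0 = white and 1 = nonwhite.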

PROMOTE RACIAL UNDERSTANDING SCALE:

1=not important 2=somewhat important 3=very important 4=essential

Ypred = 1.9 + .06*sex86 +.7*racen +.03*liberal -.2*sexrace + .2*sexlib

For White Male Conservatives: Ypred=1.9 + .06*1 = 1.96

For White Male Liberals: Ypred=1.9 + .06*1 + .03*1 + .2*1 = 2.19

For Nonwhite Male Conservatives: Ypred=1.9 + .06*1 +.7*1 -.2*1 = 2.46

For Nonwhite Male Liberals: Ypred=1.9 + .06*1 + .7*1 + .03*1 - .2*1 + .2*1 = 2.69

For White Female Conservatives: Ypred=1.9 + .06*2 = 2.02

For White Female Liberals: Ypred= 1.9 +.06*2 + .03*1 +.2*2 = 2.45

For Nonwhite Female Conservatives: Ypred = 1.9+.06*2 + .7*1 -.2*2 = 2.32

For Nonwhite Female Liberals: Ypred = 1.9 + .06*2 + .7*1 + .03*1 -.2*2 + .2*2 = 2.75
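Hand arithmetic with this many terms is easy to get wrong, so it is worth recomputing the eight predicted values before you plot them. The Python sketch below uses the rounded equation from this section and assumes the codings implied by the calculations (sex86: 1 = male, 2 = female; racen: 0 = white, 1 = nonwhite; liberal: 0 = conservative, 1 = liberal). The function and dictionary names are mine.

```python
from itertools import product

def ypred(sex86, racen, liberal):
    """Predicted score from the rounded equation in this section."""
    sexrace = sex86 * racen      # same products as the SPSS compute statements
    sexlib = sex86 * liberal
    return (1.9 + .06 * sex86 + .7 * racen + .03 * liberal
            - .2 * sexrace + .2 * sexlib)

# Codings assumed from the hand calculations in this section.
sex_names = {1: "Male", 2: "Female"}
race_names = {0: "White", 1: "Nonwhite"}
view_names = {0: "Conservatives", 1: "Liberals"}

predicted = {}
for sex86, racen, liberal in product((1, 2), (0, 1), (0, 1)):
    label = f"{race_names[racen]} {sex_names[sex86]} {view_names[liberal]}"
    predicted[label] = round(ypred(sex86, racen, liberal), 2)
    print(f"{label}: {predicted[label]:.2f}")
```

The printed rows can be typed into a spreadsheet as the cell values for the interaction plots.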

Interaction Plots:

[pic]

[pic]

HOMEWORK:

For this lab, create a multiple regression model with a continuous outcome and 4-5 independent variables. Be sure to include some dummy variables and at least one interaction term in the model that you can graph.

For the write-up, do NOT include the intro, lit review and sample unless it is completely new to me. Concentrate on your methodology and results section. Be sure to interpret all of the coefficients and pay particular attention to the interaction coefficients. Your interaction coefficient may not be significant; that is OK. Make one graph that looks at the interaction between groups.

SUPPLEMENTAL HANDOUT (FYI)

CREATING DATA FILES

Consider a student background survey that was given to a statistics class at the beginning of the term. After students filled out the survey, how were the responses turned into usable data? A data set had to be created that SPSS can read and manipulate. How is this done? Although there are many different ways to make a data set for SPSS, the following types of files are the most popular.

▪ ASCII or text format data file (free and fixed)

▪ SPSS system format data file

On the background survey, students were asked to rate their familiarity with several types of computer environments including Macintosh, MS Dos, MS Windows, and UNIX. We know that Level of Familiarity with UNIX, for example, is a variable in that its value can vary from student to student. The computer environment familiarity variables were measured using a scale where the choices were no experience, some experience, and expert. Since there is an inherent order to this scale, these variables are ordinal.

Suppose we want to create a data file for the familiarity with computer environments questions. First, we need to assign numbers to each of the response categories. In this case, we assign 1 to the category, no experience; 2 to the category, some experience; and 3 to the category, expert. Second, we decide on the format of the data file we want. For this data, we use the ASCII format or text format. In PICO, we create a file that looks like one of the following.

3221                3 2 2 1
2331                2 3 3 1
2322                2 3 2 2
2221       OR       2 2 2 1
32 1                3 2 -1 2
3221                3 2 2 1
3223                3 2 2 3

On the left, the file is in fixed format. This means that variables are defined by fixed columns. In our case, values for the first variable are in column 1, values for the second variable are in column 2, and so on. In this format, a blank or space represents a missing piece of data. Note that in some cases, values for a variable might need more than one column. For example, SAT math score would need 3 columns.

On the right, the file is in free format. This means that spaces (could use tabs or commas instead) separate the values for successive variables in a given row. In this case, missing values must be assigned a value. In the example above, -1 is assigned for missing values.
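To make the difference between the two formats concrete, here is a minimal Python sketch of how a program might read one row of each. It assumes four single-column variables in the fixed file and -1 as the missing code in the free file, as in the example above; the function names are made up, and this is not how SPSS itself reads the files.

```python
def parse_fixed(line, n_vars=4):
    """Fixed format: variable k lives in column k; a blank column is missing."""
    padded = line.ljust(n_vars)          # restore trailing blanks if they were trimmed
    return [int(ch) if ch != " " else None for ch in padded[:n_vars]]

def parse_free(line, missing=-1):
    """Free format: values are separated by spaces, tabs, or commas."""
    values = [int(tok) for tok in line.replace(",", " ").split()]
    return [None if v == missing else v for v in values]

# In fixed format the blank in column 3 is the missing value.
print(parse_fixed("32 1"))
# In free format the missing value must be typed as an explicit code.
print(parse_free("3 2 -1 2"))
```

Notice that a blank cannot stand for a missing value in free format: with the blank dropped, the row would simply look like it has three values instead of four.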

Important note 1: For the most part, we will be working with existing data files this term. However, if you plan to do your own data collection in the future, like for a dissertation, you will be creating data files.

Important note 2: ASCII and text format data files end in .dat and SPSS data files end in .sav.

THE FIRST COMMAND FILE FOR YOUR DATA

We have just made a data file. Now what? It is time to make a command file consisting of a series of SPSS commands and subcommands that will give definition to the numbers we typed into the data file.

Data definition commands

▪ DATA LIST: This “defines” a raw data file by assigning names and formats to each variable in a file.

DATA LIST FILE='/afs/umich.edu/class/ed793/lab2000/d_fixed.dat' FIXED

/MAC 1 MS_DOS 2 MS_WIN 3 UNIX 4.

▪ VARIABLE LABELS: This assigns descriptive labels to variables in the working data file.

VARIABLE LABELS

MAC 'Macintosh'
MS_DOS 'Microsoft Dos'
MS_WIN 'Microsoft Windows'
UNIX 'Unix system'.

▪ VALUE LABELS: This assigns descriptive labels to the specified values of given variables.

VALUE LABELS MAC
1 'No experience'
2 'Some experience'
3 'Expert'
/MS_DOS
1 'No experience'
2 'Some experience'
3 'Expert'
/MS_WIN
1 'No experience'
2 'Some experience'
3 'Expert'
/UNIX
1 'No experience'
2 'Some experience'
3 'Expert'.

OR

VALUE LABELS MAC MS_DOS MS_WIN UNIX
1 'No experience'
2 'Some experience'
3 'Expert'.

▪ SAVE: This saves the file into an SPSS data file.

SAVE FILE='survey1.sav'.

EXAMPLE COMMAND FILE

SET HIGHRES=OFF.

SET WIDTH=80.

DATA LIST FILE='/afs/umich.edu/class/ed793/lab2000/lab793.dat' FREE

/ ID SEX GRE_MATH PRE_STAT MAJOR PRETEST FINAL EXP_LAB.

MISSING VALUE ALL(-1).

VARIABLE LABELS

ID 'Identification number'

SEX 'Student gender'

GRE_MATH 'GRE math score'

PRE_STAT 'Statistic courses previously taken'

MAJOR 'Student major'

PRETEST 'Pretest score'

FINAL 'Final exam score'

EXP_LAB 'Experimental lab'.

VALUE LABELS

SEX

0 'Male'

1 'Female'

/PRE_STAT

0 'No'

1 'Yes'

/MAJOR

1 'Humanities'

2 'Natural sciences'

3 'Social sciences'

4 'Other'

5 'Not known at this time'

/EXP_LAB

1 'Yes'

2 'No'.

SAVE FILE='dataset1.sav'.
