CLRES 2020, Lab 3



Tuesday 2pm-5pm July 26 2004

GSCC 126

Instructors:

Joyce Chang, PhD

Maria Mor, PhD

Doris Rubio, PhD

Teaching Assistants:

Fiona Callaghan MS

Bill Clark

David Corcoran

Vinay Mehta

Goals for Lab 3

1. More One-sample t-tests

We are using the dataset brain.dta

Whenever you see a check-mark ✓, you are required to perform some action. Whenever some words are set in a different typeface, they are commands that you should type in the command window of STATA. Whenever you see a >, it refers to going through a series of drop-down menus, as in “All Programs>Mathematics>STATA”. There are generally two ways to do most things in STATA: typing commands in the command window, or using drop-down menus, as in SPSS. Whenever possible, we will give you both ways of doing things in STATA, but you only need to use the way you feel most comfortable with. On the back of this handout is some space for you to answer questions about the lab material.

The questions that you have to answer to get credit for this lab are enclosed in a box like this.

You will answer these questions as you go through the lab and hand them in at the end for credit, so remember to write your name on them! If you experience trouble at any time, just raise your hand to let a TA or an instructor know that you need help. Let’s get started!

Getting Started

First we will log on to the computer. To do this you will need your University of Pittsburgh user id and your password.

✓ You should see a space on the screen to enter your user id. Type it in and press return.

✓ Now enter your password and press return. You should now be logged on to the computer.

We will open a folder in which to save our work, and then we will open STATA and enter our data sets into STATA.

✓ Right-click somewhere on the desktop and select “New Directory”. Name your folder “Lab3”. We will save all our work in this folder.

✓ Go to the web page:

✓ Scroll down to find the data sets and right-click on “brain.dta” and select “Save Link As…”.

✓ We want to save the file in “/scratch/username/Desktop/Lab3”. The “username” is your University of Pittsburgh email id (the part of your University of Pittsburgh email address that comes before the “@” e.g. “fmc2” is the id from the email address fmc2@pitt.edu), so on my computer I would save it in “/scratch/fmc2/Desktop/Lab3”. To do this, double click on “Desktop” and then “Lab3” in the main window (you should only have to do this once; the computer will remember where you are saving your files later on). Click “Save”.

✓ Your data set should now be in your “Lab3” folder on the Desktop. Open up your “Lab3” folder to check that it is there, by double clicking on the “Lab3” icon on your desktop. If things do not look right, contact a TA.

Now we will open STATA.

✓ To open STATA, click on the icon in the bottom left of your screen (this is the “Start Applications” menu) and go up to “Mathematics” and then move the mouse right onto “STATA” to highlight it. Click on STATA and it should open.

✓ We wish to tell STATA to save anything we do from now on in our “Lab3” folder. To do this, in the command window type (using plain straight quotes): cd "/scratch/username/Desktop/Lab3"

✓ Now open the log file. Type log using log3.log or you could go to File>Log>Begin… . You will have to give the log file a name, so type in “log3”. Next we have to make sure that STATA saves it as a “.log” file and not a “.smcl” file; go to the drop down menu next to “Save as type: Stata SMCL Document (*.smcl)” and select “Stata log (*.log)”. Then save in your Lab3 folder (you may have to double click on Desktop to find the Lab3 folder).

✓ Type use brain in the command window of STATA, and press return. You can also enter your data using a drop down window. Go to “File>Open…” and select the brain.dta data set and click “Open”. Your data set should now be in STATA.

✓ You should see some words in the “Variables” window -- fsiq viq piq weight height mri_count gender. Click on the Data Editor button (or type edit in the command window). You should see 7 columns of numbers and some labels at the top of those columns. Click on the red button with the white cross at the top right of the screen to get rid of the Data Editor window. If your data do not look right, ask a TA for help.

About the Data

Datafile Name: Brain size

Datafile Subjects: Medical

Story Names: Brain Size and Intelligence

Reference: Willerman, L., Schultz, R., Rutledge, J. N., and Bigler, E. (1991), "In Vivo Brain Size and Intelligence," Intelligence, 15, 223-228.

Authorization: Contact authors

Description: Willerman et al. (1991) collected a sample of 40 right-handed Anglo introductory psychology students at a large southwestern university. Subjects took four subtests (Vocabulary, Similarities, Block Design, and Picture Completion) of the Wechsler (1981) Adult Intelligence Scale-Revised. The researchers used Magnetic Resonance Imaging (MRI) to determine the brain size of the subjects. Information about gender and body size (height and weight) are also included. The researchers withheld the weights of two subjects and the height of one subject for reasons of confidentiality.

Number of cases: 40

Variable Names:

1. Gender: Male (=1) or Female (=2)

2. FSIQ: Full Scale IQ scores based on the four Wechsler (1981) subtests

3. VIQ: Verbal IQ scores based on the four Wechsler (1981) subtests

4. PIQ: Performance IQ scores based on the four Wechsler (1981) subtests

5. Weight: body weight in pounds

6. Height: height in inches

7. MRI_Count: total pixel Count from the 18 MRI scans

For this lab we will only use the variables “fsiq”, “gender” and “mri_count”. Our basic research questions for this lab are:

a) Are the brain sizes (mri_count) of these students significantly greater than the rest of the population? Are the brain sizes (mri_count) of these students significantly less than the rest of the population? Are the brain sizes (mri_count) of these students significantly different to the rest of the population?

b) Are the IQs (fsiq) of these students significantly greater than the rest of the population? Are the IQs (fsiq) of these students significantly less than the rest of the population? Are the IQs (fsiq) of these students significantly different to the rest of the population?

We will use a one-sample t-test to answer these questions. I will outline how to do these tests for the brain size (mri_count) and the questions for this lab will ask you to test the IQ.

Note that in “real life” you would only choose ONE of “greater”, “less” or “different to” depending on your research question, but here you will calculate all three tests as an exercise.

Summary Statistics

Before we get into analyzing the data, we should always do some summary statistics and find out some basic facts about the data. The following commands will help you answer the questions below.

✓ Type summarize and press enter

Question 1: What are the mean and standard deviation for “fsiq” and “mri_count”?

Question 2: Do you have any missing values? If so, which observations and which variables?
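One possible way to track down missing values is sketched below. It assumes brain.dta is already in memory and that the withheld weights and height are stored as STATA missing values (.), which is an assumption about how the file was coded.

```stata
* Sketch: locate missing values (assumes brain.dta is loaded).
* Count the observations with a missing weight or height.
count if missing(weight)
count if missing(height)
* List the affected rows; the leftmost column is the observation number.
list gender weight height if missing(weight) | missing(height)
```

Comparing the Obs column from summarize with the 40 cases in the file gives the same information variable by variable.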

✓ Type graph box fsiq

✓ Type graph box fsiq, by(gender)

✓ Type graph box mri_count

✓ Type graph box mri_count, by(gender)

Question 3: Do you have any outliers, or other strange values for fsiq or mri_count?

✓ Type histogram fsiq, normal

✓ Type histogram fsiq, normal by(gender)

✓ Type histogram mri_count, normal

✓ Type histogram mri_count, normal by(gender)

Question 4: Which variables are continuous and which are discrete (out of fsiq, mri_count and gender)?

Question 5: For the continuous variables, are the data normally distributed? If not, are they skewed left or right, or are they non-normal for some other reason?
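As a rough numerical companion to the histograms, the sketch below asks for detailed summaries, which include skewness and kurtosis, and draws normal quantile plots. It assumes brain.dta is in memory. These commands are suggestions only and are not required for credit.

```stata
* Sketch: extra checks on the shape of the distributions.
* The detail option adds percentiles, skewness and kurtosis.
summarize fsiq mri_count, detail
* Points lying close to the reference line suggest approximate normality.
qnorm fsiq
qnorm mri_count
```

A skewness near 0 is consistent with a symmetric distribution; positive values point to a right skew and negative values to a left skew.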

Whatever your answers to the above questions, we will proceed to do t-tests as if the data were normally distributed and we had found no problems. This is because we are in a class and we need to practice t-tests as an exercise. In “real life” we would not proceed (or we would consult a statistician about what to do!) if, for example, the data were not normally distributed or we had too many missing or extreme values.

More One-sample t-tests

This is the decision rule that you must remember:

If our p-value is less than α, then we say that we “reject” the null hypothesis. If our p-value is larger than or equal to α, then we “fail to reject” the null hypothesis.
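The rule can be written out as a few lines of STATA. This is only a sketch: pval and alpha are placeholder scalar names invented for illustration, and the numbers are made up rather than taken from the data.

```stata
* Sketch of the decision rule with made-up numbers.
scalar pval  = 0.03
scalar alpha = 0.05
* scalar() makes sure STATA reads these names as scalars, not variables.
if scalar(pval) < scalar(alpha) {
    display "Reject Ho"
}
else {
    display "Fail to reject Ho"
}
```

With these placeholder values 0.03 < 0.05, so the first branch runs and STATA displays Reject Ho.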

In the following examples we will do one sample t-tests comparing the brain size to some benchmark figure. Suppose someone tells us that 890,000 pixels is considered a “normal sized” brain. We wish to test if our sample of students have unusually large or small brains compared to the general population. There are 3 different ways of formulating this question, which give the 3 different kinds of Ha.

Example 1

Is the average brain size of the students greater than 890,000 pixels? (Use an α = 0.10).

▪ From the question we know Ha: μ > 890,000. The null hypothesis could be Ho: μ = 890,000 or Ho: μ ≤ 890,000, depending on what we, as clinicians, think is more reasonable. The form of Ho does not alter our calculations, just the interpretation. In general, choose the Ho that seems the most conservative. Let’s say I don’t think that it is possible that the average could be less than 890,000, so I choose Ho: μ = 890,000.

▪ We do not know the population standard deviation σ, so if we were calculating the test by hand we would use the formula:

t = (Sample mean – μo )/(s/√n)

▪ We know: Sample mean = 908755, μo = 890000, s = 72282.05, n = 40. We “convert” our sample mean into a “t-score” (kind of like we did with the normal distribution): t = (908755-890000)/(72282.05/√40) = 18755 / 11428.8 = 1.641

▪ We need to find a p-value. Because of the way our Ha is worded, we need to find P(t > 1.641) where the t distribution has n-1 = 40-1 = 39 degrees of freedom. We can do this by looking it up in a table or typing

✓ display ttail(39, 1.641)

▪ You should find that p = 0.0544.

▪ Or we could do the whole thing in STATA:

✓ ttest mri_count == 890000

You should get the following output:

. ttest mri_count == 890000

One-sample t test

------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
mri_co~t |      40      908755     11428.8    72282.05    885638.1    931871.9
------------------------------------------------------------------------------
Degrees of freedom: 39

                       Ho: mean(mri_count) = 890000

 Ha: mean < 890000         Ha: mean != 890000          Ha: mean > 890000
   t =   1.6410                t =   1.6410              t =   1.6410
 P < t =   0.9456          P > |t| =   0.1088          P > t =   0.0544

To summarize:

One-tailed test

Ho: μ = 890,000 , Ha: μ > 890,000

t = 1.641, α = 0.10

P(t > 1.641) = 0.0544 = p

p < 0.10 so we reject Ho.

Conclusion: There is evidence that the mean brain size of the students is larger than that of the general population.
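The hand calculation in Example 1 can be checked from the summary numbers alone. The sketch below is one way to do it; the scalar names (xbar, mu0, sdev, nobs, se, tstat) are made up for illustration. The last line uses ttesti, the immediate form of ttest, which takes the sample size, mean, standard deviation and null value instead of a variable.

```stata
* Sketch: redo the Example 1 hand calculation from the summary numbers.
scalar xbar  = 908755
scalar mu0   = 890000
scalar sdev  = 72282.05
scalar nobs  = 40
* Standard error of the mean, s / sqrt(n).
scalar se    = scalar(sdev) / sqrt(scalar(nobs))
* t statistic and its upper-tail p-value on n-1 degrees of freedom.
scalar tstat = (scalar(xbar) - scalar(mu0)) / scalar(se)
display "t = " %6.4f scalar(tstat)
display "P(T > t) = " %6.4f ttail(scalar(nobs) - 1, scalar(tstat))
* Same test from the summary statistics, without the data set.
ttesti 40 908755 72282.05 890000
```

The displayed values should agree with t = 1.641 and p = 0.0544 from the example, up to rounding.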


Example 2

Is the average brain size of the students less than 890,000 pixels? (Use an α = 0.10).

▪ From the question we know Ha: μ < 890,000. The null hypothesis could be Ho: μ = 890,000 or Ho: μ ≥ 890,000, depending on what we, as clinicians, think is more reasonable. The form of Ho does not alter our calculations, just the interpretation. In general, choose the Ho that seems the most conservative. Let’s say I don’t think that it is possible that the average could be more than 890,000, so I choose Ho: μ = 890,000.

▪ We do not know the population standard deviation σ, so if we were calculating the test by hand we would use the formula:

t = (Sample mean – μo )/(s/√n)

▪ We know: Sample mean = 908755, μo = 890000, s = 72282.05, n = 40. We “convert” our sample mean into a “t-score” (kind of like we did with the normal distribution): t = (908755-890000)/(72282.05/√40) = 18755 / 11428.8 = 1.641

▪ We need to find a p-value. Because of the way our Ha is worded, we need to find P(t < 1.641) where the t distribution has n-1 = 40-1 = 39 degrees of freedom. We can do this by looking it up in a table or typing

✓ display 1-ttail(39, 1.641)

▪ You should find that p = 1-0.0544 = 0.9456.

▪ Or we could do the whole thing in STATA:

✓ ttest mri_count == 890000

You should get the following output:

. ttest mri_count == 890000

One-sample t test

------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
mri_co~t |      40      908755     11428.8    72282.05    885638.1    931871.9
------------------------------------------------------------------------------
Degrees of freedom: 39

                       Ho: mean(mri_count) = 890000

 Ha: mean < 890000         Ha: mean != 890000          Ha: mean > 890000
   t =   1.6410                t =   1.6410              t =   1.6410
 P < t =   0.9456          P > |t| =   0.1088          P > t =   0.0544

To summarize:

One-tailed test

Ho: μ = 890,000 , Ha: μ < 890,000

t = 1.641, α = 0.10

P(t < 1.641) = 0.9456 = p

p > 0.10 so we fail to reject Ho.

Conclusion: We keep our assumption that the mean brain size is the same as that of the general population.
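Because the t distribution is symmetric about 0, the lower-tail probability can be obtained in two equivalent ways. The sketch below checks this for the Example 2 numbers; it does not need the data set.

```stata
* Sketch: two equivalent routes to the lower-tail p-value.
* Route 1: one minus the upper tail.
display 1 - ttail(39, 1.641)
* Route 2: upper tail at -t, using the symmetry of the t distribution.
display ttail(39, -1.641)
```

Both lines should display the same number, matching the P < t entry in the ttest output up to rounding.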


Example 3

Is the average brain size of the students different to 890,000 pixels? (Use an α = 0.10).

▪ From the question we know Ha: μ ≠ 890,000 (another way of saying this is Ha: μ < 890,000 or μ > 890,000) . The null hypothesis is Ho: μ = 890,000 (there is no choice for this one).

▪ We do not know the population standard deviation σ, so if we were calculating the test by hand we would use the formula:

t = (Sample mean – μo )/(s/√n)

▪ We know: Sample mean = 908755, μo = 890000, s = 72282.05, n = 40. We “convert” our sample mean into a “t-score” (kind of like we did with the normal distribution): t = (908755-890000)/(72282.05/√40) = 18755 / 11428.8 = 1.641

▪ We need to find a p-value. Because of the way our Ha is worded, we need to find P(t < -1.641 or t > 1.641) where the t distribution has n-1 = 40-1 = 39 degrees of freedom. We can do this by looking it up in a table or typing

✓ display 2*ttail(39, 1.641)

▪ You should find that p = 2×0.0544 = 0.1088.

▪ Or we could do the whole thing in STATA:

✓ ttest mri_count == 890000

You should get the following output:

. ttest mri_count == 890000

One-sample t test

------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
mri_co~t |      40      908755     11428.8    72282.05    885638.1    931871.9
------------------------------------------------------------------------------
Degrees of freedom: 39

                       Ho: mean(mri_count) = 890000

 Ha: mean < 890000         Ha: mean != 890000          Ha: mean > 890000
   t =   1.6410                t =   1.6410              t =   1.6410
 P < t =   0.9456          P > |t| =   0.1088          P > t =   0.0544

To summarize:

Two-tailed test

Ho: μ = 890,000 , Ha: μ ≠ 890,000

t = 1.641, α = 0.10

P(t < -1.641 or t > 1.641) = 0.1088 = p

p > 0.10 so we fail to reject Ho.

Conclusion: We keep our assumption that the mean brain size is the same as for the general population.
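The three p-values in Examples 1 to 3 come from the same t statistic, so they are linked. The sketch below illustrates this using the results that ttesti leaves behind. It assumes your version of STATA stores them as r(p_l), r(p_u) and r(p); typing return list after the test shows what is actually available.

```stata
* Sketch: the three p-values from one test are linked.
quietly ttesti 40 908755 72282.05 890000
* Lower and upper one-sided p-values add up to 1.
display r(p_l) + r(p_u)
* The two-sided p-value is twice the smaller one-sided p-value.
display 2 * min(r(p_l), r(p_u))
display r(p)
```

This is why only one of the three tests would be reported in practice: the choice of Ha, made before looking at the data, decides which p-value applies.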


Now you will do a similar set of 3 tests for IQ (fsiq). Using one-sample t-tests with an α = 0.05, answer the following research questions with STATA:

Question 6: Is the mean IQ (using “fsiq”) significantly greater than 100? State your Ho, Ha, μo, t-value, p-value, α, whether you reject or fail to reject Ho, and your research conclusion. Choose an Ho that makes sense to you.

Question 7: Is the mean IQ significantly less than 100? State your Ho, Ha, μo, t-value, p-value, α, whether you reject or fail to reject Ho, and your research conclusion. Choose an Ho that makes sense to you.

Question 8: Is the mean IQ significantly different to 100? State your Ho, Ha, μo, t-value, p-value, α, whether you reject or fail to reject Ho, and your research conclusion.
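As a starting point for Questions 6 to 8, the same pattern used for mri_count carries over to fsiq. The sketch below assumes brain.dta is in memory; it only runs the test and lists the stored results, leaving the choice of Ho, the comparison with α = 0.05 and the conclusion to you.

```stata
* Sketch: one command produces all three p-values for fsiq.
ttest fsiq == 100
* Show the stored results (t, degrees of freedom, p-values).
return list
```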

The End.

Saving the Lab

At the end of the session, use the following procedure so that you can save any files you may want to review later on (e.g. your log file). These are the instructions if you are saving your files onto a floppy disk. If you have a zip disk, just do the same steps but with the "Zip" folder on the Desktop rather than the "Floppy" folder.

✓ Type log close and your log file is automatically saved and closed. You can also go to File>Log>Close.

✓ Insert floppy disk (or zip disk).

✓ Right click on the "Floppy" icon on the Desktop and select "Mount". We can now save files onto this disk. If you do not “Mount” the disk, then your files may not save properly.

✓ Close your "Lab3" folder if it is open. Click on the "Lab3" icon on the Desktop and drag the whole folder to the floppy disk icon on your Desktop. You should get a small menu giving you a choice to "Move" or "Copy" the documents. Click on "Copy". Your files should now be on your floppy disk.

✓ Double click on the floppy disk icon to check that there is now a "Lab3" folder on your floppy disk.

✓ Now close the floppy disk window, and right click on the floppy disk icon and select "Unmount". You must do this in order to take your disk out of these machines and still have your files saved.

✓ Now press the button on your computer to eject the floppy disk.

It is very important to save a backup on the university computer in case something happens to the disk.

✓ Click on the “Lab3” folder icon and drag the whole folder to the “AFS” folder on your desktop. You should get a small menu giving you a choice to "Move" or "Copy" the documents. Click on "Copy". Your files are now stored on the University of Pittsburgh computer system and can be accessed from any computer with an internet connection. See the instructions below on how to access these documents from your home computer.

✓ You have finished -- see you for the next lab!

Accessing the files from home from the University of Pittsburgh computer system

Here are some instructions FYI to help you access your backup copy in case there is some problem with your floppy disk or zip, when you get out of here. To access your backup copies from your home or office computer do the following steps:

✓ Open Netscape Navigator or Internet Explorer. Type and go to this destination. (eg. Using my username, I would type ).

✓ After a few seconds, Internet Explorer will ask you for your username and password. Enter these and press return.

✓ After the screen has loaded, you should see a list of files and one of them should be your “Lab3”. Just click and drag that file to wherever you want to put it on your home computer. Close Internet Explorer.

Answer Sheet – Lab3 CLRES 2020 Summer 04.

NAME and DATE:

Question 1:

| |fsiq |mri_count |

|mean | | |

|sd | | |

Question 2:

Question 3:

Question 4:

|Variable Name |Discrete or Continuous? |

| | |

| | |

| | |

| | |

| | |

| | |

| | |

Question 5:

|Variable |Normal? (Yes/No) |Skewed Left/Right/Other Reason? |

| | | |

| | | |

| | | |

| | | |

| | | |

| | | |

| | | |

Question 6:

|Ho | |

|Ha | |

|μo = | |

|t = | |

|p = | |

| | |

|α = | |

|Reject Ho? | |

|Conclusion | |

| | |

| | |

| | |

| | |

Question 7:

|Ho | |

|Ha | |

|μo = | |

|t = | |

|p = | |

| | |

|α = | |

|Reject Ho? | |

|Conclusion | |

| | |

| | |

| | |

| | |

Question 8:

|Ho | |

|Ha | |

|μo = | |

|t = | |

|p = | |

| | |

|α = | |

|Reject Ho? | |

|Conclusion | |

| | |

| | |

| | |

| | |
