Lab Objectives - Stanford University
Lab One: SAS Orientation
Also: 2x2 Tables, PROC FREQ, Odds Ratios, Risk Ratios
Lab Objectives
After today’s lab you should be able to:
1. Load SAS program.
2. Move between the EDITOR, LOG, and OUTPUT windows, and understand their different functions.
3. Understand SAS libraries. Understand SAS temporary library (the “WORK” library).
4. Use the Explorer Browser in SAS.
5. Understand how to write comments in SAS.
6. Understand the basic structure of a SAS program and SAS code.
7. Understand the difference between SAS datasteps and SAS procedures.
8. Use SAS as a calculator.
9. Know some SAS logical and mathematical operators.
10. Assign a library name (libname statement and point-and-click).
11. Input grouped data directly into SAS.
12. Use PROC FREQ to output contingency tables.
13. Use PROC FREQ to calculate chi-square statistics and odds ratios and risk ratios.
14. Understand the concept of a SAS macro (just a function).
15. If time, create a simple SAS macro to calculate the confidence intervals for an odds ratio.
LAB EXERCISE STEPS:
Follow along with the computer in front
1. Open SAS: From the desktop → double-click “Applications” → double-click the SAS icon
2. There are 3 windows in SAS: the editor, output, and log windows.
a. You enter SAS code into the editor (the enhanced editor screen alerts you to potential errors through its coloring scheme). You run SAS programs that appear in the editor by clicking on the running man icon in your toolbar.
b. After a program runs, the output appears in the output screen.
c. The execution of a program is logged in the log screen, as are errors.*
You can open the editor, output, or log windows by selecting them in the “VIEW” menu at the top of your screen.
3. SAS programs are composed of data steps and procedures (abbreviated as PROCs). Data-steps deal with importing, entering, and manipulating data. Procedures deal with analyzing data (making numerical or graphical summaries and running specific statistical tests). We will first work with SAS datasteps:
Type the following data step in the editor window:
data example1;
x=18*10**-6;
run;
Explanation of code:
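data example1;     /* create a dataset called example1 in the default temporary WORK library */
x=18*10**-6;       /* assign the value of the expression on the right to the variable x */
run;               /* end the data step */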
4. Select (highlight) the code (using your mouse), and click on the running man icon.
5. Use the Explorer Browser on the left hand side of your screen to locate and view the dataset “example1” in the work library (file cabinet icons represent data libraries).
a. Double click on the libraries icon (looks like a filing cabinet).
b. Double click on the work library icon (looks like one drawer in a filing cabinet).
c. Double click on the dataset “example1” to open it in viewtable mode. The dataset should contain a single value.
d. Click on the “up one level” icon (folder with an up-arrow on the toolbar) to return to the library icons.
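The dataset can also be inspected with code instead of the Explorer Browser. The following is a minimal sketch, assuming the data step from step 3 has already been run so that work.example1 exists; PROC PRINT lists the rows in the output window and PROC CONTENTS describes the variables.

```sas
/* List the rows of work.example1 in the output window */
proc print data=work.example1;
run;

/* Describe the dataset: variable names, types, and number of observations */
proc contents data=work.example1;
run;
```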
6. Type the following code in the editor window, and run the program (select the code and click on running man).
data _null_;
x=18*10**-6;
put x;
run;
7. Check what has been entered into the log. It should look like:
5 data _null_;
6 x=18*10**-6;
7 put x;
8 run;
0.000018
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
8. Using your Explorer Browser, observe that no new datasets have been added to the work library.
9. Type the following code in the editor window and run the program.
data _null_; *use SAS as calculator;
x=LOG(EXP(-.5));
put x;
run;
SAS LOG should contain:
9 data _null_; *use SAS as calculator;
10 x=LOG(EXP(-.5));
11 put x;
12 run;
-0.5
10. Use SAS to calculate the probability of getting exactly X=25 from a binomial distribution with N=100 and p=0.5 (for example, what’s the probability of getting EXACTLY 25 heads in 100 coin tosses?):
data _null_;
p= pdf('binomial', 25,.5, 100);
put p;
run;
11. Use SAS to calculate the probability of getting an X of 25 or more from a binomial distribution with N=100 and p=.5 (e.g., 25 or more heads in 100 coin tosses):
data _null_;
pval= 1-cdf('binomial', 24, .5, 100);
put pval;
run;
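As a cross-check on steps 10 and 11, the upper-tail probability can also be built by adding up the exact probabilities for X=25 through X=100. This is only an illustrative sketch (the variable names total and k are not part of the lab); the two values written to the log should agree.

```sas
data _null_;
  /* Add the exact probabilities P(X=k) for k = 25, ..., 100 */
  total = 0;
  do k = 25 to 100;
    total = total + pdf('binomial', k, .5, 100);
  end;
  /* Same tail probability from the cumulative distribution function */
  pval = 1 - cdf('binomial', 24, .5, 100);
  put total= pval=;
run;
```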
12. Libraries are references to places on your hard drive where datasets are stored. Datasets that you create in permanent libraries are saved in the folder to which the library refers. Datasets put in the WORK library disappear when you quit SAS (they are not saved).
13. Library names are temporary: a name lasts only for the current SAS session, even though the datasets stored in the folder remain. You can assign a library name through the libname statement (step 14) or through point-and-click features, as follows:
a. Click on “new library” icon (slamming file cabinet on the toolbar).
b. Browse to find the path (“extension”) to the Desktop folder. COPY THIS PATH USING CONTROL-C.
c. Name the library hrp261.
d. Hit OK to exit and save.
14. Whenever you open SAS anew you will need to reassign the library name. Saving the code that does this will save you a step. Type the following code in the editor (and run it) to assign the Desktop folder the library name “hrp261”. USE CONTROL-V to paste the path (extension), which may differ on different computers. Note that the path must be enclosed in straight quotation marks.
libname hrp261 'C:\Documents and Settings\mitl-pc.LANE-LIB\Desktop';
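To confirm that the library name was assigned, one option is the LIST form of the libname statement, which writes the library's attributes (including its physical path) to the log. A small sketch, assuming the hrp261 library has already been assigned as above:

```sas
/* Write the attributes of the hrp261 library to the log */
libname hrp261 list;
```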
15. Type the following code in the editor to save a modified copy of the dataset example1 in the hrp261 library as “hrp261.example1” (the copy stores x2, the square of x, in place of x):
data hrp261.example1;
set example1;
x2=x**2;
drop x;
run;
16. Find the dataset in the hrp261 library using the Explorer Browser.
17. Browse to find the example1 dataset in the Desktop folder on your hard drive. This dataset will remain intact after you exit SAS.
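The data step in step 15 could equally be written with a keep statement instead of a drop statement. The sketch below assumes work.example1 and the hrp261 library both exist; it rewrites hrp261.example1 with the same contents and then prints it to confirm that only x2 was stored.

```sas
/* Same result as step 15, but naming the variable to keep rather than the one to drop */
data hrp261.example1;
  set work.example1;
  x2 = x**2;
  keep x2;
run;

/* Read the permanent dataset back to check its contents */
proc print data=hrp261.example1;
run;
```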
18. Next, we will input data from a 2x2 table directly into a SAS dataset. In the SAS editor screen, input the following data set. These are grouped data from the atherosclerosis and depression example (from the Rotterdam study) in lecture 1:
data Rotterdam;
input IsDepressed HasBlockage Freq;
datalines;
1 1 28
1 0 53
0 1 511
0 0 1328
;
run;
/*Use PROC PRINT to view the data*/
proc print data=Rotterdam;
run;
19. Verify that the data have been printed to your output screen as below:
          Is          Has
Obs    Depressed    Blockage    Freq
 1         1            1         28
 2         1            0         53
 3         0            1        511
 4         0            0       1328
20. Generate the 2x2 contingency table using PROC FREQ.
proc freq data=Rotterdam order=data;
tables IsDepressed*HasBlockage /nopercent norow nocol;
weight freq;
run;
RESULTS:
Table of IsDepressed by HasBlockage

IsDepressed     HasBlockage

Frequency|       1|       0|  Total
---------+--------+--------+
       1 |     28 |     53 |     81
---------+--------+--------+
       0 |    511 |   1328 |   1839
---------+--------+--------+
Total         539     1381     1920
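The counts in this table are enough to work out the odds ratio and risk ratio by hand, using SAS as a calculator. The sketch below is illustrative only; the variable names a, b, c, d, oddsratio, and riskratio are assumptions, with a and b taken from the depressed row and c and d from the not-depressed row. The values written to the log should be roughly 1.37 for the odds ratio and 1.24 for the risk ratio.

```sas
data _null_;
  /* Cell counts from the table above (rows: depressed, not depressed;
     columns: blockage, no blockage) */
  a = 28;  b = 53;
  c = 511; d = 1328;
  oddsratio = (a*d) / (b*c);       /* cross-product ratio */
  risk_dep  = a / (a + b);         /* risk of blockage if depressed: 28/81 */
  risk_not  = c / (c + d);         /* risk of blockage if not depressed: 511/1839 */
  riskratio = risk_dep / risk_not;
  put oddsratio= riskratio=;
run;
```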
21. Request statistics for contingency tables using PROC FREQ.
proc freq data=Rotterdam order=data;
tables IsDepressed*HasBlockage / chisq measures expected;
weight freq;
run;
RESULTS:
Table of IsDepressed by HasBlockage

IsDepressed     HasBlockage

Frequency|
Expected |
Percent  |
Row Pct  |
Col Pct  |       0|       1|  Total
---------+--------+--------+
       0 |   1328 |    511 |   1839
         | 1322.7 | 516.26 |
         |  69.17 |  26.61 |  95.78
         |  72.21 |  27.79 |
         |  96.16 |  94.81 |
---------+--------+--------+
       1 |     53 |     28 |     81
         | 58.261 | 22.739 |
         |   2.76 |   1.46 |   4.22
         |  65.43 |  34.57 |
         |   3.84 |   5.19 |
---------+--------+--------+
Total        1381      539     1920
            71.93    28.07   100.00
Statistic                     DF       Value      Prob
------------------------------------------------------
Chi-Square                     1      1.7668    0.1838
Likelihood Ratio Chi-Square    1      1.6976    0.1926
Continuity Adj. Chi-Square     1      1.4469    0.2290
Mantel-Haenszel Chi-Square     1      1.7659    0.1839
Phi Coefficient                       0.0303
Contingency Coefficient               0.0303
Cramer's V                            0.0303

Fisher's Exact Test
----------------------------------
Cell (1,1) Frequency (F)      1328
Left-sided Pr = F 0.1157
Table Probability (P) 0.0407
Two-sided Pr <= P
[...]
|>= or ge                                                 |+ addition                                    |
|                                                         |- subtraction                                 |
|INT(v)-returns the integer value (truncates)             |SIGN(v)-returns the sign of the argument or 0 |
|ROUND(v)-rounds a value to the nearest round-off unit    |SQRT(v)-calculates the square root            |
|TRUNC(v)-truncates a numeric value to a specified length |EXP(v)-raises e (2.71828) to a specified power|
|ABS(v)-returns the absolute value                        |LOG(v)-calculates the natural logarithm (base e)|
|MOD(v)-calculates the remainder                          |LOG10(v)-calculates the common logarithm      |
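Several of these functions and operators can be tried out in a single data _null_ step. This is a minimal sketch; the input values and the variable names a through h are arbitrary choices for illustration. Note that MOD is called with two arguments (the value and the divisor).

```sas
data _null_;
  v = -7.6;
  a = int(v);        /* integer part (truncates): -7 */
  b = round(v);      /* nearest integer: -8 */
  c = abs(v);        /* absolute value: 7.6 */
  d = sign(v);       /* sign of the argument: -1 */
  e = mod(17, 5);    /* remainder of 17 divided by 5: 2 */
  f = sqrt(16);      /* square root: 4 */
  g = log10(1000);   /* common logarithm: 3 */
  h = (5 ge 3);      /* a comparison evaluates to 1 (true) or 0 (false): 1 */
  put a= b= c= d= e= f= g= h=;
run;
```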
APPENDIX B: Some useful probability functions in SAS
Normal Distribution
➢ Cumulative distribution function of standard normal:
P(X≤Z)=probnorm(Z)
➢ Z value that corresponds to a given area of a standard normal (probit function):
Z = Φ⁻¹(area) = probit(area)
➢ To generate random Z → normal(seed)
Exponential
➢ Density function of exponential (λ):
P(X=k) = pdf('exponential', k, λ)
➢ Cumulative distribution function of exponential (λ):
P(X≤k)= cdf('exponential', k, λ)
➢ To generate random X (where λ=1) → ranexp(seed)
Uniform
P(X=k) = pdf('uniform', k)
P(X≤k) = cdf('uniform', k)
To generate random X → ranuni(seed)
Binomial
P(X=k) = pdf('binomial', k, p, N)
P(X≤k) = cdf('binomial', k, p, N)
To generate random X → ranbin(seed, N, p)
Poisson
P(X=k) = pdf('poisson', k, λ)
P(X≤k) = cdf('poisson', k, λ)
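A few of these functions can be tried together in one data _null_ step. The sketch below uses arbitrary example inputs (the variable names and the seed 12345 are illustrative assumptions); probit(0.975) should come out close to 1.96 and probnorm(1.96) close to 0.975.

```sas
data _null_;
  z  = probit(0.975);                 /* Z with 97.5% of the standard normal area below it */
  p  = probnorm(1.96);                /* P(Z <= 1.96) for a standard normal */
  pb = pdf('binomial', 3, .5, 10);    /* P(X=3) for a binomial with N=10, p=.5 */
  pp = cdf('poisson', 2, 4);          /* P(X<=2) for a Poisson with mean 4 */
  u  = ranuni(12345);                 /* one random uniform(0,1) value */
  put z= p= pb= pp= u=;
run;
```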
-----------------------
This is a SAS data step. The first line tells SAS to create a dataset called “example1.” This dataset will be placed into the “work” library, which is the default temporary library.
Same as above but the “_null_” tells SAS to not bother to make a dataset (e.g., if you just want to use SAS as a calculator).
Note that each command in a SAS program must be punctuated with a semi-colon. Misplaced or missing semi-colons cause many errors and much frustration in SAS, so pay attention to their placement!
Assigns a value to the variable x.
Variable name goes to the left of the equals sign; value or expression goes to the right of the equals sign.
Note that each data step or proc in SAS ends with a run statement. The program is not actually executed, however, until you click on the running man icon.
Tells SAS to print the value of x in the SAS log.
Adds a new variable x-squared to the dataset.
Drops the variable x; “keep x2;” would have the same result.
Starts with the dataset work.example1
Makes a new dataset called example1 in the hrp261 library.
Code for moving a dataset, part of a dataset, or a dataset with modifications into a new library.
Name the library
Note use of informative variable names.
|                |Depressed |Not depressed |
|Atherosclerosis |28        |511           |
|None            |53        |1328          |
Use SAS as a calculator. See Appendix for more mathematical and logical operators.
Don’t forget the semi-colon!
Location of the folder where the datasets are physically located.
Comments (ignored by SAS but critical for programmers and users) may be bracketed by * and ; or by /* and */ and should appear green in the editor.
Options (optional features) follow a forward slash in a SAS procedure. These options tell SAS to present the chi-square statistic as well as measures of association (odds ratios and risk ratios).
Asks SAS to present the expected table for the chi-square test.
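If only the odds ratio and risk ratios are of interest, PROC FREQ also accepts the relrisk option on the tables statement, which requests the relative risk estimates for a 2x2 table without the longer list of association measures. This is a sketch of an alternative call, not the one used in the lab, and it assumes the Rotterdam dataset from step 18 already exists:

```sas
proc freq data=Rotterdam order=data;
  /* chisq: chi-square tests; relrisk: odds ratio and column 1 / column 2 risk ratios */
  tables IsDepressed*HasBlockage / chisq relrisk;
  weight freq;
run;
```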
See Appendix for more probability functions.
This is your first example of a SAS procedure.
The print procedure simply prints data in the output screen.
Column1 risk ratio=[pic]
Probability of having atherosclerosis if you are not depressed.
Column2 risk ratio=[pic]
Probability of also having atherosclerosis if you are depressed:
Chi-square is non-significant.
Probability of NOT having atherosclerosis if you are NOT depressed.
Expected counts are highlighted here.
Tells SAS how to ORDER the rows and columns. The default is to use numerical or alphabetical order, which would make cell a the “undepressed, unblocked” cell. Instead, order=data tells SAS to order rows and columns according to the order that the values appear in the dataset (1s before 0s).
Probability of NOT having atherosclerosis if you are depressed:
Fisher’s exact is automatically calculated when you request chi-square statistics for a 2x2 table.
PREVIEW: We will later learn the use of PROC FORMAT to change 0’s and 1’s to meaningful labels.
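As a rough preview of what that might look like, the sketch below defines a format and applies it inside PROC FREQ. The format name yesno and its labels are hypothetical choices for illustration, and the Rotterdam dataset from step 18 is assumed to exist.

```sas
/* Define a format that maps 1 and 0 to readable labels (name "yesno" is hypothetical) */
proc format;
  value yesno 1 = 'Yes'
              0 = 'No';
run;

/* Apply the format to both table variables when printing the 2x2 table */
proc freq data=Rotterdam order=data;
  tables IsDepressed*HasBlockage / nopercent norow nocol;
  weight freq;
  format IsDepressed HasBlockage yesno.;
run;
```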
If you forget the weight statement, SAS will see only 1 observation in each cell of your 2x2 table.
The variable “freq” stores the counts in each 2x2 cell.
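To see the effect of leaving out the weight statement, the same table can be requested without it. In this sketch (assuming the Rotterdam dataset from step 18), each of the four data lines counts as a single observation, so every cell shows a frequency of 1 and the table total is 4.

```sas
/* No WEIGHT statement: each data line is counted once, ignoring the freq variable */
proc freq data=Rotterdam order=data;
  tables IsDepressed*HasBlockage / nopercent norow nocol;
run;
```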
The probit function returns the Z score associated with a given area under a normal curve.
When creating a macro, it’s important to include detailed comments that instruct a new user on how to use your macro.
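The lab's own macro is not reproduced in this text, so the following is only a sketch of what a commented macro for an odds ratio confidence interval might look like. It assumes the common log-based (Woolf) interval, exp(ln(OR) ± z·SE) with SE = sqrt(1/a + 1/b + 1/c + 1/d); the macro name oddsratio_ci and its parameters are hypothetical.

```sas
/* ---------------------------------------------------------------------
   Macro: oddsratio_ci (hypothetical name, illustration only)
   Purpose: writes the odds ratio and a log-based (Woolf) confidence
            interval for a 2x2 table to the SAS log.
   Parameters:
     a, b  = counts in the first row  (exposed:   outcome, no outcome)
     c, d  = counts in the second row (unexposed: outcome, no outcome)
     alpha = 1 minus the confidence level (default 0.05 for a 95% CI)
   Note: all four counts must be greater than zero.
   Example call: %oddsratio_ci(28, 53, 511, 1328);
   --------------------------------------------------------------------- */
%macro oddsratio_ci(a, b, c, d, alpha=0.05);
  data _null_;
    oddsratio = (&a * &d) / (&b * &c);           /* cross-product ratio */
    se = sqrt(1/&a + 1/&b + 1/&c + 1/&d);        /* standard error of ln(OR) */
    z  = probit(1 - &alpha/2);                   /* normal critical value */
    lower = exp(log(oddsratio) - z*se);          /* lower confidence limit */
    upper = exp(log(oddsratio) + z*se);          /* upper confidence limit */
    put oddsratio= lower= upper=;
  run;
%mend oddsratio_ci;

/* Counts from the Rotterdam example */
%oddsratio_ci(28, 53, 511, 1328);
```

Defining the macro only stores it; nothing is calculated until the macro is called, as in the last line.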
Options (optional features) follow a forward slash in a SAS procedure. These options tell SAS to omit the cell, row, and column percents in the 2x2 table.
|          |Has outcome |No outcome |
|Exposed   |a           |b          |
|Unexposed |c           |d          |
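In terms of the cell labels a, b, c, and d in this table, the odds ratio and the two risk ratios reported by PROC FREQ for a 2x2 table can be written as follows. The equation images from the original handout are not available here, so this is a standard statement of the definitions rather than a copy of the handout's own notation.

```latex
\[
\text{OR} = \frac{a/b}{c/d} = \frac{ad}{bc}
\]
\[
\text{RR}_{\text{column 1}} = \frac{a/(a+b)}{c/(c+d)},
\qquad
\text{RR}_{\text{column 2}} = \frac{b/(a+b)}{d/(c+d)}
\]
```

Because the rows and columns enter these formulas by position, the order of the rows and columns in the table (for example, with or without order=data) determines which ratio SAS reports as column 1 and which as column 2.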