Class Note By Examples
General information about SAS
SAS Institute web site:
SAS Online Document:
SAS Environment
The SAS environment mainly has three windows:
Program Editor
Log
Output
Normally, if PC SAS is installed on your computer, you work in the Windows environment. If your PC SAS is connected to Unix SAS through SAS/CONNECT, you can also work in the Unix SAS environment. Figure 2 shows the PC SAS interface.
If SAS Enterprise Guide (SAS/EG) is installed on your computer and uses a local server connected to your PC SAS, you are also working in the Windows environment. If your SAS/EG is connected to a Unix server through SAS/CONNECT, you can also work in the Unix SAS environment. Figure 3 shows the SAS Enterprise Guide interface.
For this SAS class, we use SAS OnDemand for Academics, which runs on a Unix server, so we work in the Unix SAS environment. Figure 1 illustrates the SAS environment.
In most cases, a SAS program is the same in the Windows and Unix environments, with two minor differences. First, directory paths use a backslash ('\') in Windows and a forward slash ('/') in Unix. Second, directory and file names are not case sensitive in Windows but are case sensitive in Unix.
[pic]
Figure 2: PC SAS Interface:
[pic]
Figure 3: SAS Enterprise Guide (SAS / EG) Interface:
[pic]
[pic]
Getting Started with the SAS Program
* In the following code, the stock values are random numbers;
* They do not reflect past, current or future stock prices;
data stocks;
input ticker $ price industry $;
cards;
ATT 55.25 TECH
LU 48.8 TECH
MSFT 67.87 TECH
PFS 45.9 PHAR
CPQ 28.6 TECH
MRK 72.43 PHAR
AHP 67.29 PHAR
JPM 51.93 FINAN
C 69.72 FINAN
FBF 48.65 FINAN
AOL 38.72 TECH
CSCO 32.64 TECH
PVN 37.4 FINAN
BMS 57.21 PHAR
JNJ 61.23 PHAR
;
run;
proc print data=stocks;
run;
***************************************************************************;
data stocks2;
length ticker $8 price 8 Industry $8;
format price 5.2;
ticker='ATT' ; price=55.25 ; Industry='TECH' ; output;
ticker='LU' ; price=48.8 ; Industry='TECH' ; output;
ticker='MSFT'; price=67.87 ; Industry='TECH' ; output;
ticker='PFS' ; price=45.9 ; Industry='PHAR' ; output;
ticker='CPQ' ; price=28.6 ; Industry='TECH' ; output;
ticker='MRK' ; price=72.43 ; Industry='PHAR' ; output;
ticker='AHP' ; price=67.29 ; Industry='PHAR' ; output;
ticker='JPM' ; price=51.93 ; Industry='FINAN'; output;
ticker='C' ; price=69.72 ; Industry='FINAN'; output;
ticker='FBF' ; price=48.65 ; Industry='FINAN'; output;
ticker='AOL' ; price=38.72 ; Industry='TECH' ; output;
ticker='CSCO'; price=32.64 ; Industry='TECH' ; output;
ticker='PVN' ; price=37.4 ; Industry='FINAN'; output;
ticker='BMS' ; price=57.21 ; Industry='PHAR' ; output;
ticker='JNJ' ; price=61.23 ; Industry='PHAR' ; output;
run;
proc print data=stocks2;
run;
***************************************************************************;
Libname dd '/courses/ddbf9765ba27fe300';
data work.stocks;
set dd.stocks;
run;
SAS Statement Syntax Review
There are three types of SAS statements:
Statements used only in DATA steps
Statements used only in PROC steps
Statements used anywhere (system options, LIBNAME statements, macro statements, etc.)
General rules
Usually begin with identifying keywords (such as: data, set, length, run, etc.)
Always end with a semicolon (;)
Not case sensitive
Note: In most cases, text in quotes is case sensitive.
On some operating systems, directory paths and file names are case sensitive.
SAS statements are free format.
They can begin and end in any column. Therefore you can indent the program to make it easy to read.
One statement can continue over several lines.
Blanks can be used to separate words. Special characters also separate words.
Comments can be added in the program.
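The rules above can be sketched in a short program. This is only an illustration: the dataset name free_format and its two data rows are made up. It shows a mixed-case keyword, a statement continued over several lines, indentation, and the two comment styles.

```sas
* A comment statement starts with an asterisk and ends with a semicolon;
/* A block comment is enclosed in slash-asterisk pairs */

DATA free_format;            /* keywords are not case sensitive */
   input ticker $
         price               /* one statement continued over three lines */
         industry $;
   datalines;
ATT 55.25 TECH
LU 48.8 TECH
;
run;

proc print data=free_format; run;   /* two statements on one line */
```

Indentation has no effect on how SAS reads the statements; it only makes the step structure easier to see.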
SAS step boundaries
A SAS step starts with one of the following keywords:
1. DATA statement
2. PROC statement
The end of a SAS step is identified by one of the following keywords:
1. RUN statement (for DATA steps and most procedures)
2. DATA statement
3. PROC statement
4. QUIT statement (for some procedures).
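As a small sketch of these boundaries (the dataset name step_demo is hypothetical), the DATA step below has no RUN statement of its own, so the following PROC statement ends it. PROC SQL is one of the procedures that stays active until QUIT.

```sas
data step_demo;              /* DATA begins a step */
   x = 1;
                             /* no RUN here: the next PROC ends the DATA step */
proc print data=step_demo;   /* PROC begins a new step */
run;                         /* RUN ends the PROC PRINT step */

proc sql;                    /* PROC SQL keeps running until QUIT */
   select count(*) as n_rows from step_demo;
quit;
```

Relying on the next DATA or PROC statement to close a step works, but an explicit RUN after every step is usually easier to read and debug.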
Rules for SAS data set and variable names:
Must start with a letter (A to Z) or (a to z) or an underscore ( _ )
Can be uppercase, lower case, or mixed-case.
In version 8 and later, names can be 1 to 32 characters long
Can contain letters, numbers, and underscores
No spaces or tabs in the middle
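A brief sketch of names that follow these rules. All names here are invented for illustration; the invalid ones appear only inside a comment so the program still runs.

```sas
data _stocks_2024;                         /* starts with an underscore: valid */
   Price_USD = 55.25;                      /* mixed case with an underscore */
   ticker1   = 'ATT';                      /* digits are allowed after the first character */
   a_name_with_exactly_32_chars_xyz = 1;   /* 32 characters: the maximum length */
run;

/* Invalid names, shown only as comments:
   2price       (starts with a digit)
   stock price  (contains a space)       */

proc contents data=_stocks_2024;
run;
```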
Variable Value:
A missing character value is represented by a blank (' ').
A character variable can be 1 to 32,767 (2**15 - 1) characters long. The default is 8 characters.
A missing numeric value is represented by a dot (.).
A numeric variable is stored in floating-point format in 3 to 8 bytes. The default is 8 bytes, which can hold about 15 to 16 significant digits.
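The following sketch, with made-up variable names and values, shows how lengths and missing values behave. The LENGTH statement sets the character lengths explicitly and stores small_num in 3 bytes, which is enough only for small integers.

```sas
data missing_demo;
   length short_name $3 long_name $20 small_num 3;
   input short_name $ long_name $ small_num big_num;
   datalines;
ATT Telephone 1 55.25
MSFT . . .
;
run;

/* In the second row MSFT is truncated to MSF because short_name is $3. */
/* The periods become a blank (character) and missing values (numeric). */
proc print data=missing_demo;
run;
```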
SAS Data Libraries
SAS accesses a file directory (a SAS data library) through a libref, loosely called a libname.
A libref refers to a directory, not a file. A directory may contain many SAS datasets.
SAS accesses an individual external file through a fileref, loosely called a filename. External files are not SAS datasets; they are text files, Excel files, and so on. One fileref refers to a single external file.
[pic]
A libref is defined by the following statement:
LIBNAME libref 'SAS-data-library' ;
Or LIBNAME libref 'full directory path' ;
Example: LIBNAME proj1 'c:\project1';
Libname dd '/courses/ddbf9765ba27fe300';
Rules for naming a libref
• Must begin with a letter or underscore
• The remaining characters are letters, numbers, or underscore
• Must be 8 characters or less
A specific SAS file (dataset) is referred to by a two-level SAS filename: Libref.filename
Example: dd.stocks refers to the SAS dataset /courses/ddbf9765ba27fe300/stocks.sas7bdat
Note: A two-level SAS filename (Libref.filename) refers only to SAS data, not to any other type of data such as text files.
Temporary libref "work":
For every SAS session, the SAS system creates a temporary directory (library) and automatically assigns the libref "work" to it. To refer to SAS data in the work library, use work.data_name or simply data_name. When the libref is omitted, the default libref is work.
The WORK library is deleted automatically when the SAS session is closed, so all SAS data in the WORK library is lost at that point.
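A minimal sketch of one-level and two-level names in the WORK library. The dataset name stocks_tmp is hypothetical.

```sas
data stocks_tmp;                     /* one-level name: stored as WORK.STOCKS_TMP */
   ticker = 'ATT';
   price  = 55.25;
run;

proc print data=work.stocks_tmp;     /* two-level name refers to the same dataset */
run;

proc datasets library=work nolist;   /* optional: delete it before the session ends */
   delete stocks_tmp;
quit;
```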
Referring to data files (non-SAS data):
SAS accesses an individual external file through a fileref. A fileref is defined by the following statement:
FILENAME fileref 'full directory path and filename' ;
Example: FILENAME stock '/courses/ddbf9765ba27fe300/STOCK.TXT';
Rules for naming a fileref: the rules are the same as the rules for a libref above.
Note: On UNIX operating systems, directory and file names are case sensitive.
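To try a fileref without depending on a course directory, the sketch below uses the TEMP device type, which asks SAS for a scratch file. The fileref name demo and the two text lines are assumptions made for illustration.

```sas
filename demo temp;              /* scratch external file; no path needed */

data _null_;                     /* write two lines of text to the external file */
   file demo;
   put 'ATT 55.25 TECH';
   put 'LU 48.8 TECH';
run;

data from_file;                  /* read the same file back through the fileref */
   infile demo;
   input ticker $ price industry $;
run;

filename demo clear;             /* release the fileref when done */
```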
Styles of Inputting Data into a SAS Dataset
List Input:
data stocks;
input ticker $ price Industry $;
cards;
ATT 55.25 TECH
LU 48.8 TECH
MSFT 67.87 TECH
PFS 45.9 PHAR
;
run;
Column Input: Note: The keywords CARDS and DATALINES are interchangeable.
data stocks;
input ticker $ 1-6 price 10-18 Industry $ 20-28;
datalines;
ATT      55.25     TECH
LU       48.8      TECH
MSFT     67.87     TECH
PFS      45.9      PHAR
;
run;
Formatted input:
data cust;
input Name $ @9 birthday date7. @20 amount comma5.;
format birthday date7.;
cards;
John    12SEP83    2,234
Smith   23JAN92    1,345
Bob     03APR85    4,234
Steve   08AUG88    6,924
;
run;
Reading the External File
* Refer to the file by complete path and filename. ;
data stocks;
infile '/courses/ddbf9765ba27fe300/STOCK.TXT';
input ticker $ price very_long_name $;
run;
* Refer to the file by Fileref. ;
A fileref (bb in the following example) is 1 to 8 characters long, starts with a letter or an underscore, and can contain letters, numbers, and underscores.
filename bb '/courses/ddbf9765ba27fe300/bill.txt';
data bill;
infile bb FIRSTOBS=2;
input fname $1-13 lname $14-25 ssn1 ssn2 ssn3 areacd phonenum $
@68 bal1 dollar8. @77 duedt yymmdd10. @88 billdt date9.;
run;
The advantages of using a fileref:
It is more convenient when you need to use the same file many times.
You can put all filerefs at the beginning of the program, which makes it easy to see which files the program needs and easy to modify them.
* Read delimited text file ;
filename dw '/courses/ddbf9765ba27fe300/dow_hist_comma.txt';
data dow_history;
informat date date9.;
format date date9.;
infile dw dlm=',';
input date open high low close volume adj_close ;
run;
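The same technique can be tried with in-stream data. In this sketch the two data rows are invented, not taken from dow_hist_comma.txt, and the DSD option is added so that two consecutive commas are read as a missing value.

```sas
data dow_demo;
   infile datalines dlm=',' dsd;   /* DSD: two commas in a row mean a missing value */
   informat date date9.;
   format date date9.;
   input date open high low close volume;
   datalines;
02JAN2001,10790.9,10797.0,10585.4,10646.2,1129400
03JAN2001,10637.4,,10581.1,10945.8,1880700
;
run;

proc print data=dow_demo;          /* high is missing in the second row */
run;
```

Without DSD, consecutive delimiters are treated as one, so the values after the empty field would shift into the wrong variables.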
Reading Raw Data Files in Depth
• List Input
In the real world, most raw data is stored in text files, so we use a fileref to refer to the raw data.
filename mydata '/courses/ddbf9765ba27fe300/STOCK.TXT';
data stocks;
infile mydata;
input ticker $ price very_long_name $;
run;
* The following is the content of STOCK.TXT;
ATT 55.25 TECH
LU 48.8 TECH
MSFT 67.87 TECH
PFS 45.9 PHAR
CPQ 28.6 TECH
MRK 72.43 PHAR
AHP 67.29 PHAR
JPM 51.93 FINAN
C 69.72 FINAN
FBF 48.65 FINAN
AOL 38.72 TECH
CSCO 32.64 TECH
PVN 37.4 FINAN
BMS 57.21 PHAR
JNJ 61.23 PHAR
List input is simple, but it has several restrictions:
1. Fields must be separated by at least one blank.
2. Fields must be read in order.
3. A missing numeric value must be represented by a period.
4. A missing character value cannot be left blank, because that causes a mismatch between variables and values; use a period as a placeholder.
5. Character values cannot contain embedded blanks.
6. The default length for character variables is 8 bytes, i.e. 8 characters. Longer values are truncated.
7. Data must be in standard character or numeric format.
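A small sketch of working within restrictions 3 and 6. The company names are made up, and underscores stand in for blanks because list input cannot read embedded blanks.

```sas
data list_demo;
   length company $20;             /* without this, company would be cut to 8 characters */
   input ticker $ price company $;
   datalines;
MSFT 67.87 Microsoft_Corp
MRK . Merck_and_Company
;
run;

proc print data=list_demo;         /* price is missing for MRK */
run;
```

Because the LENGTH statement comes first, company becomes the first variable in the dataset; the order of the INPUT statement still controls how the fields are read.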
• Column Input
If the data is arranged in fixed columns, column input offers several advantages.
filename mydata '/courses/ddbf9765ba27fe300/col_input.txt';
data col_input;
infile mydata;
input name $ 1-12 date $ 14-20 amount 23-26;
run;
The following is the data in col_input.txt:
/* Ruler
         1         2
123456789012345678901234567890 */
John Dell    12SEP83  2234
Smith Gold   23JAN92  1345
Bob Chen     03APR85  4234
Steve Chang  08AUG88  6924
Michael West 25APR79  3414
Nancy Brown  17JUN85  4938
The advantages of using column input:
• Character variables can be up to 32,767 (2**15 - 1) characters long; they are not limited to the default length of 8 bytes.
• Character values can contain embedded blanks.
• Fields can be read in any order.
• No placeholder is required for missing data.
• Parts of the input record can be skipped.
• Fields or parts of fields can be reread.
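Several of these advantages can be seen in one short sketch. The variable names first_initial and year are hypothetical, and the two rows follow the col_input.txt layout (name in columns 1-12, date in 14-20, amount in 23-26).

```sas
data col_demo;
   input amount 23-26            /* fields can be read in any order */
         name $ 1-12             /* embedded blanks are kept */
         first_initial $ 1-1     /* part of a field is reread */
         year 19-20;             /* only the last two columns of the date are read */
   datalines;
John Dell    12SEP83  2234
Bob Chen     03APR85
;
run;

proc print data=col_demo;        /* amount is missing for Bob Chen, with no placeholder */
run;
```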
• Formatted Input
Some data cannot be read correctly without the proper informat.
filename mydata '/courses/ddbf9765ba27fe300/fmt_input.txt';
data fmt_input;
infile mydata;
input name $ 1-12 @14 date date7. @23 amount dollar6.;
run;
* The following is the data in fmt_input.txt;
John Dell    12SEP83  $2,234
Smith Gold   23JAN92  $1,345
Bob Chen     03APR85  $4,234
Steve Chang  08AUG88  $6,924
Michael West 25APR79  $3,414
Nancy Brown  17JUN85  $4,938
The above program can also be written as follows:
filename mydata '/courses/ddbf9765ba27fe300/fmt_input.txt';
data fmt_input;
infile mydata;
input name $ 1-12 +1 date date7. +2 amount dollar6.;
run;
Features of formatted input:
• Formatted input reads data until it has read the number of columns indicated by the informat.
• There are two ways to control the position of the pointer: absolute (@n) and relative (+n).
• It can read data stored in nonstandard form.
• Informats can be specified as needed.
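A sketch that combines both kinds of pointer control in a single INPUT statement. The dataset name ptr_demo is made up; the rows use the fmt_input.txt layout with the date starting in column 14 and the amount in column 23.

```sas
data ptr_demo;
   input @1  name $12.           /* absolute: go to column 1, read 12 columns */
         +1  date date7.         /* relative: skip 1 column, so reading starts at column 14 */
         @23 amount dollar6.;    /* absolute: jump to column 23 */
   format date date9. amount dollar8.;
   datalines;
John Dell    12SEP83  $2,234
Nancy Brown  17JUN85  $4,938
;
run;

proc print data=ptr_demo;
run;
```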
Reading the Same Record Twice: Line-Hold Specifier @
data NYC;
input city $ 18-32 @;
if city='New York';
input FLNO 1-4 AirLine $6-18 city $ 18-32 Time $34-40;
cards;
9238 American     New York       10:00
4235 United       Philadelphia   5:00pm
798  Delta        New York       8:50
4824 North West   Houston        12:30pm
1639 South West   Chicago        4:15pm
5417 North West   New York       11:25
;
run;
Creating Multiple Observations from a Single Record: Double Trailing @
data Risk_Score;
input risk_level $ score @@;
cards;
H 580 L 800 M 680
M 690 L 780 H 620
H 610 M 685 L 795
;
run;
proc print data=Risk_Score;
Title 'Risk Level and Score';
run;
Important INFILE statement options:
LRECL=logical-record-length: Specifies the logical record length. The default is 256 in releases before SAS 9.4; in SAS 9.4 the default is 32767.
MISSOVER
Prevents an INPUT statement from reading a new input data record if it does not find values in the current input line for all the variables in the statement. When an INPUT statement reaches the end of the current input data record, variables without any values assigned are set to missing.
Use MISSOVER if the last field(s) may be missing and you want SAS to assign missing values to the corresponding variable.
TRUNCOVER
TRUNCOVER overrides the default behavior of the INPUT statement when an input data record is shorter than the INPUT statement expects. By default, the INPUT statement automatically reads the next input data record. TRUNCOVER enables you to read variable-length records when some records are shorter than the INPUT statement expects. Variables without any values assigned are set to missing.
Use TRUNCOVER to assign the contents of the input buffer to a variable when the field is shorter than expected.
EXPANDTABS
This option is needed when reading a tab-separated file. It expands each tab character to the next standard tab stop.
Example: LRECL=
* Read a text file line by line into a SAS dataset. The SAS default record length is 256 (before SAS 9.4). To read a text file with a record length greater than 256, the LRECL= option has to be used;
* Without LRECL= option;
data temp;
infile '/courses/ddbf9765ba27fe300/LRECL400.txt';
input line $ 1-400 ;
run;
options ls=100;
data _null_;
set temp;
if _n_=1 then put line;
run;
* With LRECL= option;
data temp;
infile '/courses/ddbf9765ba27fe300/LRECL800.txt' lrecl=400;
input line $ 1-400 ;
run;
options ls=100;
data _null_;
set temp;
if _n_=1 then put line;
run;
Example: MISSOVER
filename mmm '/courses/ddbf9765ba27fe300/stocks_missover.txt';
* No missover option, only 13 rows read into the sas data.;
data stocks;
infile mmm;
input ticker $ price industry $;
run;
filename mov '/courses/ddbf9765ba27fe300/stocks_missover.txt';
* With missover option, all rows are read into the sas data.;
data stocks;
infile mov missover;
input ticker $ price industry $;
run;
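The effect can also be reproduced with in-stream data. In this sketch the three rows are invented and the second row has no industry value.

```sas
data flow_demo;                    /* default behavior: INPUT continues on the next line */
   input ticker $ price industry $;
   datalines;
ATT 55.25 TECH
LU 48.8
MSFT 67.87 TECH
;
run;

data miss_demo;
   infile datalines missover;      /* short line: industry is set to missing */
   input ticker $ price industry $;
   datalines;
ATT 55.25 TECH
LU 48.8
MSFT 67.87 TECH
;
run;
```

In flow_demo the LU record takes its industry value from the start of the next line, so the MSFT row is lost and only two observations are created. In miss_demo all three rows are kept.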
Example: TRUNCOVER
* Read text file into sas dataset without truncover;
data temp;
infile '/courses/ddbf9765ba27fe300/Truncover_effect.txt';
input line $ 1-256 ;
run;
proc print data=temp;
run;
* Read text file into sas dataset with truncover;
data temp;
infile '/courses/ddbf9765ba27fe300/Truncover_effect.txt' truncover;
input line $ 1-256 ;
run;
proc print data=temp; run;
Note: If the data is part of the program, following the CARDS or DATALINES statement, an INFILE statement that names CARDS (or DATALINES) makes the INFILE options available:
data tt;
infile cards truncover;
input line $1-100;
cards;
Test data:
This is a text data with variable length of fields for a demo.
To show truncover effect.
;
run;
proc print data=tt; run;
Example: EXPANDTABS
*** Without EXPANDTABS option, data is read wrong;
data stock;
infile '/courses/ddbf9765ba27fe300/ExpandTab.txt';
input ticker $ price industry $;
run;
*** With EXPANDTABS option, data is read correctly;
data stock;
infile '/courses/ddbf9765ba27fe300/ExpandTab.txt' EXPANDTABS;
input ticker $ price industry $;
run;
Output to text file:
Example:
libname mydat '/courses/ddbf9765ba27fe300';
filename fmt_out '/courses/ddbf9765ba27fe300/fmt_input.txt';
data _null_;
set mydat.fmt_input;
file fmt_out;
put @1 name
@16 date date9.
@28 amount dollar6.;
run;
Example: Using single @ to hold line.
libname mydat '/courses/ddbf9765ba27fe300';
filename phar '/courses/ddbf9765ba27fe300/phar.txt';
data _null_;
set mydat.stocks2;
file phar;
if industry='PHAR';
put @1 ticker @;
put @10 price dollar8.2 @;
put @20 industry;
run;
More details on using the DATA _NULL_ / PUT method to generate reports will be discussed later.
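To experiment with PUT without creating a file, the output can be sent to the SAS log. This is a sketch with three made-up rows; FILE LOG replaces the fileref used in the examples above.

```sas
data _null_;
   input ticker $ price industry $;
   if industry = 'PHAR';                   /* subsetting IF: keep PHAR rows only */
   file log;                               /* send PUT output to the SAS log */
   put @1 ticker @;                        /* trailing @ holds the output line */
   put @10 price dollar8.2 @20 industry;   /* finishes the same line */
   datalines;
PFS 45.9 PHAR
ATT 55.25 TECH
MRK 72.43 PHAR
;
run;
```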
Read and Write Existing SAS Dataset
SAS data libraries
SAS accesses a file directory (a SAS data library) through a libref. A libref is defined by the following statement:
LIBNAME libref 'SAS-data-library' ;
Or LIBNAME libref 'full directory path' ;
Example: LIBNAME proj1 'c:\project1';
Rules for naming a libref
• Must begin with a letter or underscore
• The remaining characters are letters, numbers, or underscore
• Must be 8 characters or less
A specific SAS file (dataset) is referred to by a two-level SAS filename:
Libref.filename
Example:
To refer to the directory c:\temp, use the following statement:
Libname cc 'c:\temp';
To refer to the SAS dataset /courses/ddbf9765ba27fe300/stocks.sas7bdat, use:
Libname dd '/courses/ddbf9765ba27fe300';
Libname temp '/home/yihong1/temp';
Data temp.stocks; * output data;
Set dd.stocks; * input data;
Run;
Temporary libref "work":
For every SAS session, the SAS system creates a temporary directory (library) and automatically assigns the libref "work" to it. To refer to SAS data in the work library, use work.data_name or simply data_name. When the libref is omitted, the default libref is work.
The WORK library is deleted automatically when the SAS session is closed, so all SAS data in the WORK library is lost at that point.
Advanced topic:
1. Usually, you don't need to know the physical path of the work directory. If you want to find out where it is, use the following SAS code:
options ls=120;
%let work=%sysfunc(getoption(work));
%put &work;
2. If your PC or laptop is connected to a remote server, you can also assign a libref to a remote directory using the REMOTE engine.
Example: libname proj1 remote '/users/myid/mydata' server=servername;
End Advanced topic.
SAS data set name extension:
• For version 8 and above, the file extension is .sas7bdat on both Windows and Unix operating systems.
Note: When referring to a SAS data set, omit the file extension.
Examples:
To read the sas data: /courses/ddbf9765ba27fe300/stocks.sas7bdat into a work data work.stocks:
Libname cdat '/courses/ddbf9765ba27fe300';
data work.stocks; *** work. can be omitted;
*** Cannot use work.stocks.sas7bdat;
set cdat.stocks;
run;
To write the work.stocks to directory /courses/ddbf9765ba27fe300 and name it jk.sas7bdat:
Libname cdat '/courses/ddbf9765ba27fe300';
Data cdat.jk;
Set stocks;
Run;
To read the sas data: /courses/ddbf9765ba27fe300/stocks.sas7bdat and write the data to /home/yihong1/demo/stocks.sas7bdat:
Libname ind '/courses/ddbf9765ba27fe300';
Libname outd '/home/yihong1/demo';
Data outd.stocks;
Set ind.stocks;
Run;
Note: On UNIX system, the directory and filename are case sensitive.
To view all librefs that have been assigned in the SAS session:
libname _all_ list;
New features in version 8 and above for reading and writing data
• In SAS version 8 and later, SAS data can be accessed directly by its full path and filename, without a libref. The following is an example:
data stocks;
set '/courses/ddbf9765ba27fe300/stocks';
run;
The advantages of using a libref
Although the direct-path method is available, it is not widely used. Using a libref has the following advantages:
• When many SAS datasets are in the same directory, a short libref is more convenient than the full path.
• All librefs in a SAS program can be defined at the beginning of the program, which makes the program easier to read, maintain, and modify.
How to browse SAS data
• Using PROC PRINT
Proc print data=cdat.stocks; run; When you run this program, the data is listed in the output window.
If the data is too large, the following program prints only a small part of it:
Proc print data=cdat.stocks (obs=5); run; *** only print the first 5 observations;
Disadvantage: If the data has many variables, the output wraps around in the output window, which can be messy and difficult to read.
• Using the SAS Explorer
Click on the Explorer tab on the bottom left of the SAS window
Then double click the Libraries icon
Then double click the cdat library icon
Then double click the stocks data set icon. The SAS data then appears in a VIEWTABLE window, with horizontal and vertical scroll bars if the data is larger than the screen can show.
Note: You have to close this VIEWTABLE window before you can use this data again.
• Using the ViewTable command in the command window at the top left
In the command window at the top left, type: vt cdat.stocks then press Enter.
The SAS data will be shown in the VIEWTABLE window, the same as with the SAS Explorer.
• Using the FSView command in the command window at the top left
In the command window at the top left, type: fsv cdat.stocks then press Enter.
The SAS data will be shown in an FSVIEW window, with horizontal and vertical scroll bars if the data is larger than the screen can show.
SAS Data Set Terminology
SAS Data Set <----> SAS Table
Variable <----> Column
Observation <----> Row
To view the complete information about a SAS data set: Proc contents
Proc contents data=cdat.stocks;
Run;
• A SAS data set has two parts: the descriptor portion and the data portion. PROC CONTENTS gives you the descriptor portion of the data set.
PROC CONTENTS Example:
Libname cdat '/courses/ddbf9765ba27fe300';
PROC CONTENTS DATA=CDAT.STOCKS;
RUN;
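To see the two portions side by side without access to the course library, a small dataset can be built in WORK. The name stocks_demo and its single row are assumptions made for illustration.

```sas
data work.stocks_demo;
   length ticker $8 industry $8;
   ticker = 'ATT';
   price = 55.25;
   industry = 'TECH';
run;

proc contents data=work.stocks_demo varnum;   /* descriptor portion; VARNUM lists variables in creation order */
run;

proc print data=work.stocks_demo;             /* data portion */
run;
```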
Figure 1: Illustration of SAS Environment
Reading Data
Reading raw data, text file, or flat file
* Text data as part of the program;
Libname out 'c:\temp';
data stocks;
input ticker $ price Industry $;
* The keywords cards and datalines are interchangeable;
cards;
ATT 55.25 TECH
LU 48.8 TECH
MSFT 67.87 TECH
;
run;
* Text data is in a text file;
Libname out 'c:\temp';
data out.stocks;
infile 'c:\sas_class\classdata\stock.txt';
input ticker $ price very_long_name $;
run;
Reading SAS Dataset: *.sas7bdat
Libname in 'c:\temp';
Libname out 'c:\temp';
* Data step;
Data out.outdata;
set in.indata;
Run;
* Proc Step;
Proc contents
data=in.indata;
Run;
Proc SQL;
Select count(*)
From in.indata;
Quit;
List Input Column Input Formatted Input