Managing Tables in Microsoft SQL Server using SAS®

Managing Tables in Microsoft SQL Server using SAS?

Jason Chen, Kaiser Permanente, San Diego, CA Jon Javines, Kaiser Permanente, San Diego, CA Alan L Schepps, M.S., Kaiser Permanente, San Diego, CA Yuexin Cindy Chen, Kaiser Permanente, San Diego, CA

ABSTRACT

Getting SAS and Microsoft SQL Server to work together using ODBC connection has been well documented over the years. Interestingly, fewer publications have explored the option to use SAS/ACCESS interface to Microsoft SQL Server directly. In addition to the scarce resources on this topic, the syntax used in the SQL pass-through facility is different than the ones used in PROC SQL. This makes the task of managing tables in SQL Server difficult even for advanced SAS users. As organizations and their technologies evolving rapidly, there are also times that a SAS user would face a situation when an ODBC connection is not possible or performs below acceptable level.

The primary objective of this paper is to provide a straight forward presentation of basic operations to help SAS users manage tables located on a SQL Server without the need of establishing ODBC connection. A secondary objective is to provide a smooth learning curve for SAS users whom are new to SQL pass-through facility. The examples included will provide SAS syntax to create or delete a table in a SQL server, add or remove a column from a SQL table, and update a SQL table with a SAS dataset based on a primary key. Variables used in the examples include numeric integers, numeric values with decimals, text strings, and datetime variables. A comprehensive and convenient macro is provided under the "ADVANCED" section to help SAS users to easily load SAS datasets into SQL tables.

The authors' affiliated department has its SAS Server located in a UNIX environment. In addition to the use of SQL Server Management Studio (SSMS), SAS is employed to schedule batch jobs, to update tables in the SQL Servers, and to perform routine backup of SQL tables.

INTRODUCTION

Microsoft SQL Server is a popular choice of hosting large relational databases in many organizations, and SAS is often utilized for more in-depth data manipulation and analysis. In addition to the power for analytics, there are other advantages of using SAS, and this paper will focus on using SAS to manage tables in SQL Servers. Examples of basic data manipulation and updating SQL tables with SAS datasets will be provided.

Here is a list of basic operations covered in this paper:

1.

Create a table in SQL Server

2.

Add records to a table

3.

Delete a record from a table

4.

Add new columns to a table

5.

Update a SQL table using another SQL table

6.

Modify or delete columns from a table

7.

Make a copy of a table in SQL Server

8.

Delete a table from SQL Server

9.

Append a SQL table to another SQL table

10.

Read SQL tables into a SAS dataset

An advanced section near the end will demonstrate the steps needed to update a SQL table with a SAS dataset based on a unique primary key. Its corresponding ready-to-use SAS Macro is available in the "ADVANCED" section.

1

Managing Tables in Microsoft SQL Server using SAS, continued

SOFTWARE VERSION

SAS Software SAS/ACCESS SQL Server SQL Server Management Studio (SSMS) UNIX

Version 9.4 Version 9.4 Version 2012 Version 11.0.2100.60 AIX 7.1

BASICS

Here is an actual but simplified scenario that this paper will use to demonstrate concept: The SAS dataset WORK.SurgProc_SAS shown below is extracted from a Teradata Server. This information needs to be transferred to a table SurgProc_SQL in a SQL Server for storage. Routine update to this table using SAS is desired as newer data becomes available in Teradata.

/*Two records in the initial SAS data extracted from Teradata*/

Data WORK.SurgProc_SAS;

Length MRN Opdate OpStartTime OpEndTime BMI 8. PxCode $10 PxDescription $255;

format Opdate mmddyy10. OpStartTime OpEndTime time5. BMI 8.1;

Label

MRN

="Medical Record Number"

Opdate

="Operation Date"

OpStartTime ="Operation Start Time"

OpEndTime ="Operation End Time"

BMI

="Body Mass Index of the Patient"

PxCode

="ICD-9 procedure code"

PxDescription ="ICD-9 procedure code description";

MRN=9090909;

OpDate="01JAN2015"d; OpStartTime="13:22"t;

OpEndTime="15:01"t; BMI=35.2;

PxCode="";

PxDescription="";

Output;

MRN=5252525;

OpDate="16JAN2015"d; OpStartTime="10:00"t;

OpEndTime="11:12"t; BMI=28.9;

PxCode="81.54";

PxDescription="TOTAL KNEE REPLACEMENT";

Output;

Run;

0.

TEST SAS TO SQL CONNECTION

It is important to first check and establish a connection to the SQL Server using SAS:

%let dsrc="SQL_Database_name_here"; %let schema=XXXXXXX; %*Likely to be DBO, but other values are possible; %*If schema contains special characters, %*Use double-quote to enclose the schema: e.g. "CS\ABC"; %let UID=XXXXXXXXXX; %*User Name; %let PWD=XXXXXXXXXX; %*Password;

%*0. TEST SAS TO SQL CONNECTION; Libname SQL sqlsvr datasrc=&dsrc. Schema=&schema. User=&UID. Password=&PWD.;

If the connection was not successful, please run PROC SETINIT; and check if "SAS/ACCESS Interface to Microsoft SQL Server" is licensed, then work with the SQL Server administrator to troubleshoot.

Using the LIBNAME statement is known as the implicit pass-through (Werner, 2014), which is easy to code since it uses common SAS syntax with some restrictions. This paper, however, will focus on explicit pass-through since it is more powerful especially when joining large tables within the relational database. The syntax required for the explicit pass-through is Transact-SQL which is native to the SQL Server. This means a whole new door has been opened, because now users can use SAS to run features exclusive to another environment (Riley, 2008).

2

Managing Tables in Microsoft SQL Server using SAS, continued

1.

CREATE A TABLE IN SQL SERVER

Assume that a table SurgProc_SQL should be created in the SQL Server following the same data structure as the SAS dataset WORK.SurgProc_SAS, code like this can be used:

%*1. CREATE A TABLE IN SQL SERVER;

proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.);

execute (CREATE TABLE &schema..SurgProc_SQL

(

MRN

int,

OpDate

datetime,

OpStartTime

datetime,

OpEndTime

datetime,

BMI

float,

PxCode

varchar(10),

PxDescription

varchar(255)

CONSTRAINT

SurgProc_PK01

PRIMARY KEY (MRN)

)

) by mydb;

disconnect from mydb;

quit;

Please note the following:

? LIBNAME statement is not needed here. Explicit SQL pass-through facility uses "connect to SQLSVR" statement to establish a connection to the Microsoft SQL Server.

? For easy management of data type, it is advised to practice the following:

o SAS numeric variables without decimal can be saved as "int" (integer) in SQL unless its value can be more than 2147483647.

o SAS numeric variables with decimals can be saved as "float".

o SAS numeric date, or time variables are best converted into datetime value as text before loaded into SQL as "datetime" (See Section 2.1)

o SAS character variable can be saved as "varchar(variable length)"

o It is best practice to specify a primary key that is unique in the relational database. It can be done in SAS by providing the "CONSTRAINT".

2.

ADD RECORDS TO A TABLE

Here is a code to add the records from SAS dataset WORK.SurgProc_SAS to the SQL table SurgProc_SQL. It may seem longer than necessary, but it allows the user more control over the content that will be loaded.

The advantages of loading data this way are:

? Datetime columns are always handled correctly. No need to worry about SAS and SQL having different reference dates, or differences in handling of time variables.

? Decimals are loaded as text strings to obtain desired level of precision.

? NULL values and empty space are differentiated in character variables.

%*2. ADD RECORDS TO A TABLE;

%*2.1 Convert date and time fields to datetime value as text in SAS;

Data WORK.SurgProc_SAS_Staging (drop=Opdate_n OpStartTime_n OpEndTime_n);

set SurgProc_SAS

(rename=(Opdate=Opdate_n OpStartTime=OpStartTime_n OpEndTime=OpEndTime_n));

Opdate

= put(opdate_n,yymmdd10.)||" 00:00:00";

OpStartTime = put(opdate_n,yymmdd10.)||" "||put(OpStartTime_n,time.);

OpEndTime = put(opdate_n,yymmdd10.)||" "||put(OpEndTime_n,time.);

run;

3

Managing Tables in Microsoft SQL Server using SAS, continued

%*2.2 Macro to loop records to load them in one at a time;

%macro Add_Records(

From_SAS= /*Name of the SAS dataset*/,

To_SQL=

/*Name of the SQL table */);

Data _NULL_;

set &FROM_SAS.;

i=strip(put(_N_,8.));

call symput ("REC_cnt", strip(i));

run;

%DO i=1 %to &REC_cnt.; %*Manage the exact values to load into SQL;

PROC TRANSPOSE DATA=&FROM_SAS.(firstobs=&i. obs=&i.) OUT=_Trans_ (rename=(_NAME_=VarName COL1=VarValue)); VAR _ALL_; RUN;

DATA _Trans_; length VarValue $255; set _Trans_; %*Handle missing values with NULL; if strip(VarValue)="" or strip(VarValue)="." then VarValue="NULL"; else VarValue=catt("'", strip(VarValue), "'"); RUN;

%*Add the record into SQL; Proc Sql noprint;

Select strip(VarName) into :VarName separated by ", "

from _Trans_; Select strip(VarValue) into :VarValue

separated by ", " from _Trans_; quit;

proc sql; connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute (insert into &schema..&To_SQL. (&varname.) values (&VarValue.) ) by mydb; disconnect from mydb; quit;

%end; %mend Add_Records; %Add_Records(From_SAS=WORK.SurgProc_SAS_Staging, To_SQL=SurgProc_SQL);

3.

DELETE A RECORD FROM A TABLE

If the record to delete can be identified by a column in the SQL table, please try this:

%*3. DELETE A RECORD FROM A TABLE; %*3.1 Delete a record based on a value in a column; proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute ( DELETE

from &schema..SurgProc_SQL where MRN=9090909 ) by mydb; disconnect from mydb; quit;

4

Managing Tables in Microsoft SQL Server using SAS, continued

In reality however, it is more likely that the deletion is based on another table. Assuming that another SQL table TO_EXCLUDE contains MRNs that should be removed from table SurgProc_SQL, the following code can be used:

%*3.2 Create table TO_EXCLUDE for demonstration;

proc sql;

Connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.);

execute (CREATE TABLE &schema..TO_EXCLUDE

(MRN int)

) by mydb;

execute (insert into &schema..TO_EXCLUDE

(MRN) values (5252525)

) by mydb;

disconnect from mydb;

quit;

%*3.3 Delete a record based on value in another table; proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute ( DELETE

from &schema..SurgProc_SQL where MRN in (select MRN from &schema..TO_EXCLUDE) ) by mydb; disconnect from mydb; quit;

4.

ADD NEW COLUMNS TO A TABLE

Assume that patient age and gender are new data elements needed in the SQL table, this code can be used to add them to the existing table SurgProc_SQL:

%*4. ADD NEW COLUMNS TO A TABLE;

%*4.1 If you have been following this instruction and have ran codes listed under

Section 3.1 and 3.3, table SurgProc_SQL is likely to be empty. Rerun this macro shown

in Section 2.2 to add the record back to table SurgProc_SQL;

%Add_Records(From_SAS=WORK.SurgProc_SAS_Staging, To_SQL=SurgProc_SQL);

%*4.2 Add new columns to a table;

proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.);

execute ( ALTER TABLE &schema..SurgProc_SQL

ADD Age

Int,

Gender

Varchar(1)

) by mydb;

disconnect from mydb;

quit;

5.

UPDATE A SQL TABLE USING ANOTHER SQL TABLE

Assume that age and gender are available by MRN in another SQL table SurgProc_Demog. This code can be used to update the table SurgProc_SQL by MRN.

%*5. UPDATE A SQL TABLE USING ANOTHER SQL TABLE;

%*5.1 Create table SurgProc_DEMOG for demonstration;

proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.);

execute (CREATE TABLE &schema..SurgProc_DEMOG

(

MRN

int,

Age

Int,

Gender

Varchar(1)

)

) by mydb;

execute (insert into &schema..SurgProc_DEMOG (MRN, Age, Gender)

values (5252525, 56, 'M'), (9090909, 67, 'F')

) by mydb;

disconnect from mydb;

quit;

5

Managing Tables in Microsoft SQL Server using SAS, continued

%*5.2 Update Age and Gender in table SurgProc_SQL using table SurgProc_DEMOG; proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute (UPDATE &schema..SurgProc_SQL

SET &schema..SurgProc_SQL.Age = &schema..SurgProc_DEMOG.Age, &schema..SurgProc_SQL.Gender = &schema..SurgProc_DEMOG.Gender

FROM &schema..SurgProc_SQL LEFT JOIN &schema..SurgProc_DEMOG ON (&schema..SurgProc_SQL.MRN = &schema..SurgProc_DEMOG.MRN)

) by mydb; disconnect from mydb; quit;

6.

MODIFY OR DELETE COLUMNS IN A TABLE

Assume that patient age should be captured with decimals, but it is currently defined as an integer in the SQL table. Also, assume that the gender variable length of one is too short. This code below can change age's property to float, and increase the length of gender to six in table SurgProc_SQL:

%*6. MODIFY OR DELETE COLUMNS IN A TABLE;

%*6.1 Modify column property;

proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.);

execute ( ALTER TABLE &schema..SurgProc_SQL

ALTER COLUMN Age

float

ALTER TABLE &schema..SurgProc_SQL

ALTER COLUMN Gender Varchar(6)

) by mydb;

disconnect from mydb;

quit;

Here is the code in case patient age and gender are no longer needed in the SQL table; %*6.2 Delete columns from a table; proc sql; connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute ( ALTER TABLE &schema..SurgProc_SQL DROP COLUMN Age, Gender ) by mydb; disconnect from mydb; quit;

7.

MAKE A COPY OF A TABLE IN SQL SERVER

Making a backup copy of a SQL table can be accomplished easily in SAS as follows:

%*7. MAKE A COPY OF A TABLE IN SQL SERVER; %*7.1 Make an exact backup copy; proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute ( Select *

into &schema..SurgProc_SQL_BK from &schema..SurgProc_SQL ) by mydb; disconnect from mydb; quit;

6

Managing Tables in Microsoft SQL Server using SAS, continued

%*7.2 make a copy with structure only and no data; proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute (Select *

into &schema..SurgProc_SQL_BK2 from &schema..SurgProc_SQL where 0=1 ) by mydb; disconnect from mydb; quit;

8.

DELETE A TABLE FROM SQL SERVER

Assuming that the backup copy is no longer needed, here is how to delete it:

%*8. DELETE A TABLE FROM SQL SERVER; proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute (drop table &schema..SurgProc_SQL_BK

) by mydb; disconnect from mydb; quit;

9.

APPEND A SQL TABLE TO ANOTHER SQL TABLE

Assume that the table SurgProc_SQL_BK2 is created with no data (See Section 7.2), here is how records can be copied from the table SurgProc_SQL to the SurgProc_SQL_BK2:

%*9. APPEND A SQL TABLE TO ANOTHER SQL TABLE;

%let Varname=MRN, OpDate, OpStartTime, OpEndTime, BMI, PxCode, PxDescription; proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.); execute (insert into &schema..SurgProc_SQL_BK2 (&VarName.)

select &VarName. from &schema..SurgProc_SQL ) by mydb; disconnect from mydb; quit;

10. READ SQL TABLES INTO A SAS DATASET

Here is how two SQL tables can be joined and read into a SAS dataset using explicit pass-through. Please note that this method is a lot faster than using implicit pass-through via LIBNAME.

%*10. READ SQL TABLES INTO A SAS DATASET;

proc sql;

connect to SQLSVR as mydb (datasrc=&dsrc. user=&UID. password=&PWD.);

Create table WORK.SurgProc as

select *

from connection to mydb

(select S.MRN, S.Opdate, S.OpStartTime, S.OpEndTime,

D.Age, D.Gender, S.BMI, S.PxCode, S.PxDescription

from &schema..SurgProc_SQL

as S

left join &schema..SurgProc_DEMOG

as D on (S.MRN = D.MRN)

where S.Opdate > cast('01/15/2015' as datetime)

);

disconnect from mydb;

quit;

7

Managing Tables in Microsoft SQL Server using SAS, continued

ADVANCED

After the creation of SQL table SurgProc_SQL, assume that a month has elapsed and there are new records extracted by SAS from Teradata that need to be loaded into this table. Please notice that OpStartTime and OpEndTime are in datatime format already for easier data load into SQL Server.

%*ADVANCED: New data becomes available a month later: first record MRN=9090909 was

extracted again, but with some of the information corrected. Second record MRN=7373737

is brand new and need to add to the SQL table;

Data WORK.Procedure2;

MRN=9090909;

OpDate="01JAN2015"d;

OpStartTime="01JAN2015 09:12"dt; OpEndTime="01JAN2015 10:31"dt;

BMI=35.0;

PxCode="81.54";

PxDescription="TOTAL KNEE REPLACEMENT";

output;

MRN=7373737;

OpDate="24JAN2015"d;

OpStartTime="24JAN2015 07:44"dt; OpEndTime="24JAN2015 09:20"dt;

BMI=22.1;

PxCode="81.54";

PxDescription="TOTAL KNEE REPLACEMENT";

output;

format Opdate mmddyy10. OpStartTime OpEndTime datetime.;

Run;

Here is a list of tasks needed to accomplish this using the basics learned earlier:

? Make a backup copy of the table SurgProc_SQL and name it SurgProc_SQL_BK (See Section 6.1)

? Create a template table _tmp_ in SQL to host the new data. (See Section 1 or Section 6.2)

? Add the records from SAS dataset WORK.Procedure2 to SQL table _tmp_ after converting date and time. (See Section 2.1)

? Delete any record from table SurgProc_SQL table in SQL if the MRN (primary key) also exists in the newly loaded table _tmp_. (See Section 3.3)

? Append records from table _tmp_ to table SurgProc_SQL (See Section 9)

? Delete table _tmp_ since it is no longer needed. (See Section 8)

? View a list of new MRNs added to table SurgProc_SQL using table SurgProc_SQL_BK as reference. (See Section 10)

These tasks are all embedded in the macro below. Once macro parameters are provided, it should conveniently load a given SAS dataset into an assigned table in the Microsoft SQL Server. (NOTE: Please copy the macro below separately by page. This will avoid an error message such as "[Error] Failed to transcode data from U_UTF8_CD to U_LATIN1_CE...." or accidentally including the header and footer within this macro.)

%macro SAS_to_SQL( From=/*Name of the SAS dataset*/, TO= /*Name of the SQL table*/, Key= /*Name of the unique primary key in SQL table*/);

%*ENTER SQL Server configuration here; %*let dsrc="SQL_Database_name_here"; %*let schema=XXXXXXX; %*let UID=XXXXXXXXXX; %*let PWD=XXXXXXXXXX;

%*Establish a LIBNAME Connection for %sysfunc(exist()) functions (See Section 0); libname SQL sqlsvr datasrc=&dsrc. schema=&schema. user=&UID. password=&PWD.;

8

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download