TAKING ADVANTAGE OF THE PROC SQL PASS-THROUGH FACILITY

328 Host Systems and Environments

TAKING ADVANTAGE OF THE PROC SQL PASS-THROUGH FACILITY

Timothy Pruitt Corporate Cost Management Inc.

INTRODUCTION

The SQL procedure in SAS Release 6.07 provides a useful window with which a user can pass SQL statements directly to a database management system (DBMS) for processing. This window is the Pass-Through facility available with PROC SQL and SAS/ACCESS software under SAS 6.07.

The Pass-Through facility provides the tools needed to easily and quickly create, access and manipulate data found in the relational objects of a DBMS. Relational data structures can be accessed in their natural DBMS environment directly and efficiently and can be easily retrieved into SAS data structures, in the form of PROC SQL tables and views, for further processing.

The Pass-Through facility offers the ability to assign tasks to the DBMS, using SQL, or to SAS based on the strengths of each and the nature of the data.

This is a brief overview of some of the basic techniques and terminology associated with the Pass-Through facility and a gnide to getting started and taking advantage of the facility's tools.

CONCEPTSITERNITNOLOGY

It would be safe to assume that any user that wishes to use the Pass-Through facility already has a basic knowledge of SQL and relational data. However, a brief overview of some of the terms and concepts that are associated with relational data and are essential to the Pass-Through facility is provided.

Table - A group of rows that have the same columns. The rows of a table can be thought of as corresponding to the observations in a SAS data set. The columns therefore would be like the variables in a SAS data set. A row of a table is the smallest unit that can be deleted or inserted into a table.

View - A virtual table that has no physical storage. The view merely represents the data that is physically stored in tables. An example would be a view that provides a

subset of a table by limiting the rows of a table. In essence only the subset criteria that limits the rows of a table is defined as a view. When the view is requested, the criteria is applied to a table according to the view definition and the result appears as if the view was a physical table.

Index - An?ordered set of pointers to the data in a DBMS table.

These are some of the key PROC SQL statements that are used when communicating with a DBMS via the PassThrough facility.

CONNECT statement - Establishes a connection with a DBMS supported by the SAS/ACCESS software. The connection allows SQL statements to be sent to or received from a DBMS. The CONNECT statement is usually associated with a PROC SQL SELECT statement.

Note: The CONNECT statement is optional with some interfaces. In these cases a connection is automatically made once an EXECUTE or CONNECTION TO statement is issued.

DISCONNECT statement - Ends a connection with a DBMS. .

Note: If a DISCONNECT statement is not issued, the connection is automatically broken when PROC SQL ends.

CONNECTION TO statement - Enables queries of DBMS data to be used in PROC SQL queries and views.

EXECUTE statement - Allows DBMS SQL statements that are supported dynamically by the DBMS to be sent to the DBMS. Listed are some of the statements commonly passed with the EXECUTE statement.

DELETE - Removes rows from DBMS tables.

DROP - Removes DBMS views, tables, and indexes.

NESUG '93 Proceedings

Host Systems and Environments 329

INSERT - Adds rows to a DBMS table. CREATE TABLE - Creates a DBMS table. CREATE VIEW - Creates a DBMS view. CREATE INDEX - Creates an index on one or more columns of a DBMS table.

Example 1 PROC SQL; CONNECT TO DB2 (SSID=DB22); EXECUTE ( DELETE FROM TABLE) WHERE STATUS = 4 ) BY DB2; DISCONNECT FROM DB2;

QUIT;

In example 1 we are connecting to DB2 and specifying the DB2 subsystem DB22. Next an execute statement is issued to carry out a delete command in DB2 that removes all rows from the table TABLE 1 where the column called STATUS is equal to 4. The user is then disconnected from DB2. The QUIT command ends PROC SQL.

%PUT &sQLXMSG;

Example 2 starts off in typical fashion by invoking the SQL procedure and connecting to DB2 with the appropriate statements. Next we issue a PROC SQL query and specify the columns IDNUM and DEPTCODE to be l!elected from the DBMS. The FROM CONNECTION TO DB2 statement indicates that we are feeding the PROC SQL query with a DBMS query. The DBMS query follows as a SELECT statement with columns that correspond to the columns specified in the PROC SQL query. The DBMS query is accessing the DBMS table USER. EMPLOYEE as indicated by the FROM clause and the WHERE clause subsets the data.

The statement ? %PUT &SQLXMSG;' is a useful SQL debugging tool. It displays the SAS macro variable SQLXMSG containing a description of the return code generated by the DBMS indicating the source of any errors. Also available is the macro variable &SQLXRC containing the return code alone.

PUTTING THE PASS-THROUGH TO WORK

Creating an SQL Table from a DBMS Table

A good first step in using the Pass-Through facility is to simply access a DBMS table and use it to create a PROC SQL table. The PROC SQL table would then be available to SAS data steps and procedures as a SAS data set.

Example 2

PROC SQL; CONNECT TO DB2 (SSID=DB22); SELECT

IDNUM, DEPTCODE

FROM CONNECTION TO DB2 (SELECT

IDNUM, DEPTCODE FROM USER. EMPLOYEE

WHERE AGE < 65 );

Example 3

PROC SQL NOPRlNT; CONNECT TO DB2 (SSID=DB22); CREATE TABLE NEW. DATA AS

SELECT *

FROM CONNECTION TO DB2 (SELECT '"

FROM USER.PAYROLL WHERE STATUS = "ACTIVE" );

NESUG '93 Proceedings

330 Host Systems and Environments

%PUT &SQLXMSG;

DATA XYZ1.DATA; SET NEW.DATA; ....

In Example 3 we are connecting to DB2 to select all of the columns in the DBMS table USER.PAYROLL where the column STATUS is equal to the string" ACTIVE". The result set creates the PROC SQL table NEW.DATA. The table NEW.DATA can then be used in a SAS data step for further processing.

Notice that in example 3 the WHERE clause is issued in the DB2 query, inside the FROM CONNECTION TO DB2 clause, rather than in the PROC SQL query. The WHERE clause is valid in either place but by issuing it in DB2, performance is enhanced by limiting the number of rows "piped" through the Pass-Through window for SAS to process. Selecting rows with the WHERE clause is a job best handled up front by the DBMS to eliminate unnecessary rows before they get to SAS.

It is important to understand the difference between relational objects ( tables, views, indexes) that exist in PROC SQL and relational objects that exist in a DBMS. In example 3 the table USER.PAYROLL is a DB2 table that is accessed using the Pass-Through facility under PROC SQL. USER.PAYROLL does not exist in PROC SQL but is made available by the Pass-Through facility. The table NEW. DATA on the other hand only exists in PROC SQL.

A useful analogy is to think of the Pass-Through facility as a window between SAS and a DBMS allowing SAS data structures to be created from a DBMS. However, the flow of data through this window can go the other way as well. It is also possible to create DBMS data structures from SAS data.

Creating a DBMS Table from SAS

There are two steps needed to create a DBMS table using the Pass-Through facility. First, a DBMS table must be initially set up by issuing a CREATE Table command from withina Pass-Through facility EXECUTE statement. Next, the DBMS table is populated by issuing an INSERT command with column values from within a Pass-Through EXECUTE statement.

Example 4

PROC SQL; CONNECT TO DB2 (SSID=DB22); EXECUTE (CREATE TABLE BOOK_STOCK

( TITLE CHAR(18), AUTHOR CHAR(18),

STOCK_NO ? BY DB2;

%PUT &SQLXMSG;

EXECUTE (INSERT INTO BOOK_STOCK ( TITLE, AUTHOR, STOCK_NO)

VALUES( 'CHRISTINE', 'KING', 100001 ?

BY DB2;

EXECUTE (INSERT INTO BOOK_STOCK VALUES( 'TOMMYKNOCKERS', 'KING', 100002? BY DB2;

EXECUTE ( INSERT INTO BOOK_STOCK

VALUES( 'BORN FREE', 'ADAMSON', 100003 ?

BY DB2;

In example 4 a connection is made to the DB2 DBMS and a DB2 table called BOOK_STOCK is created with the columns TITLE, AUTHOR, and STOCK_NO. Notice that in the creation statement a character string format is specified for the TITLE and AUTHOR columns while the STOCK_NO column uses the default format to handle its numeric values. Once the table is created in the DBMS it can be loaded with data using the INSERT command. In the first INSERT statement, notice that the columns are specified before the VALUES statement but in the other two insertions the columns are not specified. This is because the DBMS automatically inserts columns in the order that they exist in the table when the columns are not specified in the INSERT statement.

Creating Views

Creating views is much like creating tables. As stated earlier, views are logical representations of tables and do not contain any physical data in the form of rows and

NESUG '93 Proceedings

Host Systems and Environments 331

columns. However when a view is requested, the result appears as if it were a table. Since there is no physical storage associated with a view, they are particularly useful when storage is a concern.

Example 5

PROC SQL NOPRINT; CONNECT TO DB2 (SSID=DB22);

CREATE VIEW RETIRED AS SELECT *

FROM CONNECTION TO DB2

(SELECT *

FROM BASE.EMPLOYEE

WHERE AGE > 65 OR

STATUS = "R" );

In example 5 the PROC SQL view called RETIRED is created using all of the columns in the DBMS table BASE.EMPLOYEE that meet the criteria specified in the WHERE clause. Notice that the view RETIRED is an SQL view, not a DBMS view. Once the PROC SQL view is defined, it can be accessed by SAS data steps and procedures.

Example 6

PROC SQL; CONNECT TO DB2 (SSID=DB22); EXECUTE( CREATE VIEW STOCK.HIEND93 AS

SELECT * FROM STOCK93.TEXTlLES

WHERE PRICE > 2500) BY DB2;

example 6 the DBMS view STOCK.HIEND93 is created by selecting all the columns from the DBMS table STOCK93.TEXTILES where the column PRICE is greater than 2500.

Using Colwnn Functions

Another powerful feature of the Pass-Through facility is the ability to do calculations against the columns of a DBMS table or view and store the results in the columns of a PROC SQL table or view. Calculations are performed in the DBMS environment by column functions ( also known as aggregate functions) and the results are passed as columns to PROC SQL. The column functions currently available in DB2 produce a single numeric value as a result and are defined as follows:

AVG COUNT MAX MIN SUM

The average of the values in the column.

The number of values in the column.

The largest value in the column.

The Smallest value in the column.

The sum of the values in the column.

The benefit of using column functions with the PassThrough facility is the ability to perform calculations and create a PROC SQL data structure in one step, eliminating the need for further processing in SAS data steps and procedures.

Views can also be created in a DBMS from within the EXECUTE statement of the Pass-Through facility. A DBMS view can only be created from DBMS tables or DBMS views, not from PROC SQL tables or views. In

NESUG '93 Proceedings

332 Host Systems and Environments

Example 7

PROC SQL NOPRINT; CONNECT TO DB2 (SSID=DB22); (CREATE TABLE SALES AS SELECT

EMP_ID, EMP_NAME, EXPRESSN AS NUMSALES, EXPRESSO AS TOTSALES FROM CONNECTION TO DB2 (SELECT

EMPLOYEE_NUM, EMPLOYEE_NAME, COUNT(*), SUM(SALE_AMOUNn FROM SALES.QUARTER2 WHERE LOCATION = "R2" GROUP BY EMPLOYEE_NUM);

In Example 7 a connection is made to DB2 to create the

PROC SQL table SALES from the DB2 table

SALES. QUARTER2.

Selected from

SALES.QUARTER2are the columns EMPLOYEE_NUM

and EMPLOYEE NAME as well as the results of the

column functions COUNT(*) and

SUM(SALE_AMOUNT). Since the GROUP BY clause

is used to group the rows according to the value of the

column EMPLOYEE_NUM, the column functions are

done for each group. The results are passed to PROC

SQL as the colunms EXPRESSN and EXPRESSO which

are saved in the table SALES as the columns

NUMSALES and TOTSALES respectively as indicated by

the alias clause AS.

Note: The results of column functions performed by a DBMS are passed to PROC SQL as the following column names in the order are listed:

EXPRESSN EXPRESSO EXPRESSl EXPRESS2 EXPRESS3 etc.

Exploiting The Link

Using the Pass-Through facility also offers a peculiar feature to exploit when creating PROC SQL tables. It is no secret to many SAS users that issuing the WHERE clause in SAS code while limiting observations with the OBS = option is forbidden. Likewise in a DBMS the only way to restrict the rows selected by an SQL query is with a WHERE clause. However, the Pass-Through facility offers the ability to use the WHERE clause and the OBS= option at the same time.

Example 9

OPTIONS OBS=50;

PROC SQL NOPRINT; CONNECT TO DB2 (SSID=DB22); CREATE TABLE RESULTl AS SELECT *

FROM CONNECTION TO DB2 (SELECT * FROM PROCESS.JULY

WHERE DISTRICT = 51);

%PUT &SQLXMSG;

In Example 9 the PROC SQL table RESULTl is created by selecting all of the columns from the DB2 table PROCESS.JULY. The WHERE clause is used to select only the rows that have the value 51 in the column DISTRICT. Notice that the SAS option OBS=50 is in effect and therefore the table RESULT will contain a maximum of 50 rows.

To understand how the WHERE clause can exist with the OBS= option, again it is important to know what exists in SAS and what exists in the DBMS and which processes are being carried out by each. In Example 9 the WHERE clause is being applied by DB2, invisible to SAS, and the result is being passed to SAS where the OBS= option is limiting the number of rows/observations in the SAS table/data set RESULT!. The net effect is that both methods of selecting rows are being used simultaneously to perform a task that neither SAS nor DB2 could do easily on their own.

NFSUG '93 Proceedings

CONCLUSION

In conclusion, the Pass-Through facility provides the tools necessary to access and manipulate DBMS relational data from within SAS. The Pass-Through facility is a powerful window between SAS and a DBMS that promotes the quick and easy flow of data to and from a DBMS. With a basic knowledge of SQL and a few PassThrough commands, a SAS user can quickly and efficiently bridge the gap between SAS and the relational objects of a DBMS.

SAS and SASIACCESS are registered trademarks of SAS Institute Inc., Cary, NC.

DB2 is a registered trademark of International Business Machines Corporation.

REFERENCES

1. SAS Technical Report P-221. SAS/ACCESS Software: Changes and Enhancements. Release 6.07, SAS Institute Inc., Cary, NC.

2. SAS Guide to the SOL Procedure. Usage and

Reference. Version 6. First Edition, SAS Institute Inc., Cary, NC.

3. SAS/ACCESS Interface to DB2. Usage and Reference. Version 6. First Edition, SAS Institute Inc., Cary, NC.

4. Renade, J., Sehgal, M., Elkind, P., & Grossman, J. (1991). DB2 - Concepts. Programming. and Design. New York: McGraw Hill, Inc.

5. Date, C. J., White, C. J. (1988). A Guide to DB2 Second Edition. New York: Addison - Wesley.

CONTACT INFORMATION

Timothy Pruitt Corporate Cost Management Inc. 1300 Piccard Drive Rockville, MD 20850

Host Systems and Environments 333 NFSUG '93 Proceedings

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download