Integrating SAS and Excel: An overview and comparison of ...

[Pages:14]NESUG 2012

Foundations and Fundamentals

Integrating SAS and Excel: an Overview and Comparison of Three Methods for Using SAS? to Create and Access Data in Excel

Nathan Clausen, U.S. Bureau of Labor Statistics, Washington, DC Edmond Cheng, U.S. Bureau of Labor Statistics, Washington, DC

ABSTRACT

This paper looks at several ways integrating Excel with SAS. While SAS is powerful data manipulation and analysis tool, there are times when it may be better to view and interact with data in Excel. SAS easily facilitates this interaction in several ways. Only a basic knowledge of SAS is assumed, with references to more in-depth SAS resources provided.

First, we look at three methods for creating/consuming Excel data in SAS: PROC EXPORT, ODS with the ExcelXP tagset, and the Excel libname engine. For each method, we give a brief overview of usage and an outline of features that can be advantageous depending on the use case.

Second, we outline a practical scenario for implementing each of the interaction methods, with some basic examples of the resulting solution. These include creating custom reports, distributing SAS data to non-SAS users, creating interactive Excel analysis tools, and incorporating Excel inputs and logic in SAS programs.

Finally, we summarize the three methods using a feature comparison matrix to highlight the strengths and weaknesses of each method such as the ability to customize output, data integration, effort to create/maintain code, platform limitations, and technical considerations.

INTRODUCTION

SAS/BASE comes with a range of methods for working with data between SAS and Microsoft Excel. SAS users sometimes run into occasions when the data they are working with come and go in formats other than SAS datasets. These data might be ASCII files, some form of delimited text files, Microsoft files, XML, HTML, or variety types of proprietary database formats. Unless the customers have some knowledge of SAS, they most likely need their data in a format other than a SAS datasets. Or they might prefer the data be summarized in customized reports on a recurring basis. Although SAS is a powerful software package for running statistical procedures, performing routine data manipulation tasks, and handling large data, external customers will need a data format which they can easily access and perform independent analysis. For such cases, this paper focuses on a common format in most work environment: Microsoft Excel spreadsheets.

Many SAS user papers and SAS documents cover techniques for working with data beyond SAS data sets. This paper does not focus on one particular method, but rather compares at three different methods ? PROC IMPORT/EXPORT, ODS with ExcelXP tagset, and Excel LIBNAME. The first section provides an overview on each method. The next section demonstrates how the three methods are applicable in four common work scenarios. Users who are already familiar with working Excel and SAS can skip to the last section which shows a summary matrix on features and usages among the three methods.

OVERVIEW OF THE THREE METHODS

SAS software provides various methods in working with Microsoft Excel data. The following list identifies the common modes of interacting with Excel. The paper will go in details for the three methods that are highlighted.

?

PROC ACCESS, PROC IMPORT/EXPORT

?

ODS CSV, MSOFFICE2K, ExcelXP, HTML

?

SAS/ACCESS LIBNAME engine, SAS/ACCESS SQL Pass-Through facility, PROC DBLOAD

?

Dynamic Data Exchange (DDE)

METHOD 1 ? PROC EXPORT Description The EXPORT procedure is part of the SAS/ACCESS package. It reads a SAS data set and writes the data to an external file, including not only Excel files, but also ACCESS, delimited files (CSV), Lotus 1-2-3, and a number of other database formats.

1

NESUG 2012

Foundations and Fundamentals

Usage The EXPORT procedure consists of a single statement identifying the input data set, the output file, and some optional arguments. Options include whether or not to replace existing data when exporting the data (defaults to not replace), and whether or not to use variable labels or names (defaults to names). The EXPORT procedure has an option (DBMS) to specify the database identifier, identifying the format of the output file. For purposes of this analysis, the relevant values are either EXCEL or XLS. If you specify DBMS=EXCEL, the exported file will use either the current Excel file format (.xlsx) or the previous Excel file format (.xls) depending on the file name in the OUTFILE option.

The following code shows how to export the shoes data set to Excel, using variable labels as column headers:

PROC EXPORT DATA=togo DBMS=EXCEL OUTFILE='C:\path\t1.xlsx' REPLACE label;

RUN; It should be noted that while the DBMS=EXCEL setting can create either .xls or .xlsx files and the DBMS=XLS setting creates only .xls files, the XLS database identifier retains more formatting, such as dollar signs. It is also significantly faster.

Advantages and Limitations The EXPORT procedure is a quick and convenient method for exporting data from SAS into Excel files. It works both on Windows as well as UNIX environments, though only the XLS database identifier is supported with a standard installation.

METHOD 2 ? ODS EXCELXP TAGSET Description The ODS ExcelXP tagset leverages the SAS ODS to produce XML formatted output that can be opened and displayed in Excel. The customizability of the ODS tagset allows for a high level of precision in the layout and presentation of customized reports.

Usage To direct report output to the .xml file using the ODS ExcelXP tagset, simply open the ODS tagsets.excelxp destination and specify the output file name. There are a large number of options than can be specified in the ODS statement to control the output formatting. For example, the ,,sheet_name= option allows the user to specify the name of the spreadsheet in the workbook to which the output will be saved.

ods tagsets.excelxp file="C:\path\rpt.xml";

PROC PRINT... PROC REPORT...

ods tagsets.excelxp close;

Advantages and Limitations The ODS ExcelXP tagset is by far the most powerful of the options presented in this paper for generating customized reports in Excel. The ability to control layout, colors, and fonts as well as page orientation and print options gives the report designer a large range of tools for designing reports. And, because the output is XML, this method is platform independent. The XML file format does have some drawbacks, however. Code to create customized reports can quickly become quite complex, and the output is not compatible with further automated processing in Excel. (See )

METHOD 3 ? EXCEL LIBNAME ENGINE Description The Excel Libname Engine allows SAS to access Excel worksheets within an Excel file as if they were datasets within a SAS library.

Usage The Excel libname engine can be accessed by simply indicating an Excel file name in the physical path when assigning a library. The SAS log output from the libname statement indicates that the libname engine is EXCEL. For example:

LIBNAME ex "C:\path\L1.xlsx";

Advantages and Limitations Interacting with Excel through the EXCEL libname engine provides an easy way to read and create Excel data from SAS as if it were SAS data. However, it is not without some drawbacks. A minor drawback is that much formatting is lost as variables are converted either to numeric or general Excel for-

2

NESUG 2012

Foundations and Fundamentals

mats. A major drawback is that the engine does not allow for directly updating data in the Excel sheet. In order to change the content in an Excel file through the libname engine, a temporary dataset must first be created with the original and revised data values, the Excel table must then be deleted, and finally the revised data can be saved to the Excel libname.

FOUR PRACTICAL SCENARIOS

The typical interactions between SAS and Microsoft Excel are concentrated in three areas: data import/export, analysis, and reporting. Furthermore, the role of analysis and reporting could either be handled in SAS or Excel. This section presents examples of each interaction in work scenario. Each scenario will be solved by using one or more of the three methods ? PROC EXPORT, ExcelXP tagset, and Excel LIBNAME. With each methods unique functionalities and different limitations, this section shows how one might be more practical than others in various tasks.

The following examples use the PRSALE, PRSAL2, and PRSAL3 dataset available in SASHELP. The dataset contains two years of furniture sales by products and geographies. The multiple categorical dependent variables (country, region, division, prodtype, product), periodicity (year, quarter, month), and sales (actual, predicted) make this a good dataset for tabulation and reporting purposes.

SCENARIO 1 ? DISTRIBUTING DATA Lets begin with a simple, yet common, task of exporting data from SAS to Excel. In many work environments, users might not have access to the SAS software. This scenario addresses a situation where data needs to be exported from SAS as an Excel spreadsheet. Working with the SASHELP.PRSALE, PRSAL2, and PRSAL3 data sets, the task is to export every observation to multiple worksheets in one Excel file. The variable names stay the same and no special values formatting are necessary.

PROC EXPORT

In the PROC EXPORT procedure, the DATA parameter is set to the SAS data set name SASHELP.PRSALE, the DBMS engine is set to EXCEL, and the OUTFILE is set to the destination path and filename. Options such as LABEL, REPLACE, SHEET, and others could be used for additional, but limited, customization for the export.

PROC EXPORT DATA=sashelp.prdsale OUTFILE="C:\SAS\prdsale.xlsx" DBMS=EXCEL REPLACE;

SHEET="PRSALE"; RUN; PROC EXPORT DATA=sashelp.prdsal2

OUTFILE="C:\SAS\prdsale.xlsx" DBMS=EXCEL REPLACE; SHEET="PRDSAL2"; RUN; PROC EXPORT DATA=sashelp.prdsal3 OUTFILE="C:\SAS\prdsale.xlsx" DBMS=EXCEL REPLACE; SHEET="PRDSAL3";

RUN;

The procedure reads every variables and observations from the SAS data sets and writes to a native Excel spreadsheet. Including the REPLACE option overwrites existing Excel worksheet if the same SHEET name already exists in the workbook.

3

NESUG 2012

Foundations and Fundamentals

EXCELXP Tagset The ExcelXP Tagset method involves more coding compares to the two other methods. The method calls for ODS with ExcelXP tagset destination, and then uses a PROC PRINT procedure to insert each observation in the XML file. The ExcelXP tagset file needs to be loaded prior to the ODS statements in order for SAS to the XML markup language for outputs.

%INCLUDE 'C:\SAS\excltags.tpl';

ODS LISTING CLOSE; ODS tagsets.excelxp FILE="C:\SAS\prdsale_excelxp.xls"; ODS tagsets.excelxp options(sheet_name="PRDSALE");

PROC PRINT LABEL DATA=sashelp.prdsale NOOBS; RUN; ODS tagsets.excelxp options(sheet_name="PRDSAL2"); PROC PRINT LABEL DATA=sashelp.prdsal2 NOOBS; RUN; ODS tagsets.excelxp options(sheet_name="PRDSAL3"); PROC PRINT LABEL DATA=sashelp.prdsal3 NOOBS; RUN;

ODS tagSETs.excelxp CLOSE; ODS LISTING; Comparing to the PROC EXPORT procedure, the ODS ExcelXP tagset demands more CPU resources and real processing time. The output file took approximately 60 seconds and 33MB file size versus 5 seconds and 2MB with PROC EXPORT. In this example, the file is saved with the .XLS extension to make the XML file more recognizable for users to open in Microsoft Excel. Each time when the files opens, Excel will ask for a confirmation before opening the file.

Opening the prdsale_excelxp.xls spreadsheet, the three worksheets are formatted with SAS default style, as the code did not specify any style template in the ODS. The variable labels are used as column headers. The ACTUAL and PREDICT values are dollar formatted. The MONTH values are formatted to the abbreviated name of the month.

4

NESUG 2012

Foundations and Fundamentals

EXCEL LIBNAME Using the Excel LIBNAME engine for exporting data is much like the PROC EXPORT procedure. There are multiple ways of putting SAS data to Excel by setting an Excel workbook as a LIBNAME. This example assigns a new LIBNAME calls EXCELIB using the EXCEL engine directing to the Excel file path and workbook name. It uses the COPY function within the PROC DATASET procedure to copy PRDSALE, PRDSAL1, and PRDSAL2 data from SASHELP library to the Excel library. Lastly, the Excel LIBNAME statement is cleared, so the workbook is unlocked from SAS and accessible to other software.

LIBNAME excelib EXCEL "C:\SAS\prdsale_all.xlsx"; PROC COPY IN=sashelp OUT=excelib;

SELECT prdsale prdsal2 prdsal3; RUN; LIBNAME excelib CLEAR; Note: the Excel libname method does not allow overwriting existing data if the worksheet already contains data. This either requires re-creating the workbook, or deleting the content from the worksheet. The example below uses PROC DATASETS with DELETE function to remove the existing data. LIBNAME excelib EXCEL "C:\SAS\prdsale_all.xlsx"; PROC DATASETS LIB=excelib;

DELETE PRDSALE PRDSAL2 PRDSAL3; QUIT; LIBNAME excelib CLEAR; The exported spreadsheets look identical to the outputs from PROC EXPORT.

5

NESUG 2012

Foundations and Fundamentals

SCENARIO 2 ? CREATING CUSTOM REPORTS

Another common request of non-SAS users is displaying SAS results in customized reports. The second scenario shows how to use ODS ExcelXP tagset method to create highly customized Excel reports. SAS performs the tasks of data processing, running statistical models, formatting reports, and exporting the results in Excel format. Often times, SAS could be automated to generate these customized reports on a scheduled basis or in batches.

EXCEL Tagset

ODS with ExcelXP tagset destination has the ability of combining the flexibility of SAS ODS, built-in style templates, and customizable styling elements to generate visually pleasing outputs from SAS to Excel. One advantage of using ODS is that all formatting and layout are handled by SAS. The ExcelXP tagset includes the markup language to put SAS data in ODS appearance in XML format readable by Excel. The tagset offers a high degree of customization on style elements and attributes, in addition to controls over view and print settings in Excel.

Using the SASHELP.PRDALE, this example produces a summarized report on the sales of furniture by country. To illustrate the customization level available, the ODS uses the built-in BARRETTSBLUE style, formatted columns, computed totals, summary totals, and spanned headers. To make the report printer-friendly, the print options are set to fit the report to print in one page.

ODS LISTING CLOSE; ODS PDF FILE = "C:\SAS\report.pdf" NOTOC STYLE=STYLES.BARRETTSBLUE; ODS TAGSETS.EXCELXP

FILE="C:\SAS\report.xml" STYLE=BARRETTSBLUE OPTIONS(FROZEN_HEADERS='3' FITTOPAGE = 'YES'

DEFAULT_COLUMN_WIDTH='20,10,10,10,10,10,10' CENTER_HORIZONTAL = 'YES' CENTER_VERTICAL = 'YES' PRINT_HEADER = "Furnitures Sales Report 1993Q1 to 1994Q4" PRINT_FOOTER = "Printed on &sysdate");

PROC REPORT DATA=sashelp.prdsale NOWINDOWS CENTER;

COLUMN country product year quarter ('Sales Amount' actual predict diff);

DEFINE country / ORDER GROUP NOPRINT;

DEFINE product / ORDER GROUP;

DEFINE year / ORDER GROUP;

DEFINE quarter / ORDER GROUP;

DEFINE actual / SUM 'Actual'

FORMAT=DOLLAR10.;

DEFINE predict / SUM 'Predicted'

FORMAT=DOLLAR10.;

DEFINE diff / COMPUTED 'Difference' FORMAT=DOLLAR10.;

6

NESUG 2012

Foundations and Fundamentals

COMPUTE diff; diff = actual.SUM - predict.SUM;

ENDCOMP;

BREAK AFTER year / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE FONT_WEIGHT=BOLD}; BREAK AFTER product / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE FONT_WEIGHT=BOLD}; BREAK AFTER country / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE FONT_WEIGHT=BOLD};

BY country;

RUN; QUIT;

ODS TAGSETS.EXCELXP CLOSE; ODS PDF CLOSE; ODS LISTING;

Between the opening and closing of the ODS statements, the PROC REPORT procedure summarizes the actual and predicted sales for each product by years and quarters in report.xls workbook. A new column, DIFF, is calculated on based on the ACTUAL ? PREDICT sales. A summary total is computed after each year and each furniture product. A grand total is computed for each country after all the furniture products (not shown). Lastly, the workbook contains three worksheets with one for each country.

PDF Output

Excel Output

Whether a report is generated with PDF, HTML, Office2k, or ExcelXP tagset, or other destinations the output appearance and layouts are consistent among ODS outputs. To illustrate this example, the example codes include an ODS PDF destination statement which produces the same report in PDF format.

EXCEL LIBNAME The Excel LIBNAME method does not produce customized report within SAS. While any compatible SAS procedures can be directed to an Excel output thru Excel LIBNAME engine, but only the data values will be stored in the Excel file. Formatting or styles are lost when the file is saved to Excel. Using Excel LIBNAME method, the furniture sales report with the same PROC REPORT example are stored and displayed in default formatting after output to Excel.

7

NESUG 2012

Foundations and Fundamentals

LIBNAME excelib EXCEL "C:\SAS\report_excelib.xlsx";

PROC REPORT DATA=sashelp.prdsale NOWINDOWS CENTER OUT=excelib.prdsale;

COLUMN country product year quarter ('Sales Amount' actual predict diff);

DEFINE country / ORDER GROUP NOPRINT;

DEFINE product / ORDER GROUP;

DEFINE year / ORDER GROUP;

DEFINE quarter / ORDER GROUP;

DEFINE actual / SUM 'Actual'

FORMAT=DOLLAR10.;

DEFINE predict / SUM 'Predicted'

FORMAT=DOLLAR10.;

DEFINE diff / COMPUTEd 'Difference' FORMAT=DOLLAR10.;

COMPUTE diff; diff = actual.SUM - predict.SUM;

ENDCOMP;

BREAK AFTER year / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE FONT_WEIGHT=BOLD}; BREAK AFTER product / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE FONT_WEIGHT=BOLD}; BREAK AFTER country / SUMMARIZE skip ol STYLE={BACKGROUND=SKYBLUE FONT_WEIGHT=BOLD};

BY country;

RUN; QUIT;

LIBNAME excelib CLEAR; The furniture sales report with Excel LIBNAME looks like:

PROC EXPORT The functionality of PROC EXPORT procedure to export raw data does not make this a good method for producing customized report. SCENARIO 3 ? CREATING INTERACTIVE EXCEL ANALYSIS TOOLS This scenario looks at using interactive Excel analysis tools based on data updated from SAS. Most people have learned or are familiar with how to use Microsoft Excel. Even without formal training, Excels point-and-click fea-

8

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download