
SUGI 31

Tutorials

Paper 262-31

A Programmer's Introduction to the Graphics Template Language

Jeff Cartier, SAS Institute Inc., Cary, NC

ABSTRACT

In SAS 9.2, the ODS Graphics Template Language (GTL) becomes production software. This powerful language is used by many SAS/STAT® and SAS/ETS® procedures, as well as by the new SAS/GRAPH® procedures SGPLOT and SGSCATTER, to produce graphical output. The Graphics Template Language can also be used in conjunction with special DATA step features to produce graphs independently. This presentation helps you understand the basics of the Graphics Template Language and create graphs with the DATA step.

Topics include:

• Basic concepts: template definition and storage, compilation, and run-time actions
• Graphical layouts: features of Gridded, Overlay, and Lattice layouts
• Common tasks: customizing axes and adding legends and titles
• Templates: making flexible templates by using dynamic and macro variables, conditional logic, and expressions
• Output: controlling image name, size, type, quality, and scaling
• Integration with ODS styles

INTRODUCTION

By now you are probably familiar with the ODS graphics that are automatically produced by many SAS statistical procedures in SAS®9 (see SUGI 31 paper 192-31 by Robert Rodriguez). These procedures use compiled programs written in the Graphics Template Language (GTL) to produce their graphs. Such procedures have been designed so you do not need to know anything about GTL programming details to get this graphical output.

GTL can be used by SAS application developers and other programmers to create sophisticated analytical graphics independent of the statistical procedures. This presentation helps you understand this new technology and shows the code for several types of graphs you can produce. The focus is on the organization and potential of this new language.

TEMPLATE COMPILATION AND RUNTIME ACTIONS

The DEFINE statement of PROC TEMPLATE allows you to create specialized SAS files, called templates, that are used for controlling the appearance of ODS output. The template types you are most familiar with are STYLE, TABLE, and TAGSET. Starting in SAS 9.1, a new experimental template type was added. A STATGRAPH template describes the structure and appearance of a graph to be produced--similar to how a TABLE template describes the organization and content of a table. In SAS 9.2, the language that makes up the STATGRAPH template definition is production.

All templates are stored compiled programs. Here is the source program that produces a very simple STATGRAPH template named SCATTER:

proc template;
  define statgraph mygraphs.scatter;
    layout overlay;
      scatterplot x=height y=weight;
    endlayout;
  end;
run;

NOTE: STATGRAPH 'Mygraphs.Scatter' has been saved to: SASUSER.TEMPLAT

COMPILATION

When the above code is submitted, the statement keywords and options are parsed, just as with any other procedure. If no syntax error is detected, an output template named SCATTER is created and stored in the MYGRAPHS folder of the template store (physically in SASUSER.TEMPLAT, by default). No graph is produced. STATGRAPH syntax requires that all required arguments be specified (X= and Y= for the SCATTERPLOT statement), but the existence of these variables is not checked at compile time. Notice also that the template contains no reference to an input data set.
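To confirm what compilation stored, you can ask PROC TEMPLATE to list the MYGRAPHS templates and to print the stored definition. The following is a minimal sketch. It assumes that the SCATTER template above compiled without error and was written to the default SASUSER.TEMPLAT location.

```sas
proc template;
  /* List the templates stored under MYGRAPHS in SASUSER.TEMPLAT */
  list mygraphs / store=sasuser.templat;

  /* Write the source of the stored template to the SAS log */
  source mygraphs.scatter;
run;
```

Neither statement produces a graph. Both only inspect the stored template.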


RUNTIME ACTIONS

To produce a graph, a STATGRAPH template must be bound to an ODS data object at runtime and directed to an ODS destination. The simplest way to do this is with the new SAS 9.2 SAS/GRAPH procedure called SGRENDER, which was created to simplify the execution of user-defined templates such as SCATTER:

ods listing style=listing;

proc sgrender data=sashelp.class template="mygraphs.scatter";
run;

An ODS data object is constructed by comparing the template references to column names with variables that exist in the current data set. Here, there is a match for HEIGHT and WEIGHT so they are added to the data object and other variables are ignored. It is possible for a template to define new computed columns based on existing columns.
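As one illustration of a computed column, GTL accepts an EVAL expression in place of a column reference. The sketch below is not one of the original examples. The template name SCATTERKG and the pounds-to-kilograms conversion are assumptions chosen for illustration. The template is written in the production form of the language, which wraps the layout in a BEGINGRAPH block.

```sas
proc template;
  define statgraph mygraphs.scatterkg;   /* hypothetical template name */
    begingraph;
      layout overlay / yaxisopts=(label="Weight (kg)");
        /* EVAL computes a new column from WEIGHT (pounds) at run time */
        scatterplot x=height y=eval(weight * 0.45359237);
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template="mygraphs.scatterkg";
run;
```

The input data set is unchanged. The converted values exist only in the data object that is built for the graph.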

Once all the observations have been read, the data object and template definition are passed to a graph renderer that produces an image file for the graph which is then automatically integrated into the ODS destination. Rendering is totally independent of the legacy SAS/GRAPH GRSEG generation. In this example, a GIF image is created in the LISTING destination. The visual properties of the graph are determined by the ODS style in effect.

You should note that the SCATTER template is a very restrictive definition in that it can only create a plot of variables named HEIGHT and WEIGHT. As you will see later, STATGRAPH templates can be made more flexible by introducing dynamics or macro variables that allow variables and other information to be supplied at runtime.
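A minimal sketch of the dynamic-variable approach follows. The template name SCATTERDYN and the dynamic names XVAR and YVAR are hypothetical. The template is written in the production form of the language, which wraps the layout in a BEGINGRAPH block.

```sas
proc template;
  define statgraph mygraphs.scatterdyn;   /* hypothetical template name */
    dynamic XVAR YVAR;                    /* placeholders resolved at run time */
    begingraph;
      layout overlay;
        scatterplot x=XVAR y=YVAR;
      endlayout;
    endgraph;
  end;
run;

/* The same template can now plot any pair of columns */
proc sgrender data=sashelp.class template="mygraphs.scatterdyn";
  dynamic xvar="height" yvar="weight";
run;

proc sgrender data=sashelp.class template="mygraphs.scatterdyn";
  dynamic xvar="age" yvar="height";
run;
```

Because the column names are supplied by the DYNAMIC statement of PROC SGRENDER, the template does not need to be recompiled to plot different variables.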

GRAPHICAL LAYOUTS

One of the most powerful features of the GTL is the syntax built around hierarchical statement blocks called layouts. A layout is a container that arranges its contents in cells. A cell may contain a plot, a title, a legend, or even another layout. The layout arranges the cells in a predefined manner--into a single cell with its contents superimposed or into rows or columns of cells. All STATGRAPH template definitions begin with a LAYOUT statement block.

Single-cell Layouts that Support Superimposition

  OVERLAY          2D plots, legends, text
  OVERLAY3D        3D plots, text
  OVERLAYEQUATED   2D plots, legends, text; axes have equal sized units
  PROTOTYPE        Simplified OVERLAY used in DATAPANEL and DATALATTICE

Multi-cell Layouts in Rows and Columns of Cells

  GRIDDED          Basic grid of plots and text; all cells independent
  LATTICE          Externalized axes, headers, sidebars
  DATALATTICE      Data-driven number of cells; 1-2 classifiers
  DATAPANEL        Data-driven number of cells; n classifiers

THE OVERLAY LAYOUT

The OVERLAY layout enables plots to be "stacked" in the order you declare them, with the first plot on the bottom of the stack. This layout manages the integration of plots based on different variables into a single set of shared axes.

proc template;
  define statgraph mygraphs.scatteroverlay;
    layout overlay;
      ellipse x=height y=weight / alpha=.01 type=predicted;
      scatterplot x=height y=weight;
    endlayout;
  end;
run;

proc sgrender data=sashelp.class template="mygraphs.scatteroverlay";
run;

In this case, the X and Y data ranges for the prediction ellipse are larger than the data ranges for the scatter plot, and the layout automatically adjusts the range of each axis to accommodate both.
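An overlay can also carry a legend that identifies its stacked plots. The sketch below is an illustration rather than one of the original examples. The template name OVERLAYLEGEND, the plot names "obs" and "fit", and the choice of a regression fit are assumptions. It uses the production form of the language, where the layout sits inside a BEGINGRAPH block and the title is declared outside the layout.

```sas
proc template;
  define statgraph mygraphs.overlaylegend;   /* hypothetical template name */
    begingraph;
      entrytitle "Height and Weight";
      layout overlay;
        /* NAME= gives each plot an identifier that the legend can reference */
        scatterplot x=height y=weight / name="obs" legendlabel="Observed";
        regressionplot x=height y=weight / name="fit" legendlabel="Linear fit";
        /* The legend lists the named plots in the order given */
        discretelegend "obs" "fit";
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template="mygraphs.overlaylegend";
run;
```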


THE GRIDDED LAYOUT

The GRIDDED layout creates a single- or multi-cell graph. It is primarily used to present multiple plots in a grid or to create a small table of text and statistics (inset) you want to embed in a graph. The cells are completely independent of one another. In this example, two OVERLAY layouts are nested in the GRIDDED layout to create a graph with two cells placed in one column.

proc template;
  define statgraph mygraphs.gridded;
    layout gridded / columns=1;
      layout overlay;
        ellipse x=height y=weight / clip=false alpha=.01 type=predicted;
        scatterplot x=height y=weight;
        entry "CLIP=FALSE" / autoalign=auto;
      endlayout;
      layout overlay;
        ellipse x=height y=weight / clip=true alpha=.01 type=predicted;
        scatterplot x=height y=weight;
        entry "CLIP=TRUE" / autoalign=auto;
      endlayout;
    endlayout;
  end;
run;

proc sgrender data=sashelp.class template="mygraphs.gridded";
run;

An ENTRY statement overlays text in each plot. The AUTOALIGN=AUTO option allows the software to judge where to place the text so as to avoid collision with the data being plotted. From this example, you can see that the CLIP=TRUE option requests that the ellipse boundaries be ignored when scaling the axes, so the ellipse is clipped at the edge of the plot area.
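The inset use of GRIDDED mentioned above can be sketched as follows. This is an illustration only. The template name INSET and the dynamic NOBS are hypothetical, and the template uses the production form of the language with a BEGINGRAPH block.

```sas
proc template;
  define statgraph mygraphs.inset;   /* hypothetical template name */
    dynamic NOBS;                    /* hypothetical dynamic supplied at run time */
    begingraph;
      layout overlay;
        scatterplot x=height y=weight;
        /* A two-column grid of text acts as an inset inside the plot area */
        layout gridded / columns=2 border=true autoalign=(topleft bottomright);
          entry halign=left "Data set";
          entry halign=left "SASHELP.CLASS";
          entry halign=left "N";
          entry halign=left NOBS;
        endlayout;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=sashelp.class template="mygraphs.inset";
  dynamic nobs="19";   /* SASHELP.CLASS has 19 observations */
run;
```

The AUTOALIGN= list names the candidate corners for the inset, so the layout can choose a position that avoids the plotted markers.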

THE LATTICE LAYOUT

The LATTICE layout is a multi-cell layout with special features for combining data ranges of plots in columns or rows and externalizing axis information so it is not repeated in each cell. This layout and the OVERLAY are the most important and frequently used layouts.

proc template;
  define statgraph mygraphs.distribution;
    layout lattice / columns=1 rows=2 rowweight=(.9 .1)
                     columndatarange=union rowgutter=2px;
      columnaxes;
        latticeaxis / label='Score';
      endcolumnaxes;
      layout overlay / yaxisopts=(offsetmin=.03);
        entrytitle 'Distribution of Scores';
        histogram score / scale=percent;
        densityplot score / normal( );
        fringeplot score;
      endlayout;
      boxplot y=score / orient=horizontal;
    endlayout;
  end;
run;

This lattice has one column and two rows. The row cells contain

• an overlay consisting of a histogram, a normal distribution curve, and a "fringe plot" showing the location of individual observations under the histogram bins.

• a boxplot showing the median (line), mean (diamond marker), interquartile range, and outliers.

The overlay is apportioned 90% of the available height and the boxplot 10%. The data ranges of the X axes of the two cells are merged, and the LATTICEAXIS statement externalizes a single X axis and sets its label.


THE DATAPANEL LAYOUT

The DATAPANEL layout is a data-driven layout. It creates a grid of plots based on one or more classification variables and a graphical prototype. A separate instance of the prototype cell is created for each crossing of the classifiers. The data for each prototype is a subset of all the data based on the current classification level(s).

proc template;
  define statgraph mygraphs.datapanel;
    layout gridded;
      entrytitle 'Sales of Office Furniture';
      layout datapanel classvars=(product country) /
                       rows=3 order=rowmajor
                       rowaxisopts=(griddisplay=on label=' ')
                       columnaxisopts=(griddisplay=on label=' ');
        layout prototype;
          seriesplot x=date y=actual;
        endlayout;
      endlayout;
    endlayout;
  end;
run;

The number of unique crossings of the classification variables PRODUCT and COUNTRY determines the number of cells in the grid.

The cells are filled based on the order in which the classifiers are declared, a grid dimension (here ROWS=3), and a wrapping order (here ORDER=ROWMAJOR).

THE DATALATTICE LAYOUT

The DATALATTICE layout is similar to DATAPANEL but it requires a row classifier, a column classifier, or both.

proc template;
  define statgraph mygraphs.datalattice;
    layout gridded;
      entrytitle 'Sales of Office Furniture';
      layout datalattice columnvar=country rowvar=product /
                         headerlabeldisplay=value
                         rowaxisopts=(griddisplay=on label='')
                         columnaxisopts=(griddisplay=on label='');
        layout prototype;
          seriesplot x=date y=actual;
        endlayout;
      endlayout;
    endlayout;
  end;
run;

The number of unique values of the row classifier PRODUCT determines the number of rows in the grid. The number of unique values of the column classifier COUNTRY determines the number of columns in the grid.

This layout allows the labels for the classification levels to appear outside or inside the grid.


AXES

2D plots have four independent axes that can be used: X, Y, X2, and Y2. By default, the X2 and Y2 axes are duplicates of the X and Y axes and are not displayed unless requested. For 3D plots, there are the standard X, Y, and Z axes.

proc template;
  define statgraph mygraphs.axes2d;
    layout overlay / xaxisopts=(label='X axis')
                     yaxisopts=(label='Y axis')
                     x2axisopts=(label='X2 axis')
                     y2axisopts=(label='Y2 axis');
      bandplot x=x limitupper=upper limitlower=lower / display=(fill);
      seriesplot x=x y=y;
    endlayout;
  end;
run;

proc template;
  define statgraph mygraphs.axes3d;
    layout overlay3d / xaxisopts=(label="X axis")
                       yaxisopts=(label="Y axis")
                       zaxisopts=(label="Z axis")
                       rotate=25 zoom=.6 tilt=45 cube=false;
      surfaceplotparm x=x y=y z=z;
    endlayout;
  end;
run;
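Neither template names a data set, so each must be rendered against data that supplies the columns it references. The sketch below builds two small illustrative data sets and renders them. The data set names WORK.BAND2D and WORK.SURFACE3D and the formulas that generate the values are assumptions. It also assumes that both templates above compile in your release. Production releases of the language additionally expect the layout to be wrapped in a BEGINGRAPH block.

```sas
/* Hypothetical data for the 2D template: a curve with a band around it */
data work.band2d;
  do x = 1 to 20;
    y     = sqrt(x);
    lower = y - 0.5;
    upper = y + 0.5;
    output;
  end;
run;

proc sgrender data=work.band2d template="mygraphs.axes2d";
run;

/* Hypothetical data for the 3D template: z evaluated on a regular x-y grid */
data work.surface3d;
  do x = -2 to 2 by 0.25;
    do y = -2 to 2 by 0.25;
      z = exp(-(x*x + y*y));
      output;
    end;
  end;
run;

proc sgrender data=work.surface3d template="mygraphs.axes3d";
run;
```

SURFACEPLOTPARM draws the Z values it is given rather than fitting a surface, which is why the second data set supplies a value for every point of the grid.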

If any data are explicitly mapped to the X2 or Y2 axes, these axes are automatically displayed. In the example below, the first histogram's frequency counts are mapped to the Y2 axis and the second histogram's percentages are mapped to the Y axis.

proc template;
  define statgraph mygraphs.y2axis;
    layout overlay;
      histogram weight / scale=count yaxis=y2;
      histogram weight / scale=percent yaxis=y;
      densityplot weight / normal();
    endlayout;
  end;
run;
