PROC TABULATE: Controlling Table Appearance


August V. Treff, NationsBank

ABSTRACT

PROC TABULATE provides one-, two-, and three-dimensional tables. Tables may be constructed from class variables, analysis variables, and statistics keywords. Descriptive statistics can be displayed in table format with a minimum of code.

This paper uses a five-response survey to demonstrate how PROC TABULATE features translate to a printed report. We will examine table dimensions, crossing and concatenating variables, and percent calculations. Controlling table appearance through formats and customized column headings will also be discussed.

Many programmers do not use PROC TABULATE because they find that there is too much to learn about the procedure. PROC TABULATE has many more features than other procedures. We will examine the effect that these features have on the report display.

TABLE APPEARANCE

Tabular reports are defined by the TABLE statement:

Table page-dimension-definition, row-dimension-definition, column-dimension-definition;

Commas are used to separate dimensions. If two dimensions are specified, they are interpreted as row and column definitions. One dimension is interpreted as a column definition. An asterisk is used to nest variables within a definition while a space concatenates variables.
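The dimension rules above can be sketched with a few TABLE statements in one step. This is a minimal illustration, not the survey code itself: the data set DEMO and its four records are hypothetical, and only the variable names GENDER, QUES, and RESP come from the survey described in this paper.

```sas
/* Hypothetical miniature data set; names and values are illustrative only. */
data DEMO;
   input GENDER $ QUES RESP $;
   datalines;
M 1 A
F 1 B
M 2 A
F 2 A
;
run;

proc tabulate data=DEMO format=5.;
   class GENDER QUES RESP;
   table RESP;                 /* one dimension: columns only              */
   table QUES, RESP;           /* two dimensions: rows, then columns       */
   table GENDER, QUES, RESP;   /* three dimensions: pages, rows, columns   */
   table QUES*RESP, GENDER;    /* asterisk: RESP nested within QUES rows   */
   table QUES, GENDER RESP;    /* space: GENDER and RESP placed side by side */
run;
```

Each TABLE statement produces its own table, so the five layouts can be compared in a single run.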

Our task is to prepare a tabular report of responses to a ten-question survey. Responses have been entered into a sequential file. Elements include GENDER (M F), AGE group (1 to 3), GROUP code (1 to 3), LOCATION, and RESP (responses A through E).

A blank in a field indicates no response while an asterisk indicates multiple responses. The item number, QUES, is determined from the position of the response.
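One way to derive QUES from the position of each response is to read the ten responses into an array and write one observation per question. The sketch below is only an assumption about how that might look: the paper does not give the record layout, so the column positions (GENDER in column 1, AGE in 2, GROUP in 3, LOCATION in 4-6, responses in 7-16), the helper variables R1-R10, and the two sample records are all hypothetical.

```sas
/* Assumed layout, not taken from the paper:                     */
/* col 1 GENDER, col 2 AGE, col 3 GROUP, cols 4-6 LOCATION,      */
/* cols 7-16 one response per question.                          */
data SURVEY;
   infile datalines truncover;
   input GENDER $1. AGE 1. GROUP 1. LOCATION $3.
         @7 (R1-R10) ($char1.);
   array R{10} $ R1-R10;
   do QUES = 1 to 10;          /* position in the record gives the item number */
      RESP = R{QUES};          /* blank = no response, * = multiple responses  */
      output;                  /* one observation per question                 */
   end;
   keep GENDER AGE GROUP LOCATION QUES RESP;
   datalines;
M12NYCABCDEABCDE
F31LA AB DE*BCDE
;
run;
```

In the second sample record, question 3 is blank and question 6 carries an asterisk, so both special cases described above appear in the output data set.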

We begin our examination of how to control the appearance of a table by displaying the values for questions, QUES, versus responses, RESP. The SAS code is:

Proc Tabulate Data=SURVEY;
   Class QUES RESP;
   Table QUES, RESP;
Run;

Variables QUES and RESP must be included in a CLASS statement if they are to be used in a TABLE statement. QUES serves as the row dimension while RESP serves as the column dimension.

Our efforts are rewarded with a tabular display of question numbers crossed with responses.

---------------------------------------------
|                  |          RESP
|                  |-------------------------
|                  |     A      |     B
|                  |------------+------------
|                  |     N      |     N
|------------------+------------+------------
|QUES              |            |
|------------------|            |
|1                 |      167.00|      144.00
|------------------+------------+------------
|2                 |      157.00|      144.00

The display of frequencies as two-place decimals is not what is desired. The default format of BEST12.2 may be overridden with the FORMAT= option:

Proc Tabulate Data=SURVEY Format=5.;

and we get:

-------------------------------------------
|                  |         RESP
|                  |-----------------------
|                  |  A  |  B  |  C  |  D
|                  |-----+-----+-----+-----
|                  |  N  |  N  |  N  |  N
|------------------+-----+-----+-----+-----
|QUES              |     |     |     |
|------------------|     |     |     |
|1                 |  167|  144|  281|  245
|------------------+-----+-----+-----+-----
|2                 |  157|  144|  266|  263

In addition to suppressing the decimal places, we have reduced the width of each cell to five spaces.

Our next area for improving the appearance of the table is the size of the row title space. By default, this space is one-fourth of the LINESIZE= value. Its size can be changed with an option on the TABLE statement.

Table QUES, RESP / rts=6;

which gives us the following result:

-----------------------------
|    |         RESP
|    |-----------------------
|    |  A  |  B  |  C  |  D
|    |-----+-----+-----+-----
|    |  N  |  N  |  N  |  N
|----+-----+-----+-----+-----
|QUES|     |     |     |
|----|     |     |     |
|1   |  167|  144|  281|  245
|----+-----+-----+-----+-----
|2   |  157|  144|  266|  263

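The RTS= option changes the size of the row title space. The vertical lines on either side count toward the RTS= value, so RTS=6 leaves four characters for the row heading. This was not the case when we changed the cell width with FORMAT=, where the width of 5 did not include the lines.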
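The two width rules can be combined into a rough estimate of how wide a listing will print. The sketch below is arithmetic inferred from the behavior just described, not a formula taken from SAS documentation: it assumes the row title space contributes RTS= characters, each column contributes its format width plus one vertical line, and the table is printed with its closing border. The variable names are hypothetical.

```sas
/* Rough width estimate; an inference from the listings, not a documented rule. */
data _null_;
   rts   = 6;                        /* row title space, lines included   */
   w     = 5;                        /* cell width from FORMAT=5.         */
   ncols = 5;                        /* one column per response A to E    */
   width = rts + ncols * (w + 1);    /* each cell plus its right-hand line */
   put width=;                       /* writes width=36 to the log        */
run;
```

An estimate of this kind can help decide whether a table will fit within the current LINESIZE= before the report is run.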

The name of a statistic, such as N, may be relabeled with a KEYLABEL statement. Class variables may be relabeled with a LABEL statement.

Keylabel N='Count';
Label QUES='Item'
      RESP='Response';

The display becomes:

------------------------------------
|    |          Response           |
|    |-----------------------------|
|    |  A  |  B  |  C  |  D  |  E  |
|    |-----+-----+-----+-----+-----|
|    |Count|Count|Count|Count|Count|
|----+-----+-----+-----+-----+-----|
|Item|     |     |     |     |     |
|----|     |     |     |     |     |
|1   |  167|  144|  281|  245|  142|
|----+-----+-----+-----+-----+-----|
|2   |  157|  144|  266|  263|  154|

Data may be summarized in a PROC SUMMARY step before being reported with PROC TABULATE. This uses fewer resources and results in a shorter run time, because PROC TABULATE then reads one summary record per crossing of QUES and RESP rather than every response. In the PROC TABULATE step, an analysis variable is needed along with the statistic SUM. The PROC SUMMARY step is as follows:

Proc Summary Data=SURVEY NWAY;
   Class QUES RESP;
   Output out=SURSUMM;
Run;

The default variable, _FREQ_, provides the frequency of each crossing of QUES and RESP. _FREQ_ is used as the analysis variable for PROC TABULATE. Our new invocation is:

Proc Tabulate data=SURSUMM format=6.;
   Class QUES RESP;
   Var _FREQ_;
   Keylabel sum='Count';
   Label QUES='Item'
         RESP='Response';
   Table QUES,
         RESP*SUM*_FREQ_ / rts=6;
Run;

and our report becomes:

--------------------------
|    |        Resp
|    |--------------------
|    |*     |A     |B
|    |------+------+------
|    |Count |Count |Count
|    |------+------+------
|    |_FREQ_|_FREQ_|_FREQ_
|----+------+------+------
|Item|      |      |
|----|      |      |
|1   |    10|   166|   145
|----+------+------+------
|2   |     7|   157|   144

The word "Count" replaces "SUM" because of the KEYLABEL statement. The analysis variable name _FREQ_ may be suppressed by assigning it a blank label in the TABLE statement:

RESP*SUM*_FREQ_=' ' / rts=6;

The display becomes:

--------------------------
|    |        Resp
|    |--------------------
|    |*     |A     |B
|    |------+------+------
|    |Count |Count |Count
|----+------+------+------
|Item|      |      |
|----|      |      |
|1   |    10|   166|   145
|----+------+------+------
|2   |     7|   157|   144

This feature allows us to suppress the printing of a column definition variable or statistic. Adding _FREQ_=' ' to a LABEL statement would suppress the word _FREQ_, but the blank cell would remain. Statistic names may also be suppressed in this manner.

Labels may be included between the single quotes. This allows greater control over labels than do the KEYLABEL and LABEL statements.
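As a sketch of labeling directly in the TABLE statement, the step below assigns every heading inline and needs no KEYLABEL or LABEL statement. The small SURSUMM data set is a hypothetical stand-in for the PROC SUMMARY output, using a few counts borrowed from the listings above so that the step runs on its own.

```sas
/* Hypothetical stand-in for the PROC SUMMARY output data set. */
data SURSUMM;
   input QUES RESP $ _FREQ_;
   datalines;
1 A 166
1 B 145
2 A 157
2 B 144
;
run;

proc tabulate data=SURSUMM format=6.;
   class QUES RESP;
   var _FREQ_;
   /* ='text' relabels a heading; =' ' suppresses the heading row */
   table QUES='Item',
         RESP='Response'*SUM='Count'*_FREQ_=' ' / rts=6;
run;
```

Because the labels live in the TABLE statement, a second TABLE statement in the same step could label the same variables differently.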

Total responses to an item may be displayed by adding the keyword ALL to the column dimension definition. The TABLE statement is:

Table QUES, All*SUM*_FREQ_=' ' RESP*SUM*_FREQ_=' ' / rts=6;

or

Table QUES, (All RESP)*SUM*_FREQ_=' ' / rts=6;

Either statement produces:

--------------------------
|    |      |
|    |      |-------------
|    | ALL  |  *   |  A
|----+------+------+------
|Item|      |      |
|----|      |      |
|1   |   989|    10|   166
|----+------+------+------
|2   |   991|     7|   157

The keyword ALL may be relabeled with the KEYLABEL statement or by using ='text' in the TABLE statement. The TABLE statement method has the advantage of allowing different labels for ALL in different contexts, as in:

Table QUES ALL='Column Total',
      (ALL='Row Total' RESP)*SUM*_FREQ_=' ';

ALL is separated from RESP by a space. This concatenates the sums for ALL and each value of RESP. The use of parentheses allows us to avoid repeating the code *SUM*_FREQ_=' '.

The sums for ALL differ from item to item because some observations have missing values for RESP. Observations with a missing class variable are excluded unless the MISSING option is included in the PROC TABULATE statement.

Proc Tabulate data=SURSUMM format=6. missing;

The MISSING option must also be specified in any preceding PROC SUMMARY statement. Our display becomes:

---------------------------------
|    |      |
|    | Row  |--------------------
|    |Total |      |*     |A
|----+------+------+------+------
|Item|      |      |      |
|----|      |      |      |
|1   |  1000|    11|    10|   166
|----+------+------+------+------
|2   |  1000|     9|     7|   157
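The need for MISSING in both steps can be sketched end to end. The five-record SURVEY data set below is hypothetical and exists only so that the steps run on their own. Two of its records have a blank RESP.

```sas
/* Hypothetical data: column 1 is QUES, column 3 is RESP (blank = no response). */
data SURVEY;
   infile datalines truncover;
   input QUES 1. +1 RESP $char1.;
   datalines;
1 A
1 B
1
2 A
2
;
run;

/* MISSING keeps the blank RESP level in the summary data set. */
proc summary data=SURVEY nway missing;
   class QUES RESP;
   output out=SURSUMM;
run;

/* MISSING again, so the blank level is tabulated rather than dropped. */
proc tabulate data=SURSUMM format=6. missing;
   class QUES RESP;
   var _FREQ_;
   table QUES, (ALL='Row Total' RESP)*SUM=' '*_FREQ_=' ' / rts=6;
run;
```

Removing MISSING from the PROC SUMMARY statement drops the blank-response records before PROC TABULATE ever sees them, so the option on the later step alone cannot restore them.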

More complicated tables may be produced by using the asterisk to nest variables within a dimension definition. We may want to display the responses nested within the item numbers down the side of the table, with gender displayed across the top. We will use only responses of A through E in our analysis by including the statement

If 'A' ................