CSSS 508: Intro to R




3/10/06

Review

Data: Nutritional and Marketing Information on US Cereals

The UScereal data come from the 1993 ASA Statistical Graphics Exposition.

The measurements are taken from the FDA food label and have been normalized to a portion of one American cup.

library(MASS)

attach(UScereal)

First we look at the structure of the data, check for missing values, and see what types of variables we have.

dim(UScereal)

[1] 65 11
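Beyond dim(), the variable types can be inspected directly. The lines below are a minimal sketch, not part of the original handout; the name var.classes is introduced here only for illustration.

```r
library(MASS)  # provides the UScereal data frame

# Compact overview: one line per variable with its type and first few values
str(UScereal)

# Storage class of each column (factor, numeric, integer, ...)
var.classes <- sapply(UScereal, class)
var.classes
```

This makes it easy to see which columns are factors (such as mfr and vitamins) and which are stored as numbers.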

help(UScereal)

mfr: Manufacturer (no order in categories)

G=General Mills, K=Kelloggs, N=Nabisco, P=Post, Q=Quaker Oats, R=Ralston Purina.

calories: number of calories in one portion

protein: grams of protein in one portion

fat: grams of fat in one portion

sodium: milligrams of sodium in one portion

fibre: grams of dietary fibre in one portion

carbo: grams of complex carbohydrates in one portion

sugars: grams of sugars in one portion

shelf: display shelf (1, 2, or 3, counting from the floor) (order in categories)

potassium: grams of potassium

vitamins: vitamins and minerals (none, enriched, or 100%) (order in categories)

summary(UScereal)

 mfr       calories        protein             fat            sodium
 G:22   Min.   : 50.0   Min.   : 0.7519   Min.   :0.000   Min.   :  0.0
 K:21   1st Qu.:110.0   1st Qu.: 2.0000   1st Qu.:0.000   1st Qu.:180.0
 N: 3   Median :134.3   Median : 3.0000   Median :1.000   Median :232.0
 P: 9   Mean   :149.4   Mean   : 3.6837   Mean   :1.423   Mean   :237.8
 Q: 5   3rd Qu.:179.1   3rd Qu.: 4.4776   3rd Qu.:2.000   3rd Qu.:290.0
 R: 5   Max.   :440.0   Max.   :12.1212   Max.   :9.091   Max.   :787.9

     fibre            carbo           sugars          shelf
 Min.   : 0.000   Min.   :10.53   Min.   : 0.00   Min.   :1.000
 1st Qu.: 0.000   1st Qu.:15.00   1st Qu.: 4.00   1st Qu.:1.000
 Median : 2.000   Median :18.67   Median :12.00   Median :2.000
 Mean   : 3.871   Mean   :19.97   Mean   :10.05   Mean   :2.169
 3rd Qu.: 4.478   3rd Qu.:22.39   3rd Qu.:14.00   3rd Qu.:3.000
 Max.   :30.303   Max.   :68.00   Max.   :20.90   Max.   :3.000

   potassium         vitamins
 Min.   : 15.0   100%    : 5
 1st Qu.: 45.0   enriched:57
 Median : 96.6   none    : 3
 Mean   :159.1
 3rd Qu.:220.0
 Max.   :969.7

Note that shelf, although it is stored as a numeric variable, is better summarized by a table. It takes only three values, so it can be treated as categorical.

table(shelf)

shelf

1 2 3

18 18 29
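If shelf is to be treated as categorical in later steps, one option is to convert it to an ordered factor. This is a sketch under that assumption, not something the handout does; the name shelf.f is hypothetical, and the data frame is referenced explicitly so the lines do not depend on attach().

```r
library(MASS)  # provides the UScereal data frame

# Ordered factor: shelf 1 is lowest (counting from the floor), shelf 3 highest
shelf.f <- factor(UScereal$shelf, levels = 1:3, ordered = TRUE)

# Same counts as table(shelf), now on a categorical variable
table(shelf.f)

# Cross-tabulate manufacturer against shelf position
table(UScereal$mfr, shelf.f)
```

Keeping the conversion in a separate object leaves the original numeric column untouched.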

We have no missing data.
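The summary() output shows no NA counts, which is how we know nothing is missing. A direct check could look like the following sketch; the name na.per.column is introduced here only for illustration.

```r
library(MASS)  # provides the UScereal data frame

# Number of missing values in each column
na.per.column <- colSums(is.na(UScereal))
na.per.column

# Total number of missing values in the whole data frame
sum(na.per.column)
```

A total of zero confirms that every cell in the data frame is observed.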

The cereals vary widely in composition. Every cereal has some protein, carbohydrates, and potassium: the minimum of each is above zero. Sodium ranges from none at all to 787.9 milligrams per portion, and sugars, fibre, and fat likewise start at zero for some cereals.
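These statements can be read off the Min. rows of the summary, or checked by counting the cereals with a value of exactly zero for each nutrient. The sketch below is one way to do that; the names nutrients and zero.counts are hypothetical.

```r
library(MASS)  # provides the UScereal data frame

# Numeric nutrient columns to check
nutrients <- c("protein", "fat", "sodium", "fibre",
               "carbo", "sugars", "potassium")

# For each nutrient, how many cereals contain none of it
zero.counts <- colSums(UScereal[, nutrients] == 0)
zero.counts
```

A count of zero means every cereal contains at least some of that nutrient.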

Let’s compare the composition of the General Mills and Kellogg cereals.

par(mfrow=c(2,4))

gr.label ................
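The original listing is cut off at this point, so the author's plotting code is not recoverable. As a stand-in, the sketch below shows one possible way to compare the two manufacturers on a 2-by-4 grid: side-by-side boxplots of calories and the seven nutrient variables. The names gk, vars, and old.par are assumptions made for this illustration.

```r
library(MASS)  # provides the UScereal data frame

# Keep only General Mills (G) and Kellogg (K); drop the unused mfr levels
gk <- droplevels(subset(UScereal, mfr %in% c("G", "K")))

# Eight variables fill the eight panels of a 2 x 4 layout
vars <- c("calories", "protein", "fat", "sodium",
          "fibre", "carbo", "sugars", "potassium")

old.par <- par(mfrow = c(2, 4))  # save the previous graphics settings
for (v in vars) {
  # One panel per variable, one box per manufacturer
  boxplot(gk[[v]] ~ gk$mfr, main = v, xlab = "mfr", ylab = v)
}
par(old.par)                     # restore the previous layout
```

Boxplots are used here because they show the centre, spread, and outliers of each group in a small panel; other displays would work as well.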