4 Solutions to Exercises


4.1 About these solutions

The solutions that follow were prepared by Darryl K. Nester. I occasionally pillaged or

plagiarized solutions from the second edition (prepared by George McCabe), but I take full

responsibility for any errors that may remain. Should you discover any errors or have any

comments about these solutions (or the odd answers, in the back of the text), please report

them to me:

Darryl Nester

Bluffton College

Bluffton, Ohio 45817

email: nesterd@bluffton.edu

WWW:

4.2 Using the table of random digits

Grading SRSs chosen from the table of random digits is complicated by the fact that students can find some creative ways to (mis)use the table. Some approaches are not mistakes, but may lead to different students having different "right" answers. Correct answers will vary based on:

• The line in the table on which they begin (you may want to specify one if the text does not).
• Whether they start with, e.g., 00 or 01.
• Whether or not they assign multiple labels to each unit.
• Whether they assign labels across the rows or down the columns (nearly all lists in the text are alphabetized down the columns).

Some approaches can potentially lead to wrong answers. Mistakes to watch out for include:

• They may forget that all labels must be the same length, e.g., assigning labels like 0, 1, 2, . . . , 9, 10, . . . rather than 00, 01, 02, . . . .
• In assigning multiple labels, they may not give the same number of labels to all units. E.g., if there are 30 units, they may try to use up all the two-digit numbers, thus assigning 4 labels to the first ten units and only 3 to the remaining twenty.
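The labeling rules above can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: the function name `srs_from_digits`, the digit string, and the choice to number units from 01 are hypothetical and are not taken from the text. It reads fixed-width labels from a string of digits, skips labels that are out of range or that repeat a unit already chosen, and, when multiple labels are used, gives every unit the same number of labels.

```python
def srs_from_digits(digits, n_units, sample_size, labels_per_unit=1):
    """Pick a simple random sample by reading labels from a string of digits.

    Units are numbered 1..n_units (an assumption; starting at 00 is equally
    valid). With labels_per_unit = k, labels 1..k*n_units are used, so every
    unit receives exactly k labels.
    """
    digits = "".join(ch for ch in digits if ch.isdigit())
    max_label = n_units * labels_per_unit
    width = len(str(max_label))  # all labels must have the same length

    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        if not 1 <= label <= max_label:
            continue  # unused label: skip it
        unit = (label - 1) % n_units + 1  # map extra labels back to units
        if unit not in chosen:  # ignore repeats
            chosen.append(unit)
        if len(chosen) == sample_size:
            break
    return chosen


# Illustrative digits only, not quoted from the table in the text.
line = "19223 95034 05756 28713"
print(srs_from_digits(line, n_units=30, sample_size=4))
print(srs_from_digits(line, n_units=30, sample_size=4, labels_per_unit=3))
```

With 30 units and three labels each, labels 01 to 90 are used and 00 and 91 to 99 are skipped, which avoids the unequal "4 labels versus 3 labels" mistake described above.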

4.3 Using statistical software

The use of computer software or a calculator is a must for all but the most cursory treatment

of the material in this text. Be aware of the following considerations:


• Standard deviations: Students may easily get confused by software which gives both the so-called "sample standard deviation" (the one used in the text) and the "population standard deviation" (dividing by $n$ rather than $n - 1$). Symbolically, the former is usually given as "$s$" and the latter as "$\sigma$" (sigma), but the distinction is not always clear. For example, many computer spreadsheets have a command such as "STDEV(. . . )" to compute a standard deviation, but you may need to check the manual to find out which kind it is.

As a quick check: for the numbers 1, 2, 3, $s = 1$ while $\sigma \approx 0.8165$. In general, if two values are given, the larger one is $s$ and the smaller is $\sigma$. If only one value is given, and it is the "wrong" one, use the relationship $s = \sigma \sqrt{n/(n-1)}$.

• Quartiles and five-number summaries: Methods of computing quartiles vary between different packages. Some use the approach given in the text (that is, $Q_1$ is the median of all the numbers below the location of the overall median, etc.), while others use a more complicated approach. For the numbers 1, 2, 3, 4, for example, we would have $Q_1 = 1.5$ and $Q_3 = 3.5$, but Minitab reports these as 1.25 and 3.75, respectively. Since I used Minitab for most of the analysis in these solutions, this was sometimes a problem. However, I remedied the situation by writing a Minitab macro to compute quartiles the IPS way. (In effect, I was "dumbing down" Minitab, since its method is more sophisticated.) This and other macros are available at my website.

• Boxplots: Some programs which draw boxplots use the convention that the "whiskers" extend to the lower and upper deciles (the 10th and 90th percentiles) rather than to the minimum and maximum. (DeltaGraph, which I used for most of the graphs in these solutions, is one such program. It took some trickery on my part to convince it to make them as I wanted them.)

While the decile method is merely different from that given in the text, some methods are (in my opinion) just plain wrong. Some graphing calculators from Sharp draw "box charts," which have a center line at the mean (not the median), and a box extending from $\bar{x} - \sigma$ to $\bar{x} + \sigma$! I know of no statistics text that uses that method.
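The quick checks in the list above can be reproduced in a few lines. This is a minimal sketch using Python's standard `statistics` module; the helper name `text_quartiles` is hypothetical, and treating the module's "exclusive" quantile method as a stand-in for an interpolating package is an assumption (it matches the interpolated values for the data set 1, 2, 3, 4, but is not claimed to match any particular package in general).

```python
import math
import statistics


def text_quartiles(values):
    """Q1 and Q3 as medians of the values below and above the median's location."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd count, the median itself is left out of both halves.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return statistics.median(lower), statistics.median(upper)


# Standard deviations for 1, 2, 3.
data = [1, 2, 3]
n = len(data)
s = statistics.stdev(data)        # divides by n - 1
sigma = statistics.pstdev(data)   # divides by n
converted = sigma * math.sqrt(n / (n - 1))  # recover s from sigma

# Quartiles for 1, 2, 3, 4 by two different rules.
q1_text, q3_text = text_quartiles([1, 2, 3, 4])
q1_interp, _, q3_interp = statistics.quantiles([1, 2, 3, 4], n=4, method="exclusive")

print(s, round(sigma, 4), converted)
print(q1_text, q3_text, q1_interp, q3_interp)
```

Running the same two small data sets through whatever software students use is a quick way to learn which conventions it follows.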

4.4 Acknowledgments

I should mention the software I used in putting these solutions together:

• For typesetting: TeX, specifically Textures, from Blue Sky Software.
• For the graphs: DeltaGraph (SPSS), Adobe Illustrator, and PSMathGraphs II (MaryAnn Software).
• For statistical analysis: Minitab, G•Power, JMP IN, and GLMStat, the latter two mostly for Chapters 14 and 15. George McCabe supplied output from SAS for Chapter 15. G•Power is available as freeware on the Internet, while GLMStat is shareware. Additionally, I used the TI-82, TI-85, TI-86, and TI-92 calculators from Texas Instruments.


Chapter 1

Looking at Data - Distributions

Chapter 1 Solutions

Section 1: Displaying Distributions with Graphs

1.1 (a) Categorical. (b) Quantitative. (c) Categorical. (d) Categorical. (e) Quantitative.

(f) Quantitative.

1.2 Gender: categorical. Age: quantitative. Household income: quantitative. Voting

Democratic/Republican: categorical.

1.3 The individuals are vehicles (or "cars"). Variables: vehicle type (categorical), where made (categorical), city MPG (quantitative), and highway MPG (quantitative).

1.4 Possible answers (unit; instrument):

• number of pages (pages; eyes)
• number of chapters (chapters; eyes)
• number of words (words; eyes [likely bloodshot after all that counting])
• weight or mass (pounds/ounces or kilograms; scale or balance)
• height and/or width and/or thickness (inches or centimeters; ruler or measuring tape)
• volume (cubic inches or cubic centimeters; ruler or measuring tape [and a calculator])

Any one of the first three could be used to estimate the time required to read the book; the last two would help determine how well the book would fit into a book bag.

1.5 A tape measure (the measuring instrument) can be used to measure (in units of inches

or centimeters) various lengths such as the longest single hair, length of hair on sides or

back or front. Details on how to measure should be given. The case of a bald (or balding)

person would make an interesting class discussion.

1.6 Possible answers (reasons should be given): unemployment rate, average (mean or

median) income, quality/availability of public transportation, number of entertainment

and cultural events, housing costs, crime statistics, population, population density, number

of automobiles, various measures of air quality, commuting times (or other measures of

traffic), parking availability, taxes, quality of schools.

1.7 For (a), the number of deaths would tend to rise with the increasing population, even if

cancer treatments become more effective over time: Since there are more people, there

are more potential cases of cancer. Even if treatment is more effective, the increasing

cure rate may not be sufficient to overcome the rising number of cases.

For (b), if treatments for other diseases are also improving, people who might have

died from other causes would instead live long enough to succumb to cancer.


Even if treatments were becoming less effective, many forms of cancer are detected earlier as better tests are developed. In measuring five-year survival rates for (c), if we can detect cancer (say) one year earlier than was previously possible, then effectively, each patient lives one year longer after the cancer is detected, thus raising the five-year survival rate.

1.8 (a) 1988: $949/24{,}800{,}000 \approx 0.00003827 = 38.27$ deaths per million riders. 1992: $903/54{,}632{,}000 \approx 0.00001653 = 16.53$ deaths per million riders. Death rates are less than half what they were; bicycle riding is safer. (b) It seems unlikely that the number of riders more than doubled in a six-year period.
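The rate arithmetic in 1.8 is easy to check by machine. The sketch below uses the counts quoted in the solution (949 deaths among 24,800,000 riders in 1988; 903 deaths among 54,632,000 riders in 1992); the function name `per_million` is a hypothetical helper introduced only for this illustration.

```python
def per_million(deaths, riders):
    """Deaths per million riders."""
    return deaths / riders * 1_000_000


rate_1988 = per_million(949, 24_800_000)
rate_1992 = per_million(903, 54_632_000)

print(round(rate_1988, 2), round(rate_1992, 2))

# The 1992 rate as a fraction of the 1988 rate, and the growth in riders.
print(rate_1992 / rate_1988)
print(54_632_000 / 24_800_000)
```

The last two lines show why part (b) raises an eyebrow: the death rate falls by more than half only because the reported number of riders more than doubles.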

1.9 Using the proportion or percentage of repairs, Brand A is more reliable: $2942/13{,}376 \approx 0.22 = 22\%$ for Brand A, and $192/480 = 0.4 = 40\%$ for Brand B.

1.10 (a) Student preferences may vary; be sure they give a reason. Method 1 is faster, but less accurate: it will only give values that are multiples of 10. (b) In either method 1 or 2, fractions of a beat will be lost; for example, we cannot observe 7.3 beats in 6 seconds, only 7. The formula $60 \times 50 \div t$, where $t$ is the time needed for 50 beats, would give a more accurate rate since the inaccuracy is limited to the error in measuring $t$ (which can be measured to the nearest second, or perhaps even more accurately).
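To see the difference in accuracy, here is a small sketch comparing the two ideas in part (b). It assumes, based on the "7.3 beats in 6 seconds" remark, that the quick method counts whole beats in 6 seconds and multiplies by 10; the true rate of 73 beats per minute and both function names are hypothetical values chosen for illustration.

```python
import math


def rate_from_six_second_count(true_bpm):
    """Count whole beats in 6 seconds, then multiply by 10 (assumed method)."""
    whole_beats = math.floor(true_bpm * 6 / 60)  # fractions of a beat are lost
    return whole_beats * 10


def rate_from_timing_50_beats(true_bpm):
    """Time 50 beats to the nearest second, then apply 60 * 50 / t."""
    t = round(50 * 60 / true_bpm)  # seconds, measured to the nearest second
    return 60 * 50 / t


true_bpm = 73  # hypothetical true pulse rate
print(rate_from_six_second_count(true_bpm))
print(round(rate_from_timing_50_beats(true_bpm), 2))
```

For this hypothetical pulse the 6-second count is off by 3 beats per minute, while timing 50 beats is off by less than 0.2.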

1.11 Possible answers are total profits, number of employees, total value of stock, and total assets.

1.12 (a) Yes: The sum of the ethnic group counts is 12,261,000. (b) A bar graph or pie chart (not recommended) may be used. In order to see the contrast of the heights of the bars, the chart needs to be fairly tall.

[Figure: bar graph of number of students (thousands), vertical axis 0 to 9000, by group: American Indian, Asian, non-Hispanic Black, Hispanic, non-Hispanic white, Foreign.]


1.13 (a) See the bar graph described below. The bars are given in the same order as the data in the table (the most obvious way), but that is not necessary (since the variable is nominal, not ordinal). (b) A pie chart would not be appropriate, since the different entries in the table do not represent parts of a single whole.

[Figure: bar graph of percent of female doctorates, vertical axis 0 to 60, by field: Comp. Sci., Life Sci., Educ., Engin., Phys. Sci., Psych.]

1.14 (a) Below. For example, "Motor Vehicles" is 46% since $41{,}893/90{,}523 = 0.4627\ldots$. The "Other causes" category is needed so that the total is 100%. (b) Below. The bars may be in any order. (c) A pie chart could also be used, since the categories represent parts of a whole (all accidental deaths).

Cause             Percent
Motor vehicles    46
Falls             15
Drowning           4
Fires              4
Poisoning          8
Other causes      23

[Figure: bar graph of percent of accidental deaths, vertical axis 0 to 40, by cause: Motor Vehicle, Falls, Drowning, Fires, Poison, Other Causes.]
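The two pieces of arithmetic behind 1.14(a) can be checked as follows. This sketch uses only the figures quoted in the solution (41,893 motor vehicle deaths out of 90,523, and the rounded percents for the named causes); the variable names are illustrative.

```python
# Rounded percents for the named causes, as given in the solution.
named_percents = {
    "Motor vehicles": 46,
    "Falls": 15,
    "Drowning": 4,
    "Fires": 4,
    "Poisoning": 8,
}

# "Other causes" is whatever is needed to bring the total to 100%.
other_causes = 100 - sum(named_percents.values())

# Motor vehicle share computed from the raw counts.
motor_share = 41_893 / 90_523

print(other_causes)
print(round(motor_share * 100))
```

Because the named percents are rounded, a student who computes "Other causes" from raw counts may be off by a point from the value obtained by subtraction.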

1.15 Figure 1.10(a) is strongly skewed to the right with a peak at 0; Figure 1.10(b) is

somewhat symmetric with a central peak at 4. The peak is at the lowest value in 1.10(a),

and at a central value in 1.10(b).

1.16 The distribution is skewed to the right with a single peak. There are no gaps or outliers.

1.17 There are two peaks. Most of the ACT states are located in the upper portion of the

distribution, since in such states, only the stronger students take the SAT.

1.18 The distribution is roughly symmetric. There are peaks at .230-.240 and .270-.290. The middle of the distribution is about .270. Ignoring the outlier, the range is about $.345 - .185 = .160$ (or $.350 - .180 = .170$).

1.19 Sketches will vary. The distribution of coin years would be left-skewed because newer

coins are more common than older coins.
