2



2.1

A. A 26% 13/50

B 56% 28/50

C 18% 9/50

B.

[pic]

C.

D.[pic]

2.15

A

91 94 97 100 102 102 103 108 111 112 115 115 116 116 117 117 117 122 122 123 124 128 129 130 132

B

The stem-and-leaf display gives more information than the ordered array because it makes it easier to see where the bulk of the values lie and which values occur most often. The stem-and-leaf display is a good choice here only because the values are concentrated; the more dispersed the values, the less effective it becomes.

C

117 gallons is the amount most likely to be purchased since it is the mode. The value 117 occurs three times, while no other value occurs more than twice.

D

Yes, there is a concentration in the middle-to-upper part of the 11 stem (115 to 117). It is interesting to note that there are no values at 118, 119, 120, or 121 before the next cluster of values at 122 to 124.
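The stem-and-leaf display and the mode discussed above can be reproduced with a short script. This is only a sketch: the variable names are illustrative, and it assumes the stem is every digit except the last and the leaf is the last digit.

```python
from collections import Counter, defaultdict

# Ordered array from part A (gallons purchased).
gallons = [91, 94, 97, 100, 102, 102, 103, 108, 111, 112, 115, 115, 116,
           116, 117, 117, 117, 122, 122, 123, 124, 128, 129, 130, 132]

# Assumption: stem = all digits except the last, leaf = last digit.
stems = defaultdict(list)
for value in sorted(gallons):
    stems[value // 10].append(value % 10)

# Print one row per stem, leaves in ascending order.
for stem in sorted(stems):
    print(f"{stem:>2} | {''.join(str(leaf) for leaf in stems[stem])}")

# The mode is the most frequent value in the array.
mode, frequency = Counter(gallons).most_common(1)[0]
print("mode:", mode, "occurs", frequency, "times")
```

The long row for stem 11 is the concentration described in part D.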

2.19

A 35 85 110 120 170 180 240 260 300 380 380 460

B

|3 |5 |16 |  |29 |  |42 |  |

|4 |  |17 |0 |30 |0 |43 |  |

|5 |  |18 |0 |31 |  |44 |  |

|6 |  |19 |  |32 |  |45 |  |

|7 |  |20 |  |33 |  |46 |0 |

|8 |5 |21 |  |34 |  | | |

|9 |  |22 |  |35 |  | | |

|10 |  |23 |  |36 |  | | |

|11 |0 |24 |0 |37 |  | | |

|12 |0 |25 |  |38 |00 | | |

|13 |  |26 |0 |39 |  | | |

|14 |  |27 |  |40 |  | | |

|15 |  |28 |  |41 |  | | |

C In this situation the ordered array is more useful to the reader since the values show no concentration. Neither display is really a good fit for this data, though. A box-and-whisker plot, or some other display that better represents widely spread data and extreme values, would be more appropriate.

D According to the data, 380 is the only battery life that occurs twice; every other value occurs once. This is misleading, though. If we split the data at 260 (stem 26), 8 of our 12 values are at or below 260, while only 4 are above it.

2.25

A.

(1870-1522)/6=58

|Insulator Force Frequency Distribution |

|Force Applied |Frequency |Percentage |

|1580 but less than 1638 |5 |16.67% |

|1638 but less than 1696 |6 |20.00% |

|1696 but less than 1754 |7 |23.33% |

|1754 but less than 1812 |8 |26.67% |

|1812 but less than 1870 |4 |13.33% |
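The percentage column follows directly from the frequencies. A minimal check, assuming only the frequencies and the class-width formula shown above (the variable names are illustrative):

```python
# Class frequencies copied from the table above.
frequencies = [5, 6, 7, 8, 4]
total = sum(frequencies)

# Percentage of observations in each class, rounded to two decimals.
percentages = [round(100 * f / total, 2) for f in frequencies]

# Class width from the range and the number of classes used above.
class_width = (1870 - 1522) / 6

print(total, percentages, class_width)
```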

B.

[pic]

[pic]

D. The insulators are of significant strength since not a single value was below the required breaking point. All but three of the insulators had breaking points in excess of 1600lbs.

3.1

n = 5

Sample Data set {7,4,9,8,2}

A. Mean

Sample mean is the sum of the values divided by number of values (PG 97)

(7+4+9+8+2)/5= 6

Median

The middle value in an ordered data set. With an odd number of values, the median is the middle value. With an even number of values, the median is the average of the two middle values.

Median = (N+1)/2 ranked value (PG 99)

2,4,7,8,9 (5+1)/2=3 which is the third value which is 7

Mode

The value that appears most often in a data set, often there is no mode (PG100)

{7,4,9,8,2} There is no mode for this data set

B. Range

The range is equal to the largest value minus the smallest value (PG105)

9-2 =7

First Quartile (PG101)

Q1 = (n+1)/4 ranked value

Third Quartile

Q3 = 3(n+1)/4 ranked value

Rules for Quartiles (PG102)

Rule 1: if the result is a whole number, then the quartile is equal to that ranked value

Rule 2: if the result is a fractional half (e.g. 1.5), then the quartile is the average of the two closest ranked values

Rule 3: if the result is neither a whole number nor a fractional half, round to the closest ranked value.

Inter Quartile Range

Also called Midspread is the difference between the third and first quartile. (PG106)

{7,4,9,8,2}

Q1=6/4=1.5

Average between 2 and 4 =3

Q3=18/4= 4.5

Average between 8 and 9 =8.5

Inter Quartile Range for this data set would be 8.5-3= 5.5
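The three quartile rules can be written as a small function. This is a sketch of the ranked-value rules as stated above, not a library routine; `ranked_value` and `quartiles` are names made up for illustration, and the code assumes at least two data values.

```python
def ranked_value(sorted_data, position):
    """Return the value at a 1-based ranked position using the three rules."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        # Rule 1: whole number, take that ranked value.
        return sorted_data[whole - 1]
    if fraction == 0.5:
        # Rule 2: fractional half, average the two closest ranked values.
        return (sorted_data[whole - 1] + sorted_data[whole]) / 2
    # Rule 3: any other fraction, round to the closest ranked value.
    nearest = whole + 1 if fraction > 0.5 else whole
    return sorted_data[nearest - 1]


def quartiles(data):
    """Return (Q1, Q3) using the (n+1)/4 and 3(n+1)/4 ranked positions."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    return ranked_value(ordered, (n + 1) / 4), ranked_value(ordered, 3 * (n + 1) / 4)


q1, q3 = quartiles([7, 4, 9, 8, 2])
print(q1, q3, q3 - q1)
```

Spreadsheet and library quartile functions often use a different interpolation method, so their results may not match these rules on every data set.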

Sample Variance

Is the sum of the squared differences around the mean divided by the sample size minus 1 (PG107)

Data set {7,4,9,8,2}

In Excel use the formula VAR for answer of 8.5

By hand

Mean = 6

Sample size =5

[(7-6)^2 + (4-6)^2 + (9-6)^2 + (8-6)^2 + (2-6)^2] / (5-1)

= [(1)^2 + (-2)^2 + (3)^2 + (2)^2 + (-4)^2] / 4

= (1+4+9+4+16) / 4 = 34/4 = 8.5 as variance

Standard deviation

The sample standard deviation is the square root of the sum of the squared differences around the mean divided by the sample size minus one. (PG 107)

NOTE: To calculate take the square root of the variance.

(8.5)^(1/2)=2.92

To calculate using excel, use the formula STDEV

Coefficient of Variation

The coefficient of variation is equal to the standard deviation divided by the mean, multiplied by 100% (PG 110)

STD=2.92

=2.92/6*100%

=48.67% (48.59% if the unrounded standard deviation 2.915476 is used)
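The hand calculations for variance, standard deviation, and coefficient of variation can be cross-checked outside Excel as well. The sketch below uses Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n - 1 like the sample formulas above; the variable names are illustrative.

```python
import statistics

sample = [7, 4, 9, 8, 2]

mean = statistics.mean(sample)
variance = statistics.variance(sample)   # sample variance, divides by n - 1
std_dev = statistics.stdev(sample)       # square root of the sample variance
cv = std_dev / mean * 100                # coefficient of variation, in percent

# The same variance written out as in the hand calculation.
by_hand = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)

print(mean, variance, by_hand, round(std_dev, 2), round(cv, 2))
```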

C. Z Scores

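An extreme value or outlier is a value located far away from the mean. A value is considered an outlier if its Z score is less than -3.0 or greater than +3.0. The Z score is the difference between the value and the mean, divided by the standard deviation.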

Mean = 6

|Value |Formula |Z score |Outlier |

|7 |(7-6)/2.915476 |0.342997 |N |

|4 |(4-6)/2.915476 |-0.68599 |N |

|9 |(9-6)/2.915476 |1.028991 |N |

|8 |(8-6)/2.915476 |0.685994 |N |

|2 |(2-6)/2.915476 |-1.37199 |N |
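The Z score table can be generated in a few lines. This is a sketch under the same definitions used above (sample standard deviation, outlier cutoff of plus or minus 3.0); the names `z_scores` and `outliers` are illustrative.

```python
import statistics

sample = [7, 4, 9, 8, 2]
mean = statistics.mean(sample)
std_dev = statistics.stdev(sample)   # sample standard deviation

# Z score = (value - mean) / standard deviation.
z_scores = {x: (x - mean) / std_dev for x in sample}

# A value is flagged as an outlier when its Z score is beyond +/- 3.0.
outliers = [x for x, z in z_scores.items() if abs(z) > 3.0]

for x, z in z_scores.items():
    print(x, round(z, 6))
print("outliers:", outliers)
```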

D. Shape of data set (PG 112,113)

Symmetrical (bell-shaped curve): Mean = Median

Negative or left skewed: Mean < Median

Positive or right skewed: Mean > Median

In our data set of {7,4,9,8,2}, the mean is 6 and the median is 7, so the distribution is negative or left skewed.
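The mean-versus-median comparison can be expressed as a small helper. This is a sketch of the rule of thumb above only; `skew_direction` is a made-up name, and comparing mean and median is a rough indicator rather than a formal skewness statistic.

```python
import statistics


def skew_direction(data):
    """Classify shape by comparing the mean with the median."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean < median:
        return "left (negative) skewed"
    if mean > median:
        return "right (positive) skewed"
    return "symmetrical"


print(skew_direction([7, 4, 9, 8, 2]))
```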

3.3

N=7, {12, 7, 4, 9, 0, 7, 3}

A. Mean= 42/7=6

Median= (7+1)/2=4th ranked value of {0, 3, 4, 7, 7, 9, 12} = 7

Mode=7

B. Range= 12-0=12

Q1= (7+1)/4=2 ranked value, which is 3

Q3=3(7+1)/4=6 ranked value, which is 9

Inter Quartile Range= 9-3=6

Variance=

[(0-6)^2+(3-6)^2+(4-6)^2+(7-6)^2+(7-6)^2+(9-6)^2+(12-6)^2] / (7-1)

= (36+9+4+1+1+9+36) / 6 = 96/6 = 16

Standard Deviation= 16^(1/2)=4

Coefficient of variation= 4/6*100= 66.67

C. Z Scores

|Value |Formula |Z score |Outlier |

|0 |(0-6)/4 |-1.5 |N |

|3 |(3-6)/4 |-0.75 |N |

|4 |(4-6)/4 |-0.5 |N |

|7 |(7-6)/4 |0.25 |N |

|7 |(7-6)/4 |0.25 |N |

|9 |(9-6)/4 |0.75 |N |

|12 |(12-6)/4 |1.5 |N |

D. Shape of data set

Mean = 6, Median = 7

Our curve will be negative or left skewed.

3.7

A.

Mean 31+33.75+35.05+36.15+40.25+43=219.20, 219.20/6=$36.53

Median (6+1)/2 = 3.5 ranked value, (35.05+36.15)/2 = 71.20/2 = $35.60

Q1 (6+1)/4 = 1.75 ranked value = 33.75 (rounding rule)

Q3 3(6+1)/4= 5.25 ranked value = 40.25 (rounding rule)

B.

Variance

[(31-36.53)^2+(33.75-36.53)^2+(35.05-36.53)^2+(36.15-36.53)^2+(40.25-36.53)^2+(43-36.53)^2] / (6-1)

= [(-5.53)^2+(-2.78)^2+(-1.48)^2+(-.38)^2+(3.72)^2+(6.47)^2] / 5

= (30.581 + 7.728 + 2.190 + .144 + 13.838 + 41.861) / 5

= 96.343/5 = 19.269

Excel formula answer is 19.26867

STD =19.269^.5=4.390

Excel Formula answer is 4.389609

Range

43.00 – 31.00 = 12

Inter Quartile Range

40.25 – 33.75 = 6.5

Coefficient of Variation

STD /mean*100%

4.390/36.53*100% = 12.02%

C. Is the data skewed? If so, how?

Median =35.60, Mean =36.53

Since the mean is greater than the median the data is positively skewed or right skewed.

D. Conclusions about going to movies based on information from part (a) and then conclusions based on information from part (b).

The information in part (a) describes the central tendency of the data: the mean, median, first quartile, and third quartile. It tells me what I might expect to pay on average if I went to randomly chosen movie theaters over a long enough time. The information in part (b) describes the variation in the data: how widely the possible values I may encounter are spread around that center.

3.13

A.

Money Market Accounts {4.55 4.50 4.40 4.38 4.38 }

Mean:22.21/5=4.442

Variance:

[(4.55 - 4.442)^2 + (4.50 - 4.442)^2 + (4.40 - 4.442)^2 + (4.38 - 4.442)^2 + (4.38 - 4.442)^2] / (5-1)

= [(.108)^2 + (.058)^2 + (-.042)^2 + (-.062)^2 + (-.062)^2] / 4

= (.011664 + .003364 + .001764 + .003844 + .003844) / 4

= .02448/4 = .00612 Excel answer = .00612

Standard Deviation =.00612^.5=.07823

Q1= (n+1)/4 ranked value = 6/4 = 1.5 ranked value, which is (4.38+4.38)/2 = 4.38

Q3= 3(n+1)/4 ranked value = 18/4 = 4.5 ranked value, which is (4.50+4.55)/2 = 4.525

Inter quartile Range = 4.525-4.38 = 0.145

Coefficient of Variation

STD/mean*100%

.07823/4.442*100 = 1.761144

One Year CD’s {4.94 4.90 4.85 4.85 4.85}

Mean = 24.39/5=4.878

Variance

[(4.94-4.878)^2 + (4.90-4.878)^2 + (4.85-4.878)^2 + (4.85-4.878)^2 + (4.85-4.878)^2] / (5-1)

= [(0.062)^2 + (0.022)^2 + (-0.028)^2 + (-0.028)^2 + (-0.028)^2] / 4

= (.003844 + .000484 + .000784 + .000784 + .000784) / 4

= .00668/4 = .00167 Excel answer = .00167

Standard Deviation

=.00167^.5 =0.040866 Excel answer = 0.04086563

Difference in answers is in rounding

Inter quartile Range

Q1= (n+1)/4 ranked value =6/4=1.5ranked value which is 4.85

Q3=3(n+1)/4 ranked value=18/4=4.5 ranked value which is (4.94+4.90)/2=4.92

Inter quartile range = 4.92-4.85=.07

Coefficient of Variation

STD/mean*100%

=0.040866/4.878*100=0.837761

B. Money market accounts have more variation than one-year CDs. This can be seen in the variance, the ranges, and the standard deviation. Each of these measures shows how dispersed the data are: the smaller the number, the more concentrated the data set. The money market figures for all of these measures are larger than those for the one-year CDs.
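The comparison in part B can be checked by computing the coefficient of variation for both sets of yields. This is a sketch using the sample formulas from part A; `coefficient_of_variation` is an illustrative helper name, not a standard library function.

```python
import statistics

money_market = [4.55, 4.50, 4.40, 4.38, 4.38]
one_year_cd = [4.94, 4.90, 4.85, 4.85, 4.85]


def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(data) / statistics.mean(data) * 100


cv_mm = coefficient_of_variation(money_market)
cv_cd = coefficient_of_variation(one_year_cd)

print("money market CV:", round(cv_mm, 3))
print("one-year CD CV:", round(cv_cd, 3))
print("money market varies more:", cv_mm > cv_cd)
```

Because the coefficient of variation is relative to the mean, it allows the two products to be compared even though their average yields differ.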

3.21

The following is a data set for a population with N=10

{7,5,11,8,3,6,2,1,9,8}

A. Compute the population mean:

The Population mean is the sum of all the values in the population divided by the population size of N (PG118).

Sum of values is 60. Population mean is 60/10=6

B. Compute the population standard deviation.

The population variance is the sum of the squared differences around the population mean divided by the population size.

Differences between Population variance and Sample variance;

-POP divide by N, Sample divide by n-1

-Population must use POP size (N) and POP mean (µ), sample must use sample size (n) and sample mean (X̄)

Population Variance:

= [(1-6)^2+(2-6)^2+(3-6)^2+(5-6)^2+(6-6)^2+(7-6)^2+(8-6)^2+(8-6)^2+(9-6)^2+(11-6)^2] / 10

= [(-5)^2+(-4)^2+(-3)^2+(-1)^2+(0)^2+(1)^2+(2)^2+(2)^2+(3)^2+(5)^2] / 10

= (25+16+9+1+0+1+4+4+9+25) / 10

= 94/10

= 9.4

Population STD

=9.4^.5 =3.065941943

EXCEL answer = 3.065941943
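The difference between dividing by N and dividing by n - 1 is easy to see side by side. The sketch below uses Python's standard `statistics` module, where `pvariance` and `pstdev` are the population versions and `variance` is the sample version; the variable names are illustrative.

```python
import statistics

population = [7, 5, 11, 8, 3, 6, 2, 1, 9, 8]

mu = statistics.mean(population)
pop_variance = statistics.pvariance(population)   # divides by N
pop_std_dev = statistics.pstdev(population)
sample_variance = statistics.variance(population) # divides by n - 1, for contrast

# Population variance written out as in the hand calculation.
by_hand = sum((x - mu) ** 2 for x in population) / len(population)

print(mu, pop_variance, by_hand, pop_std_dev)
print("sample formula would give:", sample_variance)
```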

3.23

Solved Using Excel

A.

Mean =514/50=10.28 (thousands)

Variance =VARP(A2:A51) = 4.0984 (population variance; VAR returns the sample variance, 4.182041)

Standard Deviation =STDEVP(A2:A51) = 2.024451

B.

|Lower bound |Upper bound |Std. deviations from mean |Occurrences |

| |16.27335 |+3 |2 |

| |14.2489 |+2 |8 |

|8.255549 |12.22445 |within 1 (plus/minus) |32 |

|6.231099 | |-2 |7 |

|4.206648 | |-3 |1 |

|4.206648 | |-3 |1 |

| |Separate |Inclusive |% |

|Within 1 |32 |32 |0.64 |

|Within 2 |15 |47 |0.94 |

|Within 3 |3 |50 |1 |

C. The empirical rule (PG 120)

68% results should be within plus/minus 1 deviation

95% results should be within plus/minus 2 deviations

99.7% results should be within plus/minus 3 deviations

These results are close to the empirical rule, as outlined in the table below:

|Actual % |Empirical Rule % |

|0.64 |.68 |

|0.94 |.95 |

|1 |.997 |
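The share of values within 1, 2, and 3 standard deviations can be counted with a short function. The 50 values for this problem are not listed in this write-up, so the sketch below is exercised on the small population from 3.21 purely as a stand-in; `share_within` is an illustrative name, and population formulas are assumed.

```python
import statistics


def share_within(data, k):
    """Fraction of values within k population standard deviations of the mean."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)
    inside = [x for x in data if mu - k * sigma <= x <= mu + k * sigma]
    return len(inside) / len(data)


# Stand-in data: the population from problem 3.21, not the 3.23 data.
population = [7, 5, 11, 8, 3, 6, 2, 1, 9, 8]
empirical_rule = {1: 0.68, 2: 0.95, 3: 0.997}

for k, expected in empirical_rule.items():
    print(k, share_within(population, k), "empirical rule:", expected)
```

With only ten values the shares move in steps of 0.1, so a close match to the empirical rule should not be expected from the stand-in data.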

3.31

The Five Number Summary consists of:

Smallest value, 1st Quartile, Median, 3rd Quartile, Largest number for a series (PG 123)

Chart comparing relationships of these numbers to distribution of data is also found on PG123

A box and Whisker plot is a Graphical summary of the Five number summary. (PG 124)

Important chart for applying box and whisker to distributions is figure 3.5 on pg 125

Data set ordered:

4 5 7 8 16 19 19 20 20 23 24 25 29 29 30 30 30 30 40 56

Smallest value: 4, Largest value: 56

Median: (23+24)/2 = 23.5

Q1 = (n+1)/4 =21/4=5.25 ranked value= 16

Q3 = 3(n+1)/4 = 63/4 = 15.75 ranked value = 30 (rounding rule: 16th ranked value)
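The five-number summary for this data set can be reproduced with the ranked-value rules from PG102. This is a sketch, not a library routine: `ranked_value` and `five_number_summary` are illustrative names, and the code assumes at least two values.

```python
def ranked_value(ordered, position):
    """Value at a 1-based ranked position: whole, half, or rounded."""
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]
    if fraction == 0.5:
        return (ordered[whole - 1] + ordered[whole]) / 2
    nearest = whole + 1 if fraction > 0.5 else whole
    return ordered[nearest - 1]


def five_number_summary(data):
    """Return (smallest, Q1, median, Q3, largest)."""
    ordered = sorted(data)
    n = len(ordered)
    return (
        ordered[0],
        ranked_value(ordered, (n + 1) / 4),
        ranked_value(ordered, (n + 1) / 2),
        ranked_value(ordered, 3 * (n + 1) / 4),
        ordered[-1],
    )


values = [4, 5, 7, 8, 16, 19, 19, 20, 20, 23,
          24, 25, 29, 29, 30, 30, 30, 30, 40, 56]
print(five_number_summary(values))
```

These five numbers are the ones a box-and-whisker plot draws: the whiskers end at the smallest and largest values, and the box spans Q1 to Q3 with a line at the median.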

[pic]

Computer Assignments sections.

Throughout the homework I have checked and verified my answers using Microsoft Excel. I have found that my answers always match as long as I perform the due diligence of carrying my work out to enough decimal places.

As noted earlier in 3.1, 3.3, and 3.21, it takes a little over a page of simple arithmetic to compute the variance and standard deviation by hand, but with Excel it takes one line and far less time to check the work. Every calculation in that page of computation is a chance to make a simple mistake and come up with a wrong answer.

Doing the work by hand is a good way to understand the process and the expected results; however, it greatly increases the chance of an incorrect answer and takes an enormous amount of time, which carries an opportunity cost.
