Mrsrisinger.weebly.com



Name: ______________________________    Test Review 2.1 Displaying Data    Period: ______

1. How much oil wells in a given field will ultimately produce is crucial information in deciding whether to drill more wells. Here are the estimated total amounts of oil recovered from 38 wells in the Devonian Richmond Dolomite area of the Michigan basin, in thousands of barrels. The data are provided in ascending order (read down each column), along with a dotplot.

 3  22  35  43  49  57  70   92
13  25  35  43  50  59  70   98
15  31  37  45  50  63  74  157
19  33  37  46  53  65  80
21  35  38  48  56  66  82

[Dotplot not reproduced.]

What measures would you use to describe the center and spread of these data? Justify your answer.
Answer: Use the median and interquartile range. The distribution is skewed and has a strong outlier, and these measures are resistant to outliers.

Find the five-number summary for these data.
Answer: Min = 3, Q1 = 35, Median = 47, Q3 = 65, Max = 157

Are there any outliers? Justify your answer.
Answer: IQR = 65 – 35 = 30, so 1.5 x IQR = 1.5 x 30 = 45. Lower cutoff: 35 – 45 = –10 (no low outliers). Upper cutoff: 65 + 45 = 110, so 157 is an outlier.

Draw a boxplot of this distribution.
[Boxplot not reproduced.]

For the oil well data above, how can you tell, without doing any calculations, that the mean of these data is larger than the median?
Answer: The mean is not resistant, so the strong outlier on the right pulls it above the median, which is not influenced by the outlier.

2. Nitrates are organic compounds that are a substantial component of agricultural fertilizers. When those fertilizers run off into streams, the nitrates can have a toxic effect on animals that live in those streams. An ecologist studying nitrate pollution in two streams collects data on nitrate concentrations at 42 places on Stony Brook and 42 places on Mill Brook. His results are given in the dotplots and computer output below.
[Dotplots not reproduced.]

Variable      Total Count   Mean   SE Mean   StDev   Minimum     Q1   Median      Q3   Maximum
Stony Brook            42  5.524     0.451   2.922     0.000  3.500    5.000   7.500    12.000
Mill Brook             42  7.929     0.711   4.607     0.000  4.500    8.250  10.500    20.000

Determine if there are any outliers in each distribution. Show your work.
Answer: Low outliers fall below Q1 – 1.5(IQR) and high outliers fall above Q3 + 1.5(IQR), where IQR = Q3 – Q1.
For Stony Brook: lower cutoff 3.5 – 1.5(7.5 – 3.5) = –2.5, so no low outliers; upper cutoff 7.5 + 1.5(7.5 – 3.5) = 13.5, so no high outliers.
For Mill Brook: lower cutoff 4.5 – 1.5(10.5 – 4.5) = –4.5, so no low outliers; upper cutoff 10.5 + 1.5(10.5 – 4.5) = 19.5, so 20 is a high outlier.

Draw parallel boxplots of these two distributions. Be sure to label the plots and provide a scale.
[Parallel boxplots not reproduced.]

Write a few sentences comparing the nitrate concentrations in Stony Brook and Mill Brook.
Answer: Stony Brook has a lower median (5 mg/l) than Mill Brook (8.25 mg/l). Stony Brook also has noticeably less variation, with a range of 12 and an IQR of 4, while Mill Brook has a range of 20 and an IQR of 6. Both distributions are reasonably symmetric; however, Mill Brook has an outlier on the upper end. At least 50% of Mill Brook's readings are above at least 75% of Stony Brook's readings, since Mill Brook's median (8.25) exceeds Stony Brook's Q3 (7.5).

3. On August 7, 2007, Barry Bonds hit his 756th home run, breaking the all-time career home run record, formerly held by Hank Aaron. Does that make Bonds a better home run hitter than Aaron? Let's compare their annual home run production over their entire careers. Below is a side-by-side stemplot. (Bonds played between 1986 and 2007. Aaron played between 1954 and 1976.)

Number of Home Runs per Year

         Bonds |   | Aaron
             5 | 0 |
           9 6 | 1 | 0 2 3
     8 6 5 5 4 | 2 | 0 4 6 7 9
   7 7 4 4 3 3 | 3 | 0 2 4 4 8 9 9
 9 6 6 5 5 2 0 | 4 | 0 0 4 4 4 4 5 7
               | 5 |
               | 6 |
             3 | 7 |

Key: 1|4 = 14 home runs

Use the plot to write a few sentences comparing Bonds and Aaron as home run hitters.
Answer: For the most part, the two distributions are very similar. The centers for both players are in the mid-30s.
The spreads are similar, with the exception of Bonds's 73-home-run season, which appears to be an outlier because there is a gap from 49 to 73. Apart from that season, both players' distributions are skewed left.

Draw parallel boxplots of these two distributions. Be sure to label the plots and provide a scale.
[Parallel boxplots for Aaron and Bonds on a common axis marked 5, 15, 25, 35, 45, 55, 65, 75; not reproduced.]

4. The histogram below shows the number of hurricanes making landfall in the United States from 1900 to 2008. Describe the shape, center, and spread of the distribution.
[Histogram not reproduced.]
Answer: The distribution is skewed right, with a single peak at 1 hurricane and a range of 6 hurricanes.

[Boxplots of data sets I and II not reproduced.]

5. The boxplots shown above summarize two data sets, I and II. Based on the boxplots, which of the following statements about these two data sets CANNOT be justified?
(A) The range of data set I is equal to the range of data set II.
(B) The interquartile range of data set I is equal to the interquartile range of data set II.
(C) The median of data set I is less than the median of data set II.
(D) Data set I and data set II have the same number of data points.
(E) About 75% of the values in data set II are greater than or equal to about 50% of the values in data set I.

6. The statistics below provide a summary of the distribution of heights, in inches, for a simple random sample of 200 young children.
Mean: 46 inches
Median: 45 inches
Standard Deviation: 3 inches
First Quartile: 43 inches
Third Quartile: 48 inches

About 100 children in the sample have heights that are
(A) less than 43 inches
(B) less than 48 inches
(C) between 43 and 48 inches
(D) between 40 and 52 inches
(E) more than 46 inches

7. The stemplot below shows the yearly earnings per share of stock for two different companies over a sixteen-year period.

                     Company A |   | Company B
                               | 0 | 58, 75, 96, 98
92, 91, 90, 82, 78, 43, 38, 26 | 1 | 01, 10, 17, 21, 43, 43, 53, 65, 73
                49, 47, 44, 00 | 2 | 09, 27, 29
                73, 27, 05, 02 | 3 |

Which of the following statements is true?
(A) The median of the earnings of Company A is less than the median of the earnings of Company B.
(B) The range of the earnings of Company A is less than the range of the earnings of Company B.
(C) The third quartile of Company A is smaller than the third quartile of Company B.
(D) The mean of the earnings of Company A is greater than the mean of the earnings of Company B.
(E) The interquartile range of Company A is twice the interquartile range of Company B.

[Boxplots of petal lengths for species A and species B not reproduced.]

8. A botanist is studying the petal lengths, measured in millimeters, of two species of lilies. The boxplots above illustrate the distribution of petal lengths from two samples of equal size, one from species A and the other from species B. Based on these boxplots, which of the following is a correct conclusion about the data collected in this study?
(A) The interquartile ranges are the same for both samples.
(B) The range for species B is greater than the range for species A.
(C) There are more petal lengths that are greater than 70 mm for species A than there are for species B.
(D) There are more petal lengths that are greater than 40 mm for species B than there are for species A.
(E) There are more petal lengths that are less than 30 mm for species B than there are for species A.
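
As a rough check on the earnings-per-share stemplot question above, the following Python sketch expands both sides of the stemplot into values and compares the measures named in the answer choices. The worksheet gives no key for this stemplot, so reading each value as 100 x stem + leaf is an assumption (the comparisons do not depend on the unit chosen). The names `COMPANY_A`, `COMPANY_B`, `expand`, and `summarize` are made up for illustration, and quartiles are taken as the medians of the lower and upper halves of the sorted data.

```python
from statistics import mean, median

# Leaves read from the back-to-back stemplot, keyed by stem.
# ASSUMPTION: the worksheet gives no key, so each value is taken as
# 100 * stem + leaf, in unspecified units.
COMPANY_A = {1: [26, 38, 43, 78, 82, 90, 91, 92],
             2: [0, 44, 47, 49],
             3: [2, 5, 27, 73]}
COMPANY_B = {0: [58, 75, 96, 98],
             1: [1, 10, 17, 21, 43, 43, 53, 65, 73],
             2: [9, 27, 29]}


def expand(stemplot):
    """Turn {stem: [leaves]} into a sorted list of values."""
    return sorted(100 * stem + leaf
                  for stem, leaves in stemplot.items()
                  for leaf in leaves)


def summarize(values):
    """Median, quartiles (medians of the two halves), range, IQR, and mean."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])               # lower half
    q3 = median(data[len(data) - half:])   # upper half
    return {"median": median(data), "q1": q1, "q3": q3,
            "range": data[-1] - data[0], "iqr": q3 - q1,
            "mean": mean(data)}


summary_a = summarize(expand(COMPANY_A))
summary_b = summarize(expand(COMPANY_B))
for name, summary in (("Company A", summary_a), ("Company B", summary_b)):
    print(name, summary)
```

Under this reading, Company A has the larger median, range, third quartile, and mean, and its interquartile range (95.5) is not twice Company B's (69.5).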
[Histogram not reproduced.]

9. To which of the boxplots does the histogram above belong?
(A) through (E): [Five boxplots, each drawn on an axis marked 0, 10, 20, 30, 40, 50, 60; not reproduced.]

10. Which of the following is a true statement?
(A) Stemplots are useful for both quantitative and categorical data sets.
(B) Stemplots are equally useful for small and very large data sets.
(C) Stemplots can show symmetry, gaps, clusters and outliers.
(D) Stems should be skipped only if there is no data value for the particular stem.
(E) Whether or not to provide a key depends upon the relative importance of the data being displayed.

[Cumulative frequency plot: horizontal axis "Ages of Employees" marked 20, 30, 40, 50, 60; vertical axis "Cumulative Frequency" marked .25, .50, .75, 1.0; not reproduced.]

11. Given the cumulative plot above, and using the most commonly accepted definition of outliers, what ages would be considered outliers?
(A) Between 20 and 25
(B) Between 20 and 30
(C) Between 20 and 40
(D) Between 20 and 25, or between 55 and 60
(E) Between 20 and 30, or between 50 and 60

12. Given that the median is 270 and the interquartile range is 20, which of the following statements is true?
(A) Fifty percent of the data are greater than or equal to 270.
(B) Fifty percent of the data are between 260 and 280.
(C) Seventy-five percent of the data are less than or equal to 280.
(D) The mean is between 250 and 290.
(E) The standard deviation is approximately 13.5.

13. A random sample of golf scores gives the following summary statistics: n = 20, x̄ = 84.5, Sx = 11.5, minX = 68, Q1 = 78, Med = 86, Q3 = 91, maxX = 112. What can be said about the number of outliers?
(A) 0
(B) 1
(C) 2
(D) At least 1
(E) At least 2

[Parallel boxplots of sets A and B on a common axis marked 2, 4, 6, 8, 10, 12, 14, 16; not reproduced.]

14. Given these parallel boxplots, which of the following is incorrect?
(A) The ranges are the same.
(B) The interquartile ranges are the same.
(C) Both sets are skewed to both lower and higher values.
(D) Set A may not be symmetric.
(E) Set A may have 100 times as many values as set B.

15. The amount of Omega 3 fish oil in capsules labeled 1,000 mg is measured for four manufacturers' products, yielding the following:
[Plot: vertical axis "Fish Oil (mg.)" marked 970, 980, 990, 1000, 1010, 1020, 1030; horizontal axis "Manufacturer" with A, B, C, D; not reproduced.]
Which of the manufacturers' samples has the smallest range?
(A) A
(B) B
(C) C
(D) D
(E) There is insufficient information to answer this question.

16. Which of the following statements is true?
(A) Displaying outliers is less problematic when using histograms than when using stemplots.
(B) Histograms are more widely used than stemplots or dotplots because histograms display the values of individual observations.
(C) Unlike other graphs, histogram axes do not need to be labeled.
(D) A histogram of a categorical variable can pinpoint clusters and gaps.
(E) Two students working with the same set of data may come up with histograms that look different.

17. Which of the following statements is incorrect?
(A) The range of the sample data set can never be greater than the range of the population.
(B) While the range is affected by outliers, the interquartile range is not.
(C) Changing the order from ascending to descending changes the sign of the range.
(D) The range is a single number, not an interval of values.
(E) The interquartile range is the range of the middle half of the data.

[Histogram on an axis running from 50 to 130; not reproduced.]

18. Given the histogram above, and using the most commonly accepted definition of outliers, what values would be considered outliers?
(A) Between 115 and 120.
(B) Between 110 and 120.
(C) Between 50 and 55, or between 115 and 120.
(D) Between 50 and 55, or between 110 and 120.
(E) There are no outliers.

[Three parallel boxplots on a common axis marked 2, 6, 10, 14, 18, 22; not reproduced.]

19. Given the parallel boxplots above, which of the following is true?
(A) All three distributions have the same range.
(B) All three distributions have the same interquartile range.
(C) All three medians are between 9 and 13.
(D) All three distributions appear to be skewed right.
(E) All three distributions can reasonably be assumed to be samples from normally distributed populations.

[Boxplot on an axis marked 0, 10, 20, 30, 40, 50, 60; not reproduced.]

20. To which histogram can the above boxplot be attributed?
(A) through (E): [Five histograms, each drawn on an axis marked 0, 10, 20, 30, 40, 50, 60; not reproduced.]

[Dotplot not reproduced.]

21. Which of the following measures are most usually given to describe the center and spread of a distribution as given in the dotplot?
(A) Mean and standard deviation
(B) Mean and interquartile range
(C) Mean and range
(D) Median and interquartile range
(E) Median and range

22. When there are multiple gaps and clusters, which of the following is the best choice to give an overall picture of a distribution?
(A) Mean and standard deviation
(B) Median and interquartile range
(C) Boxplot with its five-number summary
(D) Stemplot or histogram
(E) None of the above are really helpful in showing gaps and clusters

23. In which of the following histograms is the mean less than the median?
(A) through (E): [Five histograms not reproduced.]
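
The five-number summary and the 1.5 x IQR outlier rule used throughout this review can be sketched in Python. This is a minimal illustration rather than a prescribed method: the function names `five_number_summary` and `cutoffs` are made up here, and the quartiles are assumed to be the medians of the lower and upper halves of the sorted data, which reproduces the worksheet's answers for the oil well problem. The oil values are read from the data listing in the first problem, and the golf figures are the summary statistics given in the golf score question.

```python
from statistics import mean, median

# Oil recovered (thousands of barrels) from the 38 wells in the first problem.
OIL = [3, 13, 15, 19, 21, 22, 25, 31, 33, 35, 35, 35, 37, 37, 38, 43, 43, 45,
       46, 48, 49, 50, 50, 53, 56, 57, 59, 63, 65, 66, 70, 70, 74, 80, 82, 92,
       98, 157]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    ASSUMPTION: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data (the middle value is left out when n is odd).
    """
    data = sorted(values)
    half = len(data) // 2
    return (data[0], median(data[:half]), median(data),
            median(data[len(data) - half:]), data[-1])


def cutoffs(q1, q3):
    """Lower and upper outlier cutoffs from the 1.5 x IQR rule."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


minimum, q1, med, q3, maximum = five_number_summary(OIL)
low, high = cutoffs(q1, q3)
oil_outliers = [x for x in OIL if x < low or x > high]
print("five-number summary:", (minimum, q1, med, q3, maximum))
print("cutoffs:", (low, high), "outliers:", oil_outliers)
print("mean above median:", mean(OIL) > median(OIL))

# Golf scores: only summary statistics are known, so only the extremes
# (minX = 68, maxX = 112) can be checked against the cutoffs.
golf_low, golf_high = cutoffs(78, 91)
golf_min_is_outlier = 68 < golf_low
golf_max_is_outlier = 112 > golf_high
print("golf cutoffs:", (golf_low, golf_high),
      "min outlier:", golf_min_is_outlier, "max outlier:", golf_max_is_outlier)
```

For the golf scores the cutoffs come out to 58.5 and 110.5, so the maximum of 112 is a high outlier while the minimum of 68 is not. The summary statistics alone cannot show whether any other score also exceeds 110.5.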