CHAPTER 2 : DESCRIPTIVE STATISTICS



Introduction
Organizing and Graphing Qualitative Data
Organizing and Graphing Quantitative Data
Central Tendency Measurement
Dispersion Measurement
Mean, Variance and Standard Deviation for Grouped Data
Measure of Skewness

OBJECTIVES
After completing this chapter, students should be able to:
Create and interpret graphical displays involving qualitative and quantitative data.
Describe the difference between grouped and ungrouped frequency distributions, frequency and relative frequency, and relative frequency and cumulative relative frequency.
Identify and describe the parts of a frequency distribution: class boundaries, class width, and class midpoint.
Identify the shapes of distributions.
Compute, describe, compare and interpret the three measures of central tendency: mean, median, and mode for ungrouped and grouped data.
Compute, describe, compare and interpret the two measures of dispersion: range and standard deviation (variance) for ungrouped and grouped data.
Compute, describe, and interpret the two measures of position: quartiles and interquartile range for ungrouped and grouped data.
Compute, describe and interpret the measure of skewness: Pearson Coefficient of Skewness.

INTRODUCTION
Raw data - Data recorded in the sequence in which they are collected, before they are processed or ranked.
Array data - Raw data arranged in ascending or descending order.

Example 1
Here is a list of questions asked in a large statistics class and the "raw data" given by one of the students:
What is your sex (m = male, f = female)? Answer (raw data): m
How many hours did you sleep last night? Answer: 5 hours
Randomly pick a letter, S or Q. Answer: S
What is your height in inches? Answer: 67 inches
What's the fastest you've ever driven a car (mph)? Answer: 110 mph

Example 2
Quantitative raw data
These data are also called ungrouped data.
Qualitative raw data

ORGANIZING AND GRAPHING QUALITATIVE DATA
Frequency Distributions / Table
Relative Frequency and Percentage Distribution
Graphical Presentation of Qualitative Data

Frequency Distributions / Table
A frequency distribution for qualitative data lists all categories and the number of elements that belong to each of the categories.
It shows how the frequencies are distributed over the various categories.
It is also called a frequency distribution table or simply a frequency table.
The number of elements that belong to a certain category is called the frequency of that category.

Relative Frequency and Percentage Distribution
A relative frequency distribution is a listing of all categories along with their relative frequencies (given as proportions or percentages).
It is commonplace to give the frequency and relative frequency distributions together.
Calculating the relative frequency and percentage of a category:
Relative frequency of a category = (Frequency of that category) / (Sum of all frequencies)
Percentage = (Relative frequency) x 100

Example 3
A sample of UUM staff-owned vehicles produced by Proton was identified and the make of each noted.
The resulting sample follows (W = Wira, Is = Iswara, Wj = Waja, St = Satria, P = Perdana, Sv = Savvy):

W   W   P   Is  Is  P   Is  W   St  Wj
Is  W   W   Wj  Is  W   W   Is  W   Wj
Wj  Is  Wj  Sv  W   W   W   Wj  St  W
Wj  Sv  W   Is  P   Sv  Wj  Wj  W   W
St  W   W   W   W   St  St  P   Wj  Sv

Construct a frequency distribution table for these data with their relative frequencies and percentages.

Solution:

Category   Frequency   Relative Frequency   Percentage (%)
Wira       19          19/50 = 0.38         0.38 x 100 = 38
Iswara     8           0.16                 16
Perdana    4           0.08                 8
Waja       10          0.20                 20
Satria     5           0.10                 10
Savvy      4           0.08                 8
Total      50          1.00                 100

Graphical Presentation of Qualitative Data

1. Bar Graphs
A bar graph is a graph made of bars whose heights represent the frequencies of the respective categories.
Such a graph is most helpful when you have many categories to represent.
Notice that a gap is inserted between each of the bars.
The types of bar graph are:
=> simple/vertical bar chart
=> horizontal bar chart
=> component bar chart
=> multiple bar chart

Simple/Vertical Bar Chart
To construct a vertical bar chart, mark the various categories on the horizontal axis and mark the frequencies on the vertical axis.

Horizontal Bar Chart
To construct a horizontal bar chart, mark the various categories on the vertical axis and mark the frequencies on the horizontal axis.

Example 4: Refer to Example 3.
Another example of a horizontal bar chart:
Figure 2.4: Number of students at Diversity College who are immigrants, by last country of permanent residence

Component Bar Chart
To construct a component bar chart, all categories are placed in one bar and each bar is divided into components. The height of each component should tally with the frequency it represents.

Example 5
Suppose we want to illustrate the information below, representing the number of people participating in the activities offered by an outdoor pursuits centre during June of three consecutive years.

           2004   2005   2006
Climbing   21     34     36
Caving     10     12     21
Walking    75     85     100
Sailing    36     36     40
Total      142    167    191

Figure 2.5

Multiple Bar Chart
To construct a multiple bar chart, the bars representing the categories are gathered in groups.
The height of each bar represents the frequency of its category.
It is useful for making comparisons (two or more values).

Example 6: Refer to Example 5.
Another example of a horizontal bar chart: Preferred snack choices of students at UUM

The bar graphs for relative frequency and percentage distributions can be drawn simply by marking the relative frequencies or percentages, instead of the class frequencies.

2. Pie Chart
A pie chart is a circle divided into portions that represent the relative frequencies or percentages of a population or a sample belonging to different categories.
It is an alternative to the bar chart and is useful for summarizing a single categorical variable if there are not too many categories.
The chart makes it easy to compare the relative sizes of each class/category.
The whole pie represents the total sample or population. The pie is divided into different portions that represent the different categories.
To construct a pie chart, we multiply 360° by the relative frequency of each category to obtain the degree measure, or size of the angle, for the corresponding category.

Example 7
Figure 2.8

Example 8
Table 2.7

Movie Genres      Frequency   Relative Frequency   Angle Size
Comedy            54          0.27                 360 x 0.27 = 97.2
Action            36          0.18                 360 x 0.18 = 64.8
Romance           28          0.14                 360 x 0.14 = 50.4
Drama             28          0.14                 360 x 0.14 = 50.4
Horror            22          0.11                 360 x 0.11 = 39.6
Foreign           16          0.08                 360 x 0.08 = 28.8
Science Fiction   16          0.08                 360 x 0.08 = 28.8
Total             200         1.00                 360

Figure 2.9

3. Line Graph/Time Series Graph
Line graphs are among the most widely used graphs because their visual characteristics reveal data trends clearly and they are easy to create.
When analyzing the graph, look for a trend or pattern that occurs over the time period. For example, is the line ascending (indicating an increase over time) or descending (indicating a decrease over time)?
Another thing to look for is the slope, or steepness, of the line.
A line that is steep over a specific time period indicates a rapid increase or decrease over that period.
Two data sets can be compared on the same graph (called a compound time series graph) if two lines are used.
Data collected on the same element for the same variable at different points in time or for different periods of time are called time series data.
A line graph is a visual comparison of how two variables, shown on the x- and y-axes, are related or vary with each other. It shows related information by drawing a continuous line between all the points on a grid.
The vertical y-axis in a line graph usually indicates quantity or percentage, and the horizontal x-axis often measures units of time. As a result, the line graph is often viewed as a time series graph.

Example 9
A transit manager wishes to use the following data for a presentation showing how Port Authority Transit ridership has changed over the years. Draw a time series graph for the data and summarize the findings.

Year   Ridership (in millions)
1990   88.0
1991   85.0
1992   75.7
1993   76.6
1994   75.4

The graph shows a decline in ridership through 1992 and then a leveling off for the years 1993 and 1994.

Exercise 1

The following data show the method of payment by 16 customers in a supermarket checkout line. Here, C = cash, CK = check, CC = credit card, D = debit and O = other.

C  CK  CK  C  CC  D  O  C  CK  CC  D  CC  C  CK  CK  CC

Construct a frequency distribution table.
Calculate the relative frequencies and percentages for all categories.
Draw a pie chart for the percentage distribution.

The frequency distribution table below represents the sales of certain products in ZeeZee Company. For each product, the frequency of sales in a certain period is given. Find the relative frequency and the percentage of each product.
Then, construct a pie chart using the information obtained.

Type of Product   Frequency   Relative Frequency   Percentage   Angle Size
A                 13
B                 12
C                 5
D                 9
E                 11

Draw a time series graph to represent the data for the number of worldwide airline fatalities for the given years.

Year   No. of fatalities
1990   440
1991   510
1992   990
1993   801
1994   732
1995   557
1996   1132

A questionnaire about how people get news resulted in the following information from 25 respondents (N = newspaper, T = television, R = radio, M = magazine).

N  N  R  T  T  R  N  T  M  R  M  M  N  R  N  T  R  M  N  M  T  R  R  N  N

Construct a frequency distribution for the data.
Construct a bar graph for the data.

The given information shows the export and import trade, in million RM, for four months of sales in a certain year. Using the information provided, present these data in a component bar graph.

Month       Export   Import
September   28       20
October     30       28
November    32       17
December    24       14

The following information represents the maximum rainfall in millimeters (mm) in each state in Malaysia. You are asked to help a local meteorologist make an analysis. Based on your knowledge, present this information using the most appropriate chart and give your comments.

State                              Quantity (mm)
Perlis                             435
Kedah                              512
Pulau Pinang                       163
Perak                              721
Selangor                           664
Wilayah Persekutuan Kuala Lumpur   1003
Negeri Sembilan                    390
Melaka                             223
Johor                              876
Pahang                             1050
Terengganu                         1255
Kelantan                           986
Sarawak                            878
Sabah                              456

2.3 ORGANIZING AND GRAPHING QUANTITATIVE DATA

2.3.1 Stem-and-Leaf Display
In a stem-and-leaf display of quantitative data, each value is divided into two portions: a stem and a leaf.
Then the leaves for each stem are shown separately in a display.
It shows the pattern of the data.
It can reveal which values are repeated frequently.

Example 10

25  12   9  10   5  12  23   7
36  13  11  12  31  28  37   6
14  41  38  44  13  22  18  19

Solution:

0 | 9 5 7 6
1 | 2 0 2 3 1 2 4 3 8 9
2 | 5 3 8 2
3 | 6 1 7 8
4 | 1 4

Frequency Distributions
A frequency distribution for quantitative data lists all the classes and the number of values that belong to each class.
Data presented in the form of a frequency distribution are called grouped data.

Class Boundary
The class boundary is given by the midpoint of the upper limit of one class and the lower limit of the next class. It is also called the real class limit.
To find the midpoint of the upper limit of the first class and the lower limit of the second class, we divide the sum of these two limits by 2.

Class Width (class size)
Class width = Upper boundary – Lower boundary
e.g.: Width of the first class = 600.5 – 400.5 = 200

Class Midpoint or Mark
Class midpoint or mark = (Lower limit + Upper limit) / 2

Constructing Frequency Distribution Tables
1. Number of classes
To decide the number of classes, we use Sturges' formula, which is
c = 1 + 3.3 log n
where c is the number of classes and n is the number of observations in the data set.
2. Class width
Class width = (Largest value – Smallest value) / Number of classes
This class width is rounded to a convenient number.
3. Lower limit of the first class, or the starting point
Use the smallest value in the data set.

Example 11
The following data give the total home runs hit by all players of each of the 30 Major League Baseball teams during the 2004 season.
Number of classes, c = 1 + 3.3 log 30 = 1 + 3.3(1.48) = 5.88 ≈ 6 classes
Class width = 18
Starting point = 135

Table 2.10: Frequency Distribution for Data of Table 2.9

Total Home Runs   Tally         f
135 – 152         ||||| |||||   10
153 – 170         ||            2
171 – 188         |||||         5
189 – 206         ||||| |       6
207 – 224         |||           3
225 – 242         ||||          4

Relative Frequency and Percentage Distributions

Example 12 (Refer to Example 11)
Table 2.11: Relative Frequency and Percentage Distributions

Total Home Runs   Class Boundaries           Relative Frequency   %
135 – 152         134.5 to less than 152.5   0.3333               33.33
153 – 170         152.5 to less than 170.5   0.0667               6.67
171 – 188         170.5 to less than 188.5   0.1667               16.67
189 – 206         188.5 to less than 206.5   0.2                  20
207 – 224         206.5 to less than 224.5   0.1                  10
225 – 242         224.5 to less than 242.5   0.1333               13.33
Sum                                          1.0                  100%

Graphing Grouped Data

Histograms
A histogram is a graph in which the class boundaries are marked on the horizontal axis and either the frequencies, relative frequencies, or percentages are marked on the vertical axis.
The frequencies, relative frequencies or percentages are represented by the heights of the bars.
In a histogram, the bars are drawn adjacent to each other, and there is a space between the y-axis and the first bar.

Example 13 (Refer to Example 11)
Figure: Frequency histogram for Table 2.10 (horizontal axis marked at the class boundaries 134.5, 152.5, 170.5, 188.5, 206.5, 224.5 and 242.5)

Polygon
A graph formed by joining the midpoints of the tops of successive bars in a histogram with straight lines is called a polygon.

Example 13
Figure: Frequency polygon for Table 2.10 (horizontal axis marked at the class boundaries 134.5, 152.5, 170.5, 188.5, 206.5, 224.5 and 242.5)

For a very large data set, as the number of classes is increased (and the width of classes is decreased), the frequency polygon eventually becomes a smooth curve called a frequency distribution curve or simply a frequency curve.
Figure: Frequency distribution curve

Shape of Histogram
Same as for a polygon.
The most common shapes are:
(i) Symmetric
(ii) Right skewed
(iii) Left skewed
Figure: Symmetric histograms
Figure: Right skewed and left skewed histograms

Describing data using graphs helps us gain insight into the main characteristics of the data.
When interpreting a graph, we should be very cautious.
We should observe carefully whether the frequency axis has been truncated or whether any axis has been unnecessarily shortened or stretched.

Cumulative Frequency Distributions
A cumulative frequency distribution gives the total number of values that fall below the upper boundary of each class.

Example 14: Using the frequency distribution of Table 2.11,

Total Home Runs   Class Boundaries           Cumulative Frequency
135 – 152         134.5 to less than 152.5   10
153 – 170         152.5 to less than 170.5   10 + 2 = 12
171 – 188         170.5 to less than 188.5   10 + 2 + 5 = 17
189 – 206         188.5 to less than 206.5   10 + 2 + 5 + 6 = 23
207 – 224         206.5 to less than 224.5   10 + 2 + 5 + 6 + 3 = 26
225 – 242         224.5 to less than 242.5   10 + 2 + 5 + 6 + 3 + 4 = 30

Ogive
An ogive is a curve drawn for the cumulative frequency distribution by joining with straight lines the dots marked above the upper boundaries of classes at heights equal to the cumulative frequencies of the respective classes.
There are two types of ogive:
(i) ogive less than
(ii) ogive greater than
First, build a table of cumulative frequencies.

Example 15 (Ogive Less Than)

Earnings (RM)   Number of students (f)
30 – 39         5
40 – 49         6
50 – 59         6
60 – 69         3
70 – 79         3
80 – 89         7
Total           30

Earnings (RM)    Cumulative frequency (F)
Less than 29.5   0
Less than 39.5   5
Less than 49.5   11
Less than 59.5   17
Less than 69.5   20
Less than 79.5   23
Less than 89.5   30

Figure: Less-than ogive (horizontal axis: Earnings, marked from 29.5 to 89.5; vertical axis: Cumulative Frequency, marked from 0 to 35)

Example 16 (Ogive Greater Than)

Earnings (RM)   Number of students (f)
30 – 39         5
40 – 49         6
50 – 59         6
60 – 69         3
70 – 79         3
80 – 89         7
Total           30

Earnings (RM)    Cumulative frequency (F)
More than 29.5   30
More than 39.5   25
More than 49.5   19
More than 59.5   13
More than 69.5   10
More than 79.5   7
More than 89.5   0

Figure: Greater-than ogive (horizontal axis: Earnings, marked from 29.5 to 89.5; vertical axis: Cumulative Frequency, marked from 0 to 35)

2.3.7 Box-Plot
A box plot describes the data graphically using five measurements: the smallest value, the first quartile (K1), the second quartile (median or K2), the third quartile (K3) and the largest value.
Figure: Box plots for symmetric data, left skewed data and right skewed data, each marked with the smallest value, K1, median, K3 and largest value
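
As a rough illustration of the five measurements a box plot is built from, the sketch below computes them in Python for the 24 values shown in the stem-and-leaf display of Example 10. The function name five_number_summary is hypothetical, and the quartile rule is an assumption: K1 and K3 are taken as the medians of the lower and upper halves of the ordered data, which is only one of several conventions in use, so other textbooks or software may give slightly different quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Return (smallest, K1, median, K3, largest) for at least two values.

    Assumption: K1 and K3 are the medians of the lower and upper halves
    of the ordered data; the overall median is left out of both halves
    when the number of values is odd.
    """
    data = sorted(values)  # array data: raw data in ascending order
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half + 1:] if n % 2 else data[half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


# Stems and leaves as shown in the solution of Example 10
stem_and_leaf = {
    0: [9, 5, 7, 6],
    1: [2, 0, 2, 3, 1, 2, 4, 3, 8, 9],
    2: [5, 3, 8, 2],
    3: [6, 1, 7, 8],
    4: [1, 4],
}

# Rebuild each value as 10 x stem + leaf
values = [10 * stem + leaf
          for stem, leaves in stem_and_leaf.items()
          for leaf in leaves]

print(five_number_summary(values))
```

Under this quartile rule the five measurements for these values are 5, 11.5, 16, 29.5 and 44. The median lies closer to K1 than to K3, which is the pattern expected in the box plot of right skewed data.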
