Ungrouped Data



Chapter 20A | p. 500
Key Statistical Concepts

Words commonly used in Statistics:
- Population – an entire collection of individuals about which we want to draw conclusions
- Census – the collection of information about the whole population
- Sample – a subset of the population, which should be chosen at random to avoid bias in the results
- Survey – the collection of information from a sample

Data and Types of Data
- Data – information about the individuals in a population
- Numerical (quantitative) data – in the form of a number
  - Continuous – can take any value in a range (decimals, etc.); the result of MEASURING
    - Often displayed with a histogram
  - Discrete – takes SPECIFIC values (usually whole numbers); the result of COUNTING
    - Displayed with a bar graph whose bars do not touch
- Categorical (qualitative) data – sorted into distinct groups or categories
  - Ordinal – can be ranked
  - Nominal – cannot be ranked
  - Bars are usually placed from highest to lowest
- Parameter – a numerical quantity measuring some aspect of a population
- Statistic – a quantity calculated from data gathered from a sample (usually used to estimate a population parameter)
- Distribution – the pattern of variation of the data
- Measures of central tendency can be affected by outliers

Note: DON'T FORGET TO LABEL TITLE, AXES, UNITS, VARIABLES

Benefits of Frequency Polygons:
- Using percentages or decimals allows for a fair comparison with other distributions of data

Frequency tables

Chapter 20B | p. 505
Measuring the Center of Data

Ungrouped Data
- Mean – the sum of the data entries divided by the number of entries
  - Non-resistant measure of center – influenced by every data value in the set, including outliers
  - If the mean is considered an inaccurate measure of center for the data, it should not be used in discussion
- Median – the middle value of the ordered data
  - If there is an even number of data points, take the average of the two middle values
  - Half of the data is less than or equal to the median, and half is greater than or equal to it
  - Data point which contains the median: position (n + 1)/2 in the ordered data
  - Resistant measure of center – the only measure that will locate the TRUE CENTER regardless of the data set's features
- Mode – the data value that occurs most often
  - No mode, one mode, or more than one mode (two modes: bimodal) is possible
  - For continuous numerical data, we cannot talk about a mode because no two values will be EXACTLY equal
  - Modal class – the class or group that occurs most frequently

Relationship between Mean and Median for Different Distributions
- If the data set has symmetry, both the mean and the median should accurately measure the center of the distribution

Grouped Data and Weighted Data
- GROUPED DATA: calculate the mean using the midpoint of each class
- WEIGHTED DATA: multiply each value by its weight, then divide the total by the sum of the weights

Chapter 20C | p. 517
Measuring the Spread of Data
- Range – the difference between the highest and lowest values in the data set
  - NOT reliable:
    - Uses only 2 data points
    - May be influenced by outliers
- Quartiles – three points that divide the ordered data set into four equal groups
  - Q1 – between the smallest value and the median (also the 25th percentile)
  - Q2 – the median (50th percentile)
  - Q3 – between the median and the largest value (75th percentile)
- Interquartile range – the difference between Q3 and Q1

Chapter 20D | p. 521
Boxplots
Boxplots are visual displays; they are ideal for:
- Comparing the distributions of multiple data sets
- Large data sets, because the key tendencies are immediately evident

[Figure: boxplots of negatively skewed, symmetrical, and positively skewed distributions]

Interpreting a Box Plot

Outliers
- Outliers – extraordinary data values separated from the main body of the data
  - Marked with an asterisk on a boxplot
  - Not included in the whiskers of the boxplot

Chapter 20E | p. 526
Cumulative Frequency Graphs
- Percentile – the score at or below which a certain percentage of the data lies
- Percentile rank – the percentage of scores that fall below a given score
- Cumulative frequency graph – shows the cumulative totals of a set of values up to each of the points on the graph

Chapter 20F | p. 531
Variance and Standard Deviation

Ungrouped Data
- Variance – the average squared difference of the scores from the mean
  - Because the differences are squared, more weight is given to extreme values
  - Calculated in squared units, not the same units as the scores in the set
  - This is solved by calculating the standard deviation
- Standard deviation – the square root of the variance (roughly the average distance of the scores from the mean)
- Samples rarely contain extreme values when compared to entire populations
  - Thus, the variance and standard deviation of a sample tend to be smaller than those of the population
  - When estimating from a sample, divide by n - 1 instead of n

For Grouped Data
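As a rough illustration of the measures of center and spread defined in these notes, the sketch below computes them for a small made-up data set with Python's standard `statistics` module. The data values and the helper names `quartiles` and `summarize` are hypothetical, and the quartile rule used here (the median of each half, leaving out the middle value when n is odd) is an assumption, since the notes do not say which convention the textbook follows.

```python
from statistics import mean, median, multimode, pvariance, variance, stdev


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule (an assumed convention)."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # For odd n the middle value is the median itself and is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(data), median(upper)


def summarize(values):
    """Collect the measures of center and spread for ungrouped data."""
    q1, q2, q3 = quartiles(values)
    return {
        "mean": mean(values),
        "median": q2,
        "mode": multimode(values),                 # list, so several modes are possible
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "population_variance": pvariance(values),  # divides by n
        "sample_variance": variance(values),       # divides by n - 1
        "sample_sd": stdev(values),                # square root of the sample variance
    }


# Hypothetical scores, chosen only to keep the arithmetic easy to check by hand.
scores = [2, 4, 4, 5, 7, 9, 11]
summary = summarize(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For these scores the squared deviations from the mean add up to 60, so dividing by n = 7 gives a population variance of about 8.57, while dividing by n - 1 = 6 gives a sample variance of 10. The larger sample figure is the correction for the tendency of samples to understate spread.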