Kenwood Academy
AP Statistics
Chapter 1 - Exploring Data
|Introduction - Data Analysis: Making Sense|Objectives: |
|of Data |DEFINE “Individuals” and “Variables” |
| |DISTINGUISH between “Categorical” and “Quantitative” variables |
| |DEFINE “Distribution” |
|Statistics | |
| | |
|Data Analysis |the science of data |
| | |
| |A process of describing data using graphs and numerical summaries |
|Individuals | |
| | |
| |the objects described by a set of data. Individuals may be people, animals or things |
|Variable | |
| | |
| |any characteristic of an individual. A variable can take different values for different individuals. |
| |Categorical variable – places an individual into one of several groups or categories, such as hair color or marital |
| |status |
|Distribution |Quantitative variable – takes numerical values for which it makes sense to find an average |
| |Is zip code categorical or quantitative? |
|How to Explore Data: | |
| |Tells what values a variable takes and how often it takes these values. |
| | |
| |Examine each variable by itself. Then study relationships among the variables. |
|Example |Start with a graph or graphs. Add numerical summaries |
| | |
| | |
| |Check Your Understanding pg. 5 |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Displaying Distributions with Graphs | |
| | |
| | |
| | |
| | |
|Displaying Categorical Variables: Bar and | |
|Pie Graphs | |
| | |
| |Objectives: |
|Frequency table |CONSTRUCT and INTERPRET bar graphs and pie charts |
| |RECOGNIZE “good” and “bad” graphs |
| |CONSTRUCT and INTERPRET two-way tables |
| |DESCRIBE relationships between two categorical variables |
| |ORGANIZE statistical problems |
| | |
| | |
| |The number one rule of data analysis is to MAKE a PICTURE. To decide what type of picture (visual display) is |
| |appropriate, identify the variable. Is it categorical (counts) or quantitative (measurements)? |
| | |
| | |
| | |
| |Displays the count (frequency) of observations in each category or class. |
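A frequency table can be sketched in a few lines of Python. The radio-format data below are hypothetical and are not taken from the textbook; the sketch just tallies each category and converts the counts to percents.

```python
from collections import Counter

# Hypothetical categorical data: preferred radio format for 10 listeners.
formats = ["Rock", "Country", "News", "Rock", "Country",
           "Rock", "News", "Rock", "Country", "Rock"]

counts = Counter(formats)              # count (frequency) of each category
total = sum(counts.values())           # number of individuals
percents = {cat: 100 * n / total for cat, n in counts.items()}

# Print one row per category: name, count, percent.
for cat in counts:
    print(f"{cat:<8} {counts[cat]:>3} {percents[cat]:>6.1f}%")
```

Either column (counts or percents) could then be used for the heights of the bars in a bar graph.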
| | |
| |[pic] |
| | |
| | |
| |Bar graphs compare several quantities by |
| |comparing the heights of bars that represent |
| |those quantities |
|Bar Graph: |Bar graphs have spaces between each category |
| |of the variable. |
| |The order of the categories is not important |
| |Either counts or proportions may be shown on |
| |the vertical axis |
| |Make sure you include a title and appropriate |
| |labels for each axis. |
| | |
| | |
| |Must include all the categories that make up |
|Pie Graphs: |the whole |
| |Should use computer software to construct. |
| |- |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Misleading Graphs | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Two-Way Tables and |A two-way table of counts organizes data about two categorical variables. |
|Marginal Distributions | |
| | |
| |What are the variables described by this two-way table? |
| | |
| | |
| |How many young adults were surveyed? |
| | |
| | |
| | |
| | |
| | |
| | |
|Marginal Distribution | |
| |The marginal distribution of one of the categorical variables in a two-way table of counts is the distribution of values|
| |of that variable among all individuals described by the table. |
| | |
| |To examine a marginal distribution, |
| |Use the data in the table to calculate the marginal distribution (in percents) of the row or column totals. |
| |Make a graph to display the marginal distribution. |
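The marginal-distribution calculation can be sketched as follows. The two-way table here is hypothetical (made-up counts for an opinion question by gender); it is not the table from the textbook example.

```python
# Hypothetical two-way table of counts: rows = opinion, columns = gender.
table = {
    "Yes": {"Female": 30, "Male": 20},
    "No":  {"Female": 10, "Male": 40},
}

# Total number of individuals described by the table.
grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of the row variable: row totals as percents.
row_marginal = {r: 100 * sum(cols.values()) / grand_total
                for r, cols in table.items()}

# Marginal distribution of the column variable: column totals as percents.
col_names = list(next(iter(table.values())))
col_marginal = {c: 100 * sum(table[r][c] for r in table) / grand_total
                for c in col_names}

print(row_marginal)
print(col_marginal)
```

Each marginal distribution uses the grand total as its denominator, so its percents add to 100.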
| | |
| |See Example page 11-12 |
|Example | |
| | |
| |Check Your Understanding pg. 12 |
| | |
| | |
| | |
| | |
| | |
|Relationships Between Categorical | |
|Variables: Conditional Distributions | |
| | |
| |A Conditional Distribution of a variable describes the values of that variable among individuals who have a specific |
| |value of another variable. |
| | |
| |To examine or compare conditional distributions, |
| |Select the row(s) or column(s) of interest. |
| |Use the data in the table to calculate the conditional distribution (in percents) of the row(s) or column(s). |
| |Make a graph to display the conditional distribution. |
| |Use a side-by-side bar graph or segmented bar graph to compare distributions. |
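A minimal sketch of the conditional-distribution steps, again using a hypothetical two-way table (made-up counts, not the textbook's). The helper name `conditional_distribution` is an illustrative choice.

```python
# Hypothetical two-way table of counts: rows = opinion, columns = gender.
table = {
    "Yes": {"Female": 30, "Male": 20},
    "No":  {"Female": 10, "Male": 40},
}

def conditional_distribution(table, column):
    """Percent in each row category among individuals in one column."""
    column_total = sum(table[row][column] for row in table)
    return {row: 100 * table[row][column] / column_total for row in table}

female = conditional_distribution(table, "Female")
male = conditional_distribution(table, "Male")

print("Female:", female)
print("Male:  ", male)
```

Unlike a marginal distribution, each conditional distribution divides by the total of the selected column only. Comparing the two results is what a side-by-side or segmented bar graph would show.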
| | |
| |See Example page 15 |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Example | |
| | |
| |Describe the conditional distribution in the chart above. |
| | |
| | |
| |Check Your Understanding pg. 17 |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Organizing a Statistical Problem | |
| | |
| | |
| | |
| | |
| |How to Organize a Statistical Problem: A Four-Step Process |
| |State: What’s the question that you’re trying to answer? |
| |Plan: How will you go about answering the question? What statistical techniques does this problem call for? |
| |Do: Make graphs and carry out needed calculations. |
|Data Exploration page 19-20 |Conclude: Give your practical conclusion in the setting of the real-world problem. |
| | |
| |See Example page 18 |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|1.2 Displaying Quantitative Data with | |
|Graphs | |
| | |
| | |
| | |
| |Objectives: |
|Dotplots |CONSTRUCT and INTERPRET dotplots, stemplots, and histograms |
| |DESCRIBE the shape of a distribution |
| |COMPARE distributions |
| |USE histograms wisely |
| | |
| |How to Construct a Dotplot: |
| |Draw a horizontal axis (a number line) and label it with the variable name. |
| |Scale the axis from the minimum to the maximum value. |
| |Mark a dot above the location on the horizontal axis corresponding to each data value. |
| |Useful for small data sets |
| |[pic] |
| | |
|Don’t Forget Your SOCS! | |
| |How to Examine the Distribution of a Quantitative Variable: |
| |Shape |
| |Outliers |
| |Center |
| |Spread |
|Describing Shape | |
| | |
| | |
| |In general, when looking at graphs: Look for an overall pattern and also for striking deviations from that pattern |
| |To give the overall pattern of a distribution: |
| |Give the center and spread |
|Symmetric |See if the distribution has a simple shape that you can describe in a few words. (skewed to the right, skewed to the |
| |left, symmetric) |
| | |
|Skewness |– a distribution is symmetric if the right and left sides are approximately mirror images of each other. |
| | |
| |- A distribution is skewed to the right (right-skewed) if the right side of the graph (containing the half of the |
| |observations with larger values) is much longer than the left side. |
| |- It is skewed to the left (left-skewed) if the left side of the graph is much longer than the right side. |
| | |
| |Symmetric Skewed - left Skewed - right |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Outliers | |
| | |
| | |
| | |
|Unimodal |– an outlier in any graph of data is an individual observation that falls outside the overall pattern of the graph. |
| |Once you have spotted outliers look for an explanation. |
|Bimodal | |
| | |
|Multimodal | |
| |Distribution with a single peak |
| | |
|Example |Distribution with two clear peaks |
| | |
| |Distribution with more than two peaks |
| | |
| | |
| | |
| |Check Your Understanding page 31 |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Comparing Distributions | |
| | |
| | |
| | |
|Stemplots | |
| |See Example and AP Exam Tip page 32 |
| | |
| | |
| | |
| |How to construct a Stemplot: |
| |Separate each observation into a stem (all but the final digit) and a leaf (the final digit). |
| |Write all possible stems from the smallest to the largest in a vertical column and draw a vertical line to the right of |
| |the column. |
| |Write each leaf in the row to the right of its stem. |
| |Arrange the leaves in increasing order out from the stem. |
| |Provide a key that explains in context what the stems and leaves represent. |
| | |
| |Useful for small to medium data sets |
| |Individual data values are preserved |
| |When you have few stems it may be helpful to split the stems to get a better idea of the shape of the graph |
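The stemplot steps can be sketched as a text-only display. This assumes non-negative whole-number data, and the values are hypothetical; stem splitting is not implemented.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows for non-negative integers.

    Stem = all but the final digit, leaf = the final digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):           # sorting puts leaves in increasing order
        stem, leaf = divmod(v, 10)
        leaves[stem].append(leaf)
    rows = []
    # List every possible stem from smallest to largest, even empty ones.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem:>2} | " + "".join(str(leaf) for leaf in leaves.get(stem, []))
        rows.append(row.rstrip())
    return rows

data = [23, 25, 31, 38, 38, 40, 52, 57, 59]   # hypothetical data
for row in stemplot(data):
    print(row)
print("Key: 2 | 3 represents 23")
```

The key is printed last because a stemplot without one cannot be read in context.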
| | |
| |[pic] |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Splitting Stems | |
| | |
| |When data values are “bunched up”, we can get a better picture of the distribution by splitting stems. |
| |[pic] |
| |Two distributions of the same quantitative variable can be compared using a back-to-back stemplot with common stems. |
| | |
| |[pic] |
| | |
| | |
| | |
| |Check Your Understanding page 34-35 |
| | |
| | |
|Back-to-Back Stemplots | |
| |Divide the range of data into classes of equal width. |
| |Find the count (frequency) or percent (relative frequency) of individuals in each class. |
| |Label and scale your axes and draw the histogram. The height of the bar equals its frequency. Adjacent bars should |
| |touch, unless a class contains no individuals. |
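The first two histogram steps (equal-width classes, then counts) can be sketched as below. The data, the starting value, and the class width are all hypothetical choices; each class includes its left endpoint and excludes its right endpoint.

```python
def class_counts(values, start, width, n_classes):
    """Count observations in equal-width classes [start, start + width), ..."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)   # which class the value falls in
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts

data = [3, 7, 12, 14, 15, 21, 22, 22, 28, 35]   # hypothetical data
counts = class_counts(data, start=0, width=10, n_classes=4)

# Crude text histogram: one star per observation in each class.
for i, count in enumerate(counts):
    low = i * 10
    print(f"{low:>2} to <{low + 10:<3} {'*' * count}")
```

Dividing each count by `len(data)` would give relative frequencies for the vertical axis instead.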
| | |
| | |
| | |
| |See Example page 35-36 |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Example | |
| | |
| | |
| | |
|Histograms | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Histograms on the Calculator | |
| |Check Your Understanding page 39 |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| |Check Your Understanding page 41 |
| | |
| | |
| | |
| | |
|Example | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| |Objectives: |
| |MEASURE center with the mean and median |
| |MEASURE spread with standard deviation and interquartile range |
| |IDENTIFY outliers |
| |CONSTRUCT a boxplot using the five-number summary |
| |CALCULATE numerical summaries with technology |
| | |
| | |
| |To find the mean of a set of observations, add their values and divide by the number of observations. If the n |
| |observations are x1, x2, x3, …, xn, their mean is: |
| |x̄ = (x1 + x2 + x3 + … + xn) / n |
| |x̄ = (Σ xi) / n (x̄ is pronounced “x-bar”) |
| | |
| | |
| |The mean is a good way to measure the center when the shape of your distribution is unimodal and symmetric. Because the|
| |mean cannot resist the influence of extreme observations, like outliers, it is not a resistant measure. So use caution |
| |if such values are present or when your distribution is skewed. |
| | |
|1.3 Describing Quantitative Data with | |
|Numbers |The median M is the midpoint of a distribution, the number such that half of the observations are smaller and the other |
| |half are larger. |
| |To find the median of a distribution: |
| |Arrange all observations from smallest to largest. |
|Measuring Center: The Mean (x̄) |If the number of observations n is odd, the median M is the center observation in the ordered list. |
| |If the number of observations n is even, the median M is the average of the two center observations in the ordered list.|
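The median rules above translate directly into a short function. This is a sketch with made-up inputs; Python's built-in `statistics.median` does the same job.

```python
def median(values):
    """Median M: middle value of the ordered list, or the average of the two middle values."""
    ordered = sorted(values)            # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                      # n odd: the center observation
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # n even: average the two center values

print(median([5, 1, 3]))        # odd number of observations
print(median([4, 1, 3, 2]))     # even number of observations
```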
| | |
|The Mean (x̄) |The median is resistant to outliers. The median is a better measure of the center when outliers are present or when |
| |your distribution is skewed. |
| | |
| | |
| | |
| |The mean and median of a roughly symmetric distribution are close together. |
| |If the distribution is exactly symmetric, the mean and median are exactly the same. |
| |In a skewed distribution, the mean is usually farther out in the long tail than is the median. |
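A quick numerical check of these statements, using two small hypothetical data sets that differ only in their largest value:

```python
from statistics import mean, median

symmetric = [2, 4, 6, 8, 10]       # exactly symmetric around 6
right_skewed = [2, 4, 6, 8, 100]   # same data, but the largest value is extreme

print(mean(symmetric), median(symmetric))         # mean and median agree
print(mean(right_skewed), median(right_skewed))   # mean is pulled toward the long tail
```

Changing one value from 10 to 100 moves the mean from 6 to 24 but leaves the median at 6, which is what "resistant" means.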
| | |
| |Check Your Understanding page 55 |
| | |
|Measuring Center: The Median (M) | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Comparing the Mean and Median | |
| | |
| | |
|Example | |
| | |
| | |
| |Range = Maximum value – Minimum Value |
| |The range shows the full spread of the data. But it depends only on the smallest and largest values, which could be |
| |outliers. We can improve our description of spread by also looking at the spread of the middle observations. |
| | |
| | |
| |The quartiles mark off the middle half of the data. To calculate the quartiles: |
| |Arrange the observations in increasing order and locate the median. |
| |Q1, the 25th percentile, is the value such that 25% of the data values are less than it. To find Q1, find the |
| |median of the first half of the data. |
| |Q3, the 75th percentile, is the value such that 75% of the data values are less than it. To find Q3, find the |
| |median of the second half of the data. |
|Measuring Spread: The Interquartile Range |Note: If you have an even number of observations, just take the average of the middle two numbers. |
|(IQR) | |
| |The interquartile range (IQR) covers the range of the middle 50% of data. Because it doesn’t use extreme values, it is |
|The Range |resistant to outliers. Use the IQR as your measure of spread when outliers are present or if data are skewed. When |
| |using the median as your measure of center, use the IQR as your measure of spread. |
| |IQR = Q3 – Q1 |
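The quartile procedure can be sketched as follows, assuming the "median of each half" method described above, where the overall median is left out of both halves when n is odd. The data are hypothetical. Calculators and software may use slightly different quartile rules, so their answers can differ a little from this hand method.

```python
def median(values):
    """Middle value of the ordered list, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]             # observations to the left of the median
    upper = ordered[half + n % 2:]     # observations to the right (skips the median if n is odd)
    return median(lower), median(upper)

data = [5, 10, 10, 15, 20, 25, 30, 40, 45, 60]   # hypothetical data
q1, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q3, iqr)
```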
| | |
| |See Example page 57 |
| |[pic] |
|The Quartiles (Q1, Q3) |IQR = Q3 – Q1 |
| |= 42.5 – 15 |
| |= 27.5 minutes |
| | |
| |Interpretation: The range of the middle half of travel times for the New Yorkers in the sample is 27.5 minutes. |
| | |
| | |
| |The 1.5 x IQR Rule for Outliers |
| |You can use the IQR to find outliers. Call an observation an outlier if it falls more than 1.5 x IQR above Q3 or |
|interquartile range (IQR) |below Q1. |
| | |
| |In the New York travel time data, we found Q1=15 minutes, Q3=42.5 minutes, and IQR=27.5 minutes. |
| |For these data, 1.5 x IQR = 1.5(27.5) = 41.25 |
| |Q1 - 1.5 x IQR = 15 – 41.25 = -26.25 |
| |Q3+ 1.5 x IQR = 42.5 + 41.25 = 83.75 |
| |Any travel time shorter than -26.25 minutes or longer than 83.75 minutes is considered an outlier. |
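The cutoffs above can be reproduced with a short sketch. The quartiles 15 and 42.5 come from the example; the helper names and the two travel times checked at the end (60 and 85 minutes) are hypothetical.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True if x falls more than 1.5 x IQR below Q1 or above Q3."""
    low, high = outlier_fences(q1, q3)
    return x < low or x > high

low, high = outlier_fences(15, 42.5)
print(low, high)

# Hypothetical travel times checked against the cutoffs.
print(is_outlier(60, 15, 42.5))
print(is_outlier(85, 15, 42.5))
```

Because a travel time cannot be negative, only the upper cutoff can flag an outlier in this setting.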
| | |
| | |
| |Minimum Q1 M Q3 Maximum |
| | |
| |Regular boxplots conceal outliers, so we will use modified boxplots, which plot outliers individually. |
| | |
| | |
| |Check Your Understanding page 55 |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Identifying Outliers | |
| | |
| | |
| | |
| | |
|Example | |
| | |
| | |
| | |
| |See Technology Corner page 61 |
| | |
| | |
| |The standard deviation s_x measures the average distance of the observations from their mean. It is calculated by finding|
|Five Number Summary |an average of the squared distances and then taking the square root. This average squared distance is called the |
| |variance. |
| |Consider the following data on the number of pets owned by a group of 9 children. |
| | |
| |See Example page 62 |
| | |
|Example |1) Calculate the mean. |
| |2) Calculate each deviation. |
| |deviation = observation – mean |
| | |
| | |
| | |
| | |
| | |
| | |
| |3) Square each deviation. |
| |4) Find the “average” squared deviation. |
| |Calculate the sum of the squared deviations |
| |divided by (n-1)…this is called the |
| |variance. |
| |5) Calculate the square root of the variance… |
| |this is the standard deviation. |
| | |
| |“average” squared deviation = 52/(9-1) = 6.5 |
|Construct Calculator Boxplots |This is the variance. |
| | |
|Measuring Spread: The Standard Deviation |Standard deviation = square root of variance |
|(s_x) |= √6.5 ≈ 2.55 |
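The five steps can be traced in code. The handout does not list the nine pet counts, so the values below are illustrative, chosen so that the sum of squared deviations is 52, matching the 52/(9-1) = 6.5 calculation above.

```python
from math import sqrt

# Illustrative data for 9 children (an assumption: picked to match the sum of 52).
pets = [1, 3, 4, 4, 4, 5, 7, 8, 9]
n = len(pets)

mean = sum(pets) / n                          # 1) calculate the mean
deviations = [x - mean for x in pets]         # 2) deviation = observation - mean
squared = [d ** 2 for d in deviations]        # 3) square each deviation
variance = sum(squared) / (n - 1)             # 4) "average" squared deviation, dividing by n - 1
sd = sqrt(variance)                           # 5) square root of the variance

print(mean, sum(squared), variance, round(sd, 2))
```

The deviations always sum to zero, which is why they are squared before averaging.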
| | |
| |[pic] |
| | |
| |The variance (s²) of a set of observations is the average of the squares of the deviations of the observations from |
| |their mean. Because its formula contains the mean, the s.d. is not resistant to outliers. When using the mean as your |
| |measure of the center, you should use the s.d. as your measure of spread. |
| | |
| |[pic] |
| | |
| | |
| |See Technology Corner page 65 |
| | |
| | |
| | |
| |The median and IQR are usually better than the mean and standard deviation for describing a skewed distribution or a |
| |distribution with outliers. |
| |Use mean and standard deviation only for reasonably symmetric distributions that don’t have outliers. |
| |NOTE: Numerical summaries do not fully describe the shape of a distribution. ALWAYS PLOT YOUR DATA! |
| | |
| | |
| |See Example page 66-67 |
| | |
| | |
| | |
|Standard Deviation (s_x) | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Variance (s_x²) | |
| | |
| | |
| | |
|Computing Numerical Summaries on | |
|Calculator | |
| | |
| | |
|Choosing Measures of Center and Spread | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
|Summary |