


Lesson 16.1: Types of Studies

Learning Goals:
- What are the different statistical studies and their characteristics?
- What are some advantages and disadvantages of each type of study?

In a statistical study, data are collected and used to answer questions about a population characteristic or parameter.

Due to time and money constraints, it may be impractical or impossible to collect data from each member of a population. Therefore, in many studies, a sample of the population is taken, and a measure called a statistic is calculated from the sample data.

The sample statistic is then used to make inferences about the population parameter.

The steps in a typical statistical study are shown below:
- Identify the objective of the study.
- Choose a sample, and collect data.
- Organize the data, and calculate sample statistics.
- Make inferences and draw conclusions about the population.

Study Types

The following study types can be used to collect sample information.

Study Type: Survey
Definition: In a survey, data are collected from responses given by members of a population regarding their characteristics, behaviors, or opinions.
Example: To determine whether the student body likes the new cafeteria menu, the student council asks a random sample of students for their opinions.

Study Type: Experiment (cause-and-effect relationship)
Definition: In an experiment, the sample is divided into two groups: an experimental group that undergoes a change, and a control group that does not undergo the change. The effect on the experimental group is then compared to the control group.
Example: A restaurant is considering creating meals with chicken instead of beef. They randomly give half of a group of participants meals with chicken and the other half meals with beef.
Then they ask how they like the meals.

Study Type: Observational Study
Definition: In an observational study, members of a sample are measured or observed without being affected by the study.
Example: Researchers at an electronics company observe a group of teenagers using different laptops and note their reactions.

Example 1: Classify Study Types
Determine whether each situation describes a survey, an experiment, or an observational study. Then identify the sample and suggest a population from which it may have been selected.

a. A record label wants to test three designs for an album cover. They randomly select 50 teenagers from local high schools to view the covers while they watch and record their reactions.
Observation!
Sample = 50 teenagers
Population = teenagers in high school

b. The city council wants to start a recycling program. They send out a questionnaire to 200 random citizens asking what items they would recycle.
Survey!
Sample = 200 citizens
Population = citizens in the city

c. Scientists study the behavior of one group of dogs given a new heartworm treatment and another group of dogs given a false treatment or placebo.
Experiment!
Sample = two groups of dogs
Population = all dogs

Example 2: Choose a Study Type
Determine whether each situation calls for a survey, an experiment, or an observational study. Explain your reasoning.

a. A pharmaceutical company wants to test whether a new medicine is effective.
Experiment because you are going to have to test the medicine!

b. A news organization wants to randomly call citizens to gauge opinions on a presidential election.
Survey because you want to ask and find out people’s opinions!

c. A research company wants to study smokers and nonsmokers to determine whether 10 years of smoking affects lung capacity.
Observation because you are only going to choose people who already smoke or already do not smoke!

Biased Samples

To obtain good information and draw accurate conclusions about a population, it is important to select an unbiased sample.
A bias is an error that results in a misrepresentation of members of a population.
A poorly chosen sample can cause biased results.
To reduce the possibility of selecting a biased sample, a random sample can be taken, in which members of the population are selected entirely by chance.

The following conditions should exist when collecting data to eliminate any bias:
- The sample must be representative of the group being studied.
- The sample must be large enough to be effective.
- The selection should be random.

The questions chosen for a survey or procedures used in an experiment also need to be clear and precise in order to avoid bias. Avoid survey questions that:
- Are confusing or wordy
- Cause a strong reaction
- Encourage a certain response
- Address more than one issue

Example 3: Identify Bias in Survey Questions
Determine whether each survey question is biased or unbiased. If biased, explain your reasoning.

a. Don’t you agree that the cafeteria should serve healthier food?
Biased because it is encouraging a certain response!

b. How often do you exercise?
Unbiased because it is not leading the person to answer a certain way!

Example 4: Identify Bias in a Study
Determine whether each sample study is biased or unbiased. If biased, explain your reasoning.

a. A legislator wishes to know how his district feels about a particular issue. As a result, his office e-mails a detailed survey about the issue to a random sample of adults in his district.
Biased because only people who feel strongly enough about the issue will respond!

b. A town council wants to know whether residents support having an off-leash area for dogs in the town park. Eighty dog owners are surveyed at the park.
Biased because residents who do not own dogs are left out!

c.
All 250 students at a review session are given numbered tickets. Five numbers are chosen randomly, and the individuals with the winning ticket numbers each win a $10 gift card.
Unbiased!

d. A news show asks viewers to call a toll-free number to express their opinions about their choice for president.
Biased!

e. A teacher asks high school students how often they drink alcohol.
Biased!

1. A survey completed at a large university asked 2000 students to estimate the average number of hours they spend studying each week. Every tenth student entering the library was surveyed. The data showed that the mean number of hours that students spend studying was 15.7 per week. Which characteristic of the survey could create a bias in the results?
(1) the size of the sample
(2) the size of the population
(3) the method of analyzing the data
(4) the method of choosing the students who were surveyed

2. A survey is conducted about people’s favorite professional sports teams. Which of the following survey methods would most likely produce a random sample?
(1) Asking every 10th person walking into Yankee Stadium
(2) Using a random number generator based on those who bought tickets to last week’s Jets game
(3) Calling 1000 randomly selected phone numbers from around the country
(4) Sending out a survey to those who subscribe to Sports Illustrated

3. Rebecca is conducting a statistical study for psychology class. She asks a group of students to do a timed arithmetic test. Then she has a similar group of students take the timed arithmetic test after they eat peppermint candy. She compares the scores to see if eating peppermint improves math skills. State whether Rebecca’s investigation was an example of a controlled experiment, an observation, or a survey. Justify your response.
Controlled experiment because there are two groups and one of them is being manipulated.

4. It has been decided that 100 people need to be surveyed to decide the public’s opinion on a school building project.
A student suggests that they survey the first 100 people who enter the school. Do you think that this proposed way of sampling is an unbiased way to perform the survey? Justify your answer.

Homework 16.1: Types of Studies

1. For each of the following study descriptions, identify whether the study is a survey, an observational study, or an experiment, and give a reason for your answer. Then identify the sample, and suggest a population from which it may have been selected.

a. The National Highway Traffic Safety Administration conducts annual studies on drivers’ seatbelt use at a random selection of roadway sites in each state in the United States. To determine if seatbelt usage has increased, data are analyzed over two successive years.

b. People should brush their teeth at least twice a day for at least two to three minutes with each brushing. For a statistics class project, you ask a random number of students at your school questions concerning their tooth brushing activities.

c. A study determines whether taking aspirin regularly helps to prevent heart attacks. A large group of male physicians of comparable health were randomly assigned equally to taking an aspirin every second day or taking a placebo. After several years, the proportion of the study participants who had suffered heart attacks in each group was compared.

2. Determine whether each situation calls for a survey, an experiment, or an observational study. Explain your reasoning.

a. A grocery store conducts an online study in which customers are randomly selected and asked to provide feedback on their shopping experience.

b. A research group randomly selects 80 college students, half of whom took a physics course in high school, and compares their grades in a college physics course.

c. A research group randomly chooses 100 people to participate in a study to determine whether eating blueberries reduces the risk of heart disease for adults.

3. Determine whether each survey question is biased or unbiased.
If biased, explain your reasoning.

a. How long have you lived at your current address?

b. Which is your favorite football team, the Dallas Cowboys or the Pittsburgh Steelers?

4. Explain why each given sample is biased.

a. A sportswriter wants to determine whether baseball coaches think wooden bats should be mandatory in collegiate baseball. The sportswriter mails surveys to all collegiate coaches and uses the surveys that are returned.

b. Every tenth employee who arrives at a company health fair answers a survey that asks for opinions about new health-related programs.

5. The manager of food services at a local high school is interested in assessing student opinion about a new lunch menu in the school cafeteria. The manager is planning to conduct a sample survey of the student population. Which of the following methods of sample selection would be most effective at reducing bias?
(1) Randomly select one day of the week and then select the first 30 surveys that are submitted.
(2) Post the survey on the school website and use the first 30 surveys that are submitted.
(3) Randomly select 30 students from a list of all the students in the school.
(4) Randomly select one classroom in the school and then select the first 30 students who enter that classroom.

6. A human resources director of a large company is interested in how often employees use their computers during breaks. She watches a selected group of employees at their desks during the break times. State whether the human resources director’s investigation was an example of an experiment, an observation, or a survey. Justify your response.

Lesson 16.2: Designing a Study

Learning Goals:
- What is randomization and why is it important?
- What is the difference between random selection and random assignment?
- How do we design a study?

Warm-Up: Two studies are described below. One is an observational study, while the other is an experiment.

Study A: A new dog food, specially designed for older dogs, has been developed.
A veterinarian wants to test this new food against another dog food currently on the market to see if it improves dogs’ health. Thirty older dogs were randomly assigned to either the “new” food group or the “current” food group. After they were fed either the “new” or “current” food for six months, their improvement in health was rated.
Experiment

Study B: The administration at a large school wanted to determine if there was a difference in the mean number of text messages sent by ninth-grade students and by eleventh-grade students during a day. Students in a random sample of 30 ninth-grade students were asked how many text messages they sent per day. Students in another random sample of 30 eleventh-grade students were asked how many text messages they sent per day. The difference in the mean number of texts per day was determined.
Observational Study

1. In which study can you show a cause-and-effect relationship? Explain your thinking.
Study A because it is an experiment!

2. Which study is the experiment? Justify your answer.
Study A: subjects are being split into two groups (experimental and control groups).

3. In your own words, describe what a subject is in an experiment.
Participants being tested.

4. In your own words, describe what a response variable is in an experiment.
The outcome of the test.

5. In your own words, describe what a treatment is in an experiment.
What you do to each group.

Randomization

Today we will be designing two kinds of studies: an observational study and an experiment.

In an observational study, people choose their own actions and scientists observe what they do. Therefore, you cannot show a cause-and-effect relationship using an observational study.

In an experiment, the scientist assigns treatments to subjects in different groups and then compares the groups to determine if there is any difference in a specific variable of interest (called a response).
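As an illustration only, the random assignment of the thirty dogs in Study A could be sketched in Python as follows. The dog labels, the function name randomly_assign, and the seed are made up for this example; the lesson does not prescribe any particular tool.

```python
import random


def randomly_assign(subjects, seed=None):
    """Shuffle the subjects by chance, then split them into two equal-sized groups."""
    rng = random.Random(seed)      # a seed only makes the example repeatable
    shuffled = list(subjects)      # copy so the original list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical labels for the 30 older dogs in Study A
dogs = [f"dog_{i}" for i in range(1, 31)]
new_food_group, current_food_group = randomly_assign(dogs, seed=1)

print("New food group:", new_food_group)
print("Current food group:", current_food_group)
```

Because chance alone decides which dog lands in which group, neither the veterinarian nor the owners can steer healthier dogs toward one food.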
A well-designed experiment allows researchers to decide if there is a cause-and-effect relationship.

Although a cause-and-effect relationship can only be determined with an experiment, randomization plays an important part in data collection for both study types.

Take another look at the two studies from the warm-up exercise.

Study A (Experiment)
A new dog food, specially designed for older dogs, has been developed. A veterinarian wants to test this new food against another dog food currently on the market to see if it improves dogs’ health. Thirty older dogs were randomly assigned to either the “new” food group or the “current” food group. After they were fed either the “new” or “current” food for six months, their improvement in health was rated.

Study B (Observation)
The administration at a large school wanted to determine if there was a difference in the mean number of text messages sent by ninth-grade students and by eleventh-grade students during a day. Students in a random sample of 30 ninth-grade students were asked how many text messages they sent per day. Students in another random sample of 30 eleventh-grade students were asked how many text messages they sent per day. The difference in the mean number of texts per day was determined.

In Study A (experiment) the dogs were randomly assigned to the two treatment groups. If this were not done, the veterinarian might inadvertently create a situation where any observed difference between the groups was caused by some factor other than the treatments being studied.

In Study B (observation) the students were randomly selected to represent their greater population.
If this were not done, the administration might introduce favoritism in the selection, which could distort conclusions drawn from the data.

As you can see, the terms random selection and random assignment have very different meanings.

In your own words, summarize the terms:
Random Selection: randomly choosing the subjects in a study.
Random Assignment: randomly placing the subjects into one of two groups.

Which type of randomization can result in a cause-and-effect relationship? Why?
Random assignment, because you must have an experiment.

The table below summarizes the differences between the terms random selection and random assignment. For each statement, put a check mark in the appropriate column(s), and explain your choices.

Used in Experiments: Random Selection √, Random Assignment √
Used in Observational Studies: Random Selection √
Allows Generalization to the Population: Random Selection √, Random Assignment √
Allows a Cause-and-Effect Conclusion: Random Assignment √

1. A group of 600 registered voters in a given county are asked how they intend to vote in an upcoming election. The 600 voters submitted their opinions via a web page and volunteered to participate after seeing an advertisement on television. A summary of their responses is posted on a news website, and it is implied that this group is representative of all registered voters in that county. The responses are used to predict the outcome of the election. Explain why the data collection method might cause a problem.

2. Robin read somewhere that adding salt to water while heating it will raise the temperature of the water, causing it to boil faster. To test this claim, she filled 30 identical pots with one quart of water. She randomly selected 15 of the pots and added 1 teaspoon of salt. She then placed each pot on identical burners set to the highest setting. She measured the water temperature in each pot after 5 minutes.
If Robin does find that there is a difference between water temperatures in the pots with salt compared to those without, can she conclude that the salt caused the difference in temperature?

Designing an Experiment

In a controlled experiment, two groups are studied under identical conditions with the exception of one variable. The group under ordinary conditions that is subjected to no treatment is the control group. The group that is subjected to the treatment is the treatment group.

In order to design an experiment, one must:
- Explain how subjects will be randomly assigned to the groups.
- Explain what will happen in the two groups.
- State that you will compare the results.

When designing an experiment, participants must be randomly assigned to their groups in order to avoid bias. Below are some common procedures that have been used:
- Flip a coin (e.g., heads: control, tails: treatment)
- Deck of cards (e.g., even: control, odd: treatment)
- Roll a die (e.g., 3 or below: control, over 3: treatment)
- Pick numbers from a hat (e.g., first half picked is the control group, second half is the treatment group)
- Random number generator (this is the easiest/best way to randomly assign groups)

Model Problem: Design an experiment
A research company wants to test the claim that the plant fertilizer Thrive can make tomato plants grow taller. Describe how a controlled experiment can be created to examine the effect of Thrive on a tomato plant.

Randomly assign 100 tomato plants to two groups. Give each plant a number, and numbers will be picked out of a hat. The first 50 numbers will be the experimental group, and the next 50 will be the control group. The experimental group will receive the Thrive fertilizer, and the control group will be grown under normal conditions without Thrive. Compare the results of the tomato plant heights.

1. A company wants to determine whether wearing a new tennis shoe improves jogging time.
State the objective of the experiment, suggest a population, determine the experiment and control groups, and describe a sample procedure.

Design an experiment!
Randomly assign: pick numbers out of a hat
Control: regular shoes
Treatment: new tennis shoes

Give each of 100 joggers a number, using a random number generator to assign the numbers. The first 50 will be the experimental group and will receive the new tennis shoes; the next 50 will be the control group and will use a normal tennis shoe. Compare the jogging times of the two groups.

2. Jennifer goes to a basketball clinic every Tuesday. She is watching fifth grade athletes with the goal of determining if the amount of time they spend shooting before the clinic has an impact on the number of shots they make in the game at the end of the clinic. How could this observational study be converted into a controlled experiment?

Randomly assign!
Control: doesn’t shoot before
Treatment: will shoot before

3. Identify any flaws in the design of the experiment, and describe how they could be corrected. An electronics company wants to test whether using a new graphing calculator increases students’ test scores. A random sample is taken. Calculus students in the experimental group are given the new calculator to use, and Algebra 2 students in the control group are asked to use their own calculator.

There’s no random assignment! Each group should have a mix of the two courses.

Homework 16.2: Designing a Study

1. Describe how a controlled experiment can be created to examine the effect of ingredient X in a toothpaste.

2. Identify any flaws in the design of the experiment below, and describe how they could be corrected:
A supermarket chain wants to determine whether shoppers are more likely to buy sunscreen if it is located near the checkout line.
The experimental group consists of a group of stores in the Midwest in which the sunscreen was moved next to the checkout line, and the control group consists of stores in Arizona in which the sunscreen was not moved.

3. An orange-juice processing plant receives a truckload of oranges. The quality control team randomly chooses three pails of oranges, each containing 50 oranges, from the truckload. Identify the sample and the population in the given scenario.

4. A survey is to be conducted in a small upstate village to determine whether or not local residents should fund construction of a skateboard park by raising taxes. Which segment of the population would provide the most unbiased responses?
(1) a club of local skateboard enthusiasts
(2) senior citizens living on fixed incomes
(3) a group opposed to any increase in taxes
(4) every tenth person 18 years of age or older walking down Main St.

5. Which task is not a component of an observational study?
(1) The researcher decides who will make up the sample.
(2) The researcher analyzes the data received from the sample.
(3) The researcher gathers data from the sample, using surveys or taking measurements.
(4) The researcher divides the sample into two groups, with one group acting as a control group.

6. A school cafeteria has five different lunch periods. The cafeteria staff wants to find out which items on the menu are most popular, so they give every student in the first lunch period a list of questions to answer in order to collect data to represent the school. Which type of study does this represent?
(1) observation
(2) controlled experiment
(3) population survey
(4) sample survey

7. High school officials wanted to assess the need for a new diving board. They created a survey and distributed it to a large, diverse crowd at the State Swim Meet held at their school.
Which characteristic of the survey is most likely to create a bias?
(1) the number of participants
(2) the height of the participants
(3) the way the set of data from the survey was analyzed
(4) the way the participants were selected to take the survey

8. Which of these questions is a biased question?
(1) Do you prefer yogurt or pudding for dessert?
(2) Do you prefer to sit on the couch and watch TV or do you like to exercise and stay in shape?
(3) What sport do you play?
(4) What is your favorite food?

9. Which of the following methods should a researcher use to study the effect on a baby whose mother used drugs during her pregnancy?
(1) experiment
(2) census
(3) sample survey
(4) observational study

10. Which statement about statistical analysis is false?
(1) Experiments can suggest patterns and relationships in data.
(2) Experiments can determine cause and effect relationships.
(3) Observational studies can determine cause and effect relationships.
(4) Observational studies can suggest patterns and relationships in data.

Lesson 16.3: Quantifying Data

Learning Goals:
- What are the measures of central tendency?
- What are some measures we use to determine the spread of data?
- What is a relative frequency histogram?
- How can you describe data distributions in terms of shape, center, and variability?
- How do we use a curve to model data distribution?

Do Now: Use the following sample data to answer the questions below.

a. What is the mean of each sample?
Average! 8

b. What is the median of each sample?
Middle! 8

c. What is the mode of each sample?
Data that occurs the most! 8

d. What is the range of each sample?
Max - Min. A: 12 - 4 = 8. B: 10 - 6 = 4.

Important vocabulary to remember:

Mean: The average of the data values. Add up all the numbers and then divide by the number of numbers you have.

Median: The “middle” value in the list of numbers. To find the median, your numbers have to be listed in numerical order, so you may have to rewrite your list first.
If there is an even number of values, take the mean of the two middle numbers to get the median.

Mode: The value that occurs most often. If no number is repeated, then there is no mode for the list.

Range: The difference between the highest and lowest values.

Standard Deviation: One measure of spread or variability in a data distribution. The standard deviation describes variation in terms of deviation from the mean. The standard deviation can be interpreted as a typical deviation from the mean. A small standard deviation indicates that the data points tend to be very close to the mean, and a large standard deviation indicates that the data points are spread out over a large range of values.

What does Standard Deviation tell us?
Standard deviation is a measure of the variation (spread) of data.
- Big standard deviation: a lot of variation (data is spread out far from the center)
- Small/low standard deviation: not much variation (data is close to the center)
- Zero standard deviation: no variation (all data is the same as the mean)

Let’s re-examine the two sample data sets from the Do Now. This time, we will look at them as a histogram instead of a list of numbers.

A histogram is a statistical graph that represents the frequency of values of a quantity by vertical rectangles of varying heights.

Frequency represents the number of times a data value occurs.

Based on the definitions above, which sample has the larger standard deviation? Justify your answer.
Sample A because the data is more spread out from the mean (8).

Analyzing Distributions: A distribution of data shows the observed or theoretical frequency of each possible data value. Analyzing the shape of a distribution can help you decide which measure of center or spread best describes a set of data. The shape of the distribution for a set of data can be seen by drawing a curve over its histogram.

Skewed Left: mean < median; majority of data to the right of the mean.
Normal Distribution: mean = median; data are evenly distributed on both sides of the mean.
Skewed Right: mean > median; majority of data to the left of the mean.

In any type of distribution, the mode is the most common value, and it matches the highest peak.

When choosing the appropriate statistics to represent a set of data, first determine the skewness.
- When a distribution is symmetric, the mean and standard deviation accurately reflect the center and spread of the data.
- When a distribution is skewed or has outliers, the mean and standard deviation become less reliable.

1. Have you ever noticed how sometimes batteries seem to last a long time, and other times the batteries seem to last only a short time? The histogram below shows the distribution of battery life (hours) for a sample of 40 batteries of the same brand. When studying a distribution, it is important to think about the shape, center, and spread of the data.

a. Would you describe the distribution of battery life as approximately symmetric or as skewed? Explain your answer.
It is approximately symmetric because it has the shape of a normal distribution.

b. Is the mean of the battery life distribution closer to 95, 105, or 115 hours? Explain your answer.
105, closest to the middle of the curve.

c. Consider 5 hours as an estimate of the standard deviation. Is it a reasonable description of a typical distance from the mean? Explain your answer.
No, 100 to 110 doesn’t work!

d. Consider 10 hours as an estimate of the standard deviation.
Is it a reasonable description of a typical distance from the mean? Explain your answer.
Yes, 95 to 115 is 10 hours on either side of the mean! So it covers about 68% of the data.

e. Consider 25 hours as an estimate of the standard deviation. Is it a reasonable description of a typical distance from the mean? Explain your answer.
No, 80 to 130 doesn’t work!

2. The histogram below shows the distribution of the greatest drop (in feet) for 55 major roller coasters in the United States.

a. Would you describe this distribution of roller coaster maximum drop as approximately symmetric or as skewed? Explain your answer.
Skewed right because there is less data to the right of the mean.

b. Is the mean of the maximum drop distribution closer to 90, 135, or 240 feet? Explain your answer.
135, check the drawing above!

c. Is the standard deviation of the maximum drop distribution closer to 40, 70, or 100 feet? Explain your answer.
Standard deviation should not be used in this example because the distribution is skewed to the right!

3. Consider the following histograms: Histogram 1, Histogram 2, Histogram 3, and Histogram 4. Descriptions of four distributions are also given. Match the description of each distribution with the appropriate histogram.

Histogram 1: Distribution B
Histogram 2: Distribution A
Histogram 3: Distribution C
Histogram 4: Distribution D

Description of distributions:
Distribution A: Shape: skewed to the right; Mean: 100; Standard Deviation: 10
Distribution B: Shape: approximately symmetric, mound shaped; Mean: 100; Standard Deviation: 10
Distribution C: Shape: approximately symmetric, mound shaped; Mean: 100; Standard Deviation: 40
Distribution D: Shape: skewed to the right; Mean: 100; Standard Deviation: 40

Example 1: Jason’s first three test grades for Algebra 2 are 78, 82, and 74. What grade would Jason need to get on his fourth test to make his overall average an 80?

Mean = (sum of all data) / (# of data values)
(78 + 82 + 74 + x) / 4 = 80
(234 + x) / 4 = 80
234 + x = 320
x = 86

Example 2: A survey was taken in biology class regarding the number of siblings of each student. The table shows the class data with the frequency of responses. The mean of this data is 2.5.
Find the value of k in the table.

Mean = (sum of all data) / (# of data values)
2.5 = (1(5) + 2k + 3(8) + 4(4) + 5(1)) / (5 + k + 8 + 4 + 1)
2.5 / 1 = (2k + 50) / (18 + k)
45 + 2.5k = 2k + 50
0.5k = 5
k = 10

Example 3: The table displays the frequency of scores on a twenty point quiz. The mean of the quiz scores is 18. Find the value of k in the table.

Mean = (sum of all data) / (# of data values)
18 = (15(2) + 16(4) + 17(7) + 18(13) + 19k + 20(5)) / (2 + 4 + 7 + 13 + k + 5)
18 / 1 = (19k + 547) / (31 + k)
558 + 18k = 19k + 547
-1k = -11
k = 11

Homework 16.3: Quantifying Data

1. The histogram below shows the distribution of gasoline tax per gallon for the 50 states and the District of Columbia in 2010. Describe the shape of this distribution.

2. The histogram below shows the distribution of the number of automobile accidents per year for every 1,000 people in different occupations. Describe the shape of this distribution.

3. For each of the following, match the description of each distribution with the appropriate histogram:

4. Periodically the U.S. Mint checks the weight of newly minted nickels. Below is a histogram of the weights (in grams) of a random sample of 100 new nickels.

a. The mean and standard deviation of the distribution of nickel weights are 5.00 grams and 0.06 grams, respectively. Mark the mean on the histogram. Mark one standard deviation above the mean and one standard deviation below the mean.

b. Describe the shape of the distribution.

6. The number of minutes students took to complete a quiz is summarized in the table below. If the mean number of minutes was 17, what is the value of x?

7. Julie averaged 85 on the first three tests of the semester in her mathematics class. If she scores 93 on each of the remaining tests, her average will be 90.
Which equation could be used to determine how many tests, T, are left in the semester?
(1) (255 + 93T) / (3T) = 90
(2) (255 + 90T) / (3T) = 93
(3) (255 + 93T) / (T + 3) = 90
(4) (255 + 90T) / (T + 3) = 93

Lesson 16.4: The Normal Distribution and z-Scores

Learning Goals:
- What are the characteristics of a normal distribution?
- What is a z-score?
- How do we determine probability using a normal distribution?

Normal Distribution

Distributions of mileages of different sample sizes of cars are shown below. As the sample size increases, the distributions become more and more symmetrical and resemble the curve at the right.

The curve on the right is a NORMAL DISTRIBUTION, a continuous, symmetric, bell-shaped distribution of a random variable. The following is a list of characteristics of the normal distribution curve:
- The graph of the curve is continuous, bell-shaped, and symmetric with respect to the mean.
- The mean, median, and mode are equal and located at the center.
- The curve approaches, but never touches, the x-axis.
- The total area under the curve is equal to 100%.

The area under the curve represents the amount of data within a certain interval or the probability that a random data value falls within the interval. The Empirical Rule can be used to determine the area under the normal curve at specific intervals.

The Empirical Rule (σ = standard deviation)
- Approximately 68.2% of the data fall within 1 standard deviation of the mean.
- Approximately 95.4% of the data fall within 2 standard deviations of the mean.
- Approximately 99.8% of the data fall within 3 standard deviations of the mean.

The Empirical Rule only helps us if we are evaluating values that are a specific whole number of standard deviations away.
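As a rough check, the Empirical Rule percentages can be recomputed from the standard normal curve. The sketch below uses Python’s standard statistics module; the function name area_within is made up for this example. The computed areas are close to, but not identical to, the rounded chart percentages used in this lesson.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"Within {k} standard deviation(s) of the mean: {area_within(k):.1%}")
```

The same function also accepts values of k that are not whole numbers, which the Empirical Rule chart cannot handle directly.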
But what if we are 1.2 standard deviations away from the mean?
The Z-Score for a data value x in a set of normally distributed data is given by $z = \frac{x - \mu}{\sigma}$, where μ is the mean and σ is the standard deviation.
We can standardize any set of data by converting them to z-scores. The z-score represents the number of standard deviations that a given data value is from the mean.

1. A normal distribution has a mean of 21 and a standard deviation of 4.
a. Find the range of values that represent the middle 68% of the distribution.
Middle 68% means one σ away!
μ = 21, σ = 4, so this really means 21 ± 4.
68% lie between 17 and 25.
b. What percent of the data will be greater than 29?
Where is 29 located on the x-axis? (29 is 2σ above the mean.)
1.7% + 0.5% + 0.1% = 2.3%

2. The heights of 1800 adults are normally distributed with a mean of 70 inches and a standard deviation of 2 inches.
a. About how many adults are between 66 and 74 inches?
Where are 66 and 74 located on the x-axis? (Each is 2σ from the mean.)
95.4% of 1800 ≈ 1717, or $\frac{95.4}{100} = \frac{x}{1800}$, so x ≈ 1717
b. What is the probability that a random adult is more than 72 inches tall?
9.2% + 4.4% + 1.7% + 0.5% + 0.1% = 15.9%

3. The prices of the printers in a store have a mean of $240 and a standard deviation of $50. The printer that you eventually choose costs $340.
a. What is the z-score for the price of your printer?
x = 340, μ = 240, σ = 50
$z = \frac{x - \mu}{\sigma} = \frac{340 - 240}{50} = 2$
b. How many standard deviations above the mean was the price of your printer?
$340 was 2σ above the mean!

4. Andrew’s height is 63 inches. The mean for boys at his high school is 68.1 inches, and the standard deviation of the boys’ heights is 2.8 inches.
a. What is the z-score for Andrew’s height? (Round your answer to the nearest hundredth.)
$z = \frac{x - \mu}{\sigma} = \frac{63 - 68.1}{2.8} = \frac{-5.1}{2.8} \approx -1.82$
b. What is the meaning of this value?
Andrew’s height is 1.82σ below the mean!

How can we use z-scores to determine the probability?
Determine the z-scores for each of the boundary points given (round to 4 decimal places).
If the curve continues to the left, use $-1 \times 10^{99}$ as your lower z-score.
If the curve continues to the right, use $1 \times 10^{99}$ as your upper z-score.
Use your graphing calculator to calculate the probability:
On your paper you must write: normalcdf(lower z-score, upper z-score)
CHEAT: normalcdf(lower bound, upper bound, mean, standard deviation), entering the original data values as the bounds instead of z-scores

5. A swimmer named Amy specializes in the 50-meter backstroke. In competition, her mean time for the event is 39.7 seconds, and the standard deviation of her times is 2.3 seconds. Assume that Amy’s times are approximately normally distributed. Solve the following, using z-scores, a graphing calculator, and rounding your answers to the nearest thousandth.
a. Find the probability that Amy’s time in her next race is between 37 and 44 seconds.
Find a z-score for each bound!
μ = 39.7, σ = 2.3
x1 = 37 (Lower Bound), x2 = 44 (Upper Bound)
$z = \frac{37 - 39.7}{2.3} = -1.173913043$
$z = \frac{44 - 39.7}{2.3} = 1.869565217$
normalcdf(-1.173913043, 1.869565217) ≈ 0.849
b*. Find the probability that Amy’s time in her next race is more than 45 seconds.
Find a z-score for each bound!
μ = 39.7, σ = 2.3
x1 = 45 (Lower Bound), $z = 1 \times 10^{99}$ (Upper Bound)
$z = \frac{45 - 39.7}{2.3} = 2.304347826$
normalcdf(2.304347826, $1 \times 10^{99}$) = 0.0106015351 ≈ 0.011
c. Find the probability that Amy’s time in her next race is less than 36 seconds.

6. The shelf life of a particular snack chip is normally distributed with a mean of 173.3 days and a standard deviation of 23.6 days.
a. To the nearest tenth, what percent of the product lasts between 150 and 200 days?
x1 = 150 (Lower Bound), x2 = 200 (Upper Bound)
$z = \frac{150 - 173.3}{23.6} = -0.9873$
$z = \frac{200 - 173.3}{23.6} = 1.1314$
normalcdf(-0.9873, 1.1314)
b. To the nearest tenth, what percent of the product lasts more than 225 days?
c. There are 150 bags of this particular snack chip in a grocery store.
About how many of them can you expect to last more than 225 days?

Homework 16.4: The Normal Distribution and z-scores
1. The scores on a standardized test are normally distributed with a mean of 560 and a standard deviation of 75. Find, to the nearest tenth of a percent, the probability that a test picked at random would have a score larger than 720.
2. Given that the volume of soda in a 12 ounce bottle from a factory varies normally with a mean of 12.2 ounces and a standard deviation of 0.6 ounces, use your calculator to determine the probability that a bottle chosen at random would have a volume of at most 11 ounces. Round to the nearest thousandth.
3. Reaction times of human beings are normally distributed with a mean of 0.76 seconds and a standard deviation of 0.06 seconds.
a. Calculate the probability, to the nearest hundredth, that a person has a reaction time greater than 0.70 seconds.
b. Out of the 220 students that take Algebra 2, about how many of them can be expected to have a reaction time greater than 0.70 seconds?
4. Suppose that a particular medical procedure has a cost that is approximately normally distributed with a mean of $19,800 and a standard deviation of $2,900.
a. For a randomly selected patient, find the probability that the procedure costs between $18,000 and $22,000.
b. Consider the medical procedure described above, and suppose a patient is charged $24,900 for the procedure. The patient is reported as saying, “I’ve been charged an outrageous amount!” How justified is this comment? Use the normal curve and probability to support your answer.

Lesson 16.5: Normal Distribution and Z-Scores Day 2 (SKIP?)
Learning Goal: How can we determine the probability of an event using z-scores and the normal curve?

Understanding the Normal Curve
a. Which curve has the greater mean? How can you tell?
Sample B! Its peak sits at a larger value on the x-axis.
b. Which curve has the greater standard deviation? How can you tell?
Sample B!
Data is more spread out from the mean.
c. Jan and Joe each grow apples. Jan’s apples have a mean circumference of 5 inches and a standard deviation of 1 inch while Joe’s apples have a mean circumference of 9 inches and a standard deviation of 2 inches. If Jan brought an apple with circumference of 8 inches and Joe brought an apple with a circumference of 13 inches, who brought a more unusual apple, as related to their own crops? Explain your reasoning.
Jan, because her apple is further away from the mean: it is 3σ above her mean, while Joe’s is only 2σ above his. Less data is there!

Using Z-Scores to Compare Normal Distributions:
1. a. Let’s say our Unit 15 Algebra 2 test was normally distributed. The mean was 75 with a standard deviation of 4. Determine the probability that you scored between an 80 and 85, to the nearest thousandth.
Find a z-score for each bound!
μ = 75 & σ = 4
x1 = 80 (Lower Bound), x2 = 85 (Upper Bound)
$z = \frac{80 - 75}{4} = 1.25$
$z = \frac{85 - 75}{4} = 2.5$
normalcdf(1.25, 2.5) ≈ 0.099
b. Suppose our next test was also normally distributed with a mean of 84 and a standard deviation of 2. What interval would you have to score between to do as well as you did on the Unit 15 test?
Find data values x!
μ = 84 & σ = 2
x1 = (Lower Bound), x2 = (Upper Bound)
$1.25 = \frac{x_1 - 84}{2}$, so $2.5 = x_1 - 84$ and $x_1 = 86.5$
$2.5 = \frac{x_2 - 84}{2}$, so $5 = x_2 - 84$ and $x_2 = 89$
Interval is 86.5 to 89

2. A golfer plays on two different golf courses. Sunny Pines has a mean score of 72.1 with a standard deviation of 3.3. The Links at the Shore has a mean score of 70.5 with a standard deviation of 1.5. If the golfer shot 68 at both courses, which one was more impressive?
Find a z-score!
Sunny Pines: μ = 72.1, σ = 3.3, score = 68, so $z = \frac{68 - 72.1}{3.3} \approx -1.24$
The Links at the Shore: μ = 70.5, σ = 1.5, score = 68, so $z = \frac{68 - 70.5}{1.5} \approx -1.67$
The Links score is more impressive because it is further below the mean and less likely to occur… less data means less likely.

3. For the Geometry Regents exam the results are normally distributed. On the June 2015 exam, the mean score was an 82 with a standard deviation of 5. On the January 2016 exam, the mean score was a 78 with a standard deviation of 3.
a. 
Sarah took the June exam and scored in the interval 85 to 92. What is the probability, to the nearest thousandth, that a test paper selected at random from the June 2015 version scored in the same interval?
Find the z-score!
μ = 82 & σ = 5
$z = \frac{85 - 82}{5} = 0.6$
$z = \frac{92 - 82}{5} = 2$
normalcdf(0.6, 2) = 0.252
b. Judy took the January exam. In what interval must Judy score to claim she scored as well as Sarah?
Find data values x!
μ = 78 & σ = 3
$0.6 = \frac{x_1 - 78}{3}$, so $1.8 = x_1 - 78$ and $x_1 = 79.8$
$2 = \frac{x_2 - 78}{3}$, so $6 = x_2 - 78$ and $x_2 = 84$
Interval is 79.8 to 84

4. Volt Batteries have an average battery life of 8.2 hours with a standard deviation of 0.7 hours. Electro Batteries have an average battery life of 8.6 hours with a standard deviation of 0.8 hours.
a. Determine the probability that a randomly selected Volt Battery will have a battery life of at least 7.5 hours.
$z = \frac{7.5 - 8.2}{0.7} = -1$
normalcdf(-1, $1 \times 10^{99}$) = 0.8413447404
b. In order to have the same probability as a Volt Battery, an Electro Battery must have a battery life of at least what value?
Use the same z-score: $-1 = \frac{x - 8.6}{0.8}$, so x = 7.8
At least 7.8 hours

Homework 16.5: Normal Distribution and Z-Scores Day 2
1. The number of visits to a gym per year by a sample of 522 members is normally distributed with a mean of 88 and a standard deviation of 19.
a. About how many members went to the gym at least 50 times?
b. What is the probability (to the nearest thousandth) that a member selected at random went to the gym more than 145 times?
2. Suppose the weights of male basketball players at Syracuse University are normally distributed with a mean of 180 pounds and a standard deviation of 26 pounds. If a player is selected at random, what is the probability that the player will weigh more than 225 pounds, to the nearest ten-thousandth?
3. Eels are washed onto a beach after a storm. Their lengths are normally distributed with a mean of 41 cm and a standard deviation of 5.5 cm. (Round all answers to the nearest thousandth.)
a. If an eel is randomly selected, find the probability that it is at least 50 cm long.
b. 
Find the proportion of eels measuring between 40 and 50 cm long.
c. How many eels from a sample of 200 would you expect to measure at least 50 cm in length?
4. The results on the Spanish Regents were normally distributed with a mean of 82 and a standard deviation of 2.5. The results on the French Regents were normally distributed with a mean of 84 and a standard deviation of 3. If you scored between a 75 and 84 on the Spanish Regents, what would someone need to score on the French Regents to do as well as you? (Hint: find the z-scores for the Spanish Regents first!)

Lesson 16.6: Populations and Samples
Learning Goals:
How do you differentiate between a population and a sample?
What is the central limit theorem?

Population and Sample
When you think of a population and sample, you likely think only of people. Populations and samples are not always just composed of people. In biology, the subjects of interest could be plants or insects. In psychology, the subjects could be rats or mice. Television sets could be the subjects in a study to determine brand quality.
Whether a set of people or objects is a population or a sample depends on the context of the situation. For instance, if the players on a specific baseball team were studied to determine the team’s most valuable player for that year, then that team’s players would be considered a population. There would be no need to generalize beyond that set of players. But in a study concerning the whole league, those players could be considered a sample.
What is a population? It is the entire set of subjects in which there is an interest.
What is a sample? It is a part of the population from which information (data) is gathered, often for the purpose of generalizing from the sample to the population.

Exercise 1: Identify the population and the sample.
a. In the United States, a survey of 2184 adults ages 18 and over found that 1328 of them own at least one pet.
sample = 2184 adults surveyed
population = all adults 18 and over in the United States
b. To estimate the gasoline mileage of new cars sold in the United States, a consumer advocacy group tests 845 new cars and finds they have an average of 25.1 miles per gallon.
population = all new cars sold in the U.S.
sample = 845 new cars tested
c. A survey of 4464 shoppers in the United States found that they spent an average of $407.02 from Thursday through Sunday during a recent Thanksgiving holiday.
population = all shoppers in the U.S. during the Thanksgiving holiday
sample = 4464 shoppers surveyed during the Thanksgiving holiday

Sampling Variability
What makes learning from sample data a challenge is that different samples from the same population give different results. For example, if you average all the numbers from 1 to 99, you get 50. However, if you sample 10 randomly chosen numbers from 1 to 99 and then find the mean of those 10 numbers, you might not get the mean to be 50. In the table below are eight samples, each with 10 randomly chosen numbers from 1 to 99.
Each of the 8 samples is called a sample distribution.
The value of the sample mean varies from sample to sample. Although the values in each sample differ, their means are clustered close to the actual mean of 50 (none is exactly 50). This is because of sampling variability.

Central Limit Theorem
What do you think will happen to the sampling distribution of the mean if we increase the sample size?
To help answer this question, let us look at a coin tossing application. Each sample will consist of 50 coin tosses; we will repeat the 50 tosses n times, and then record the mean and standard deviation.
We will then increase the number of repetitions to try to see what happens to the spread of data and the measures of the mean and standard deviation.

Summary: What happens to the sampling distribution if we increase the sample size?
As the sample size (n) increases, the sample means tend to follow a normal (symmetric) distribution.
As the sample size (n) increases, the sample means tend to get closer to the true mean (actual mean).
As the sample size (n) increases, the standard deviation tends to decrease.

1. Below are three dot plots of the proportion of tails in 20, 60, or 120 simulated flips of a coin. The mean and standard deviation of the sample proportions are also shown for each of the three dot plots. Match each dot plot with the appropriate number of flips. Clearly explain how you matched the plots with the number of simulated flips.

Homework 16.6: Populations and Samples
1. A group of eleventh graders wanted to estimate the proportion of all students at their high school who suffer from allergies. Each student in one group of eleventh graders took a random sample of 20 students, while each student in another group of eleventh graders took a random sample of 40 students. Below are the two sampling distributions (shown as histograms) of the sample proportions of high school students who said that they suffer from allergies. Which histogram is based on random samples of size 40? Explain.
2. According to the 2009 Population Survey conducted by the U.S. Census Bureau, 240 people classified their occupation as chef or head cook. Out of these 240 people, 200 were men and the rest were women. What proportion of chefs and head cooks are women?
3. The nurse in your school district would like to study the proportion of all high school students in the district who usually get at least eight hours of sleep on school nights.
Suppose each student in your class takes a random sample of 20 high school students in the district and each calculates their sample proportion of students who said that they usually get at least eight hours of sleep on school nights. Below is a histogram of the sampling distribution.
a. Do you think that the proportion of all high school students who usually get at least eight hours of sleep on school nights could have been 0.4? Do you think it could have been 0.55? Could it have been 0.75? Justify your answers based on the histogram.
b. Suppose students had taken random samples of size 60. How would the distribution of sample proportions based on samples of size 60 differ from those of size 20?
4. Which of the following will have the smallest standard deviation? Explain your reasoning. A sampling distribution of sample means for samples of size:
a. 15 b. 25 c. 100
5. In the United States, a survey of 1152 adults ages 18 and over found that 403 of them pretend to use their smartphones to avoid talking to someone. Identify the population and sample.

Lesson 16.7: Margin of Error
Learning Goals:
How do we calculate and interpret margin of error?
What is a confidence interval?
What is the relationship between sample size and margin of error?

Inferential statistics are used to draw conclusions about a population using a sample. Once we have created our sampling distribution and arrived at our best estimate of the parameter of the population, we need to reveal just how good this estimate may be. Instead of stating the estimate as a single value, we form intervals surrounding the estimate.
The margin of error is a measurement of how accurate we believe our sample statistic to be relative to the population. We will always use a margin of error based on a 95% confidence level.
In any normal distribution, how much of the data falls within two standard deviations of the mean?
2 SD ≈ 95%
Margin of Error = 2(SD)
Use this number to move away from the mean!
Since roughly 95% of all normally distributed data fall within two standard deviations of the mean, we use two standard deviations to develop the margin of error.
A confidence interval is a range of values we are fairly sure our true value lies in. We then use the margin of error to create a confidence interval.
Confidence Interval = Mean ± Margin of Error

Example 1: A recent poll found that 36% of all respondents would vote for Candidate A in an election. The poll reported a margin of error of 4%. Give an interpretation of what this margin of error means in terms of the 36% support for Candidate A.
36 ± 4 = 32% to 40%
We are 95% confident that between 32% and 40% of all voters would vote for Candidate A. This is the 95% confidence interval!

Example 2: In any normal distribution, how much of the data falls within two standard deviations of the mean?
In general, for a known population proportion, about 95% of the outcomes of a simulated sampling distribution of a sample proportion will fall within two standard deviations of the population proportion.
One caution is that if the proportion is close to 1 or 0, this general rule may not hold unless the sample size is very large. If the sample is large enough to have at least 10 of each of the two possible outcomes in the sample but small enough to be no more than 10% of the population, the following formula (based on an observed sample proportion p) can be used to calculate the margin of error. The standard deviation formula involves the population proportion being estimated, which is unknown, so the observed sample proportion p is used in its place.
This estimated standard deviation is called the standard error of the sample proportion.
If p is the sample proportion for a random sample of size n from some population and if the sample size is large enough,
estimated margin of error $= 2\sqrt{\frac{p(1 - p)}{n}}$

1. At the beginning of the school year, school districts implemented a new physical fitness program. A student project involves monitoring how long it takes tenth graders to run a mile. The following data were taken midyear.
a. What is the estimate of the population mean time it currently takes tenth graders to run a mile, based on the following data (minutes) from a random sample of ten students?
6.5, 8.4, 8.1, 6.8, 8.4, 7.7, 9.1, 7.1, 9.4, 7.5
$\text{Mean} = \frac{6.5 + 8.4 + 8.1 + 6.8 + 8.4 + 7.7 + 9.1 + 7.1 + 9.4 + 7.5}{10} = 7.9$
b. The students doing the project collected 50 random samples of 10 students each and calculated the sample means. The standard deviation of their distribution of 50 sample means was 0.6 minutes. Based on this standard deviation, what is the margin of error for their sample mean estimate? Explain your answer.
σ = 0.6
Margin of Error = 2(0.6) = 1.2 minutes
When it says margin of error it means 2 SD from the mean!
c. Interpret the margin of error you found in part (b) in the context of this problem.
We are 95% confident that the true mean is within 1.2 minutes of 7.9.
7.9 ± 1.2 = 6.7 to 9.1 → 95% confident the true mean is within this interval OR
about 95% of sample means (from random samples of 10 students) should fall between 6.7 and 9.1 minutes.

2. What conjecture can you make about the relationship between sample size and margin of error? Explain why your conjecture makes sense.
Recall: Central Limit Theorem. If you increase the sample size, the margin of error should decrease! More people leads to better data, closer to the true mean and less spread out.
3. 
An Algebra 2 class conducts a survey of a random sample of 50 students to determine what percent of the student body lives in a household where the annual income is over $60,000. According to their survey, 42% of the students live in such a household. The students conduct a series of simulations to determine the margin of error for this sample proportion. The results of the simulations lead the students to conclude that the actual percent of students who live in families with an income over $60,000 is 42% ± 8%. Based on this margin of error, which of the following is most unlikely to be the percent of students who live in households that earn over $60,000 per year?
(1) 33% (2) 48% (3) 42% (4) 50%
Margin of error = 95% or 2 SD from mean
42 ± 8 = 34 to 50, so 33% (choice 1) falls outside the interval.

4. An ecologist wants to know what percent of the 10,000 fish in a lake are cod. She takes 500 samples of size 50. The average for all these samplings is 0.24 with a standard deviation of 0.06. This is the histogram of the sampling distribution of the sample proportion. Using this data, with a 95% confidence interval, we can determine that the percent of fish in the lake that are cod is which of the following?
(1) Between 0.12 and 0.36
(2) Between 0.14 and 0.34
(3) Between 0.20 and 0.28
(4) Exactly 0.24
95% confidence interval = 2 SD from mean
Mean = 0.24, SD = 0.06
Margin of Error = 2(0.06) = 0.12
0.24 ± 0.12 = 0.12 to 0.36

5. A concert promoter wants to estimate the average age of the 20,000 people attending a Taylor Swift concert. He takes 150 samplings of 80 people. The average of the means of all the samplings is 25.5 and the standard deviation is 1.5. The following is a histogram of the sampling distribution of the sample mean.
Based on this data, with a 95% confidence interval, the researchers can determine that the average of the entire 20,000 person population is which of the following?
(1) Exactly 25.5
(2) Between 22.5 and 28.5
(3) Between 23.5 and 27.5
(4) Between 24.5 and 26.5
95% confidence interval = 2 SD from mean
Mean = 25.5, SD = 1.5
Margin of Error = 2(1.5) = 3
25.5 ± 3 = 22.5 to 28.5

6. A candidate for political office commissioned a poll. His staff received responses from 900 likely voters and 55% of them said they would vote for the candidate. The staff then conducted a simulation of 1000 more polls of 900 voters, assuming that 55% of voters would vote for their candidate. The output of the simulation is shown in the diagram below. Given this output, and assuming a 95% confidence level, the margin of error for the poll is closest to:
(1) 0.01 (2) 0.03 (3) 0.06 (4) 0.12
95% confidence interval = 2 SD from mean
Mean = 0.55
0.55 ± x = 0.52 to 0.58, so the margin of error is x = 0.03

7. Suppose that 62 random samples based on ten student responses to the question, “How many text messages do you send per day?” resulted in the 62 sample means (rounded) shown in the dot plot below.
a. Based on this dot plot, would you be surprised if the actual mean number of text messages sent per day for all eleventh graders in the school is 91.7? Why or why not?
b. The standard deviation of the above distribution of sample mean number of text messages sent per day is 7.5. Use this to calculate and interpret the margin of error for an estimate of the population mean number of text messages sent daily by eleventh graders (based on a random sample of size 10 from this population).

8. An orange-juice processing plant receives a truckload of oranges. The quality control team randomly chooses three pails of oranges, each containing 50 oranges, from the truckload. Identify the sample and the population in the given scenario.
sample = pails of oranges
population = truckload of oranges
State one conclusion that the quality control team could make about the population if 5% of the sample was found to be unsatisfactory.
This is likely to occur! (95% should be satisfactory, which means 5% should be unsatisfactory)

Homework 16.7: Margin of Error
1. Decide if each of the following statements is true or false. Explain your reasoning in each case.
a. The smaller the sample size, the smaller the margin of error.
b. If the margin of error is 0.05 and the observed proportion of red chips is 0.45, then the true population proportion is likely to be between 0.40 and 0.50.
2. A sports physician conducts an observational study to learn the average amount of time that 3,000 swimmers in the town can hold their breath underwater. He uses 150 samplings of 60 people. The average of the means of all the samplings is 72.7, and the standard deviation is 0.92. This is a histogram of the sampling distribution of the sample mean. Based on this data, with a 95% confidence interval, the researchers can determine that the actual average amount of time the entire population can hold their breath under water is which of the following?
(1) Exactly 72.7
(2) Between 72 and 73.4
(3) Between 71.28 and 73.12
(4) Between 70.86 and 74.54
3. If the mean of a data set is 60 and the standard deviation is 8, what is the z-score for the number 72?
(1) 0.67 (2) 1.00 (3) 1.25 (4) 1.50
4. In a set of 500 samples, the mean is 90 and the standard deviation is 17. If the data are normally distributed, how many of the 500 are expected to have a value between 93 and 101?
(1) 82 (2) 84 (3) 86 (4) 88

Lesson 16.8: Difference Values and Statistical Significance
Learning Goals:
What is a Diff value?
How do we determine if a Diff value is statistically significant?
What conclusions can we draw?

Warm-Up: 20 adult drivers were asked the following question: “What speed is the fastest that you have driven?”
The table below summarizes the fastest speeds driven in miles per hour (mph).
a. Calculate the mean of these speeds to the nearest hundredth.
$\bar{x} = 69.25$
We now want to separate our values into 2 groups by random assignment. The speeds were randomly placed in a bag and chosen one by one. The first value was placed in group 1, the second value in group 2, etc., until the 20 values have been randomly assigned to one of the groups:
b) Do you expect the means of these two groups to be equal? Why or why not?
No, because of sampling variability… the groups will have different numbers.
c) Calculate the means of these two groups, $\bar{x}_A$ and $\bar{x}_B$, to the nearest tenth. Write the means in the chart.
Group A mean = $\bar{x}_A = 71$
Group B mean = $\bar{x}_B = 67.5$
d) How do these two means compare to each other? To the mean of the whole group?
$\bar{x}_A > \bar{x}_B$ by 3.5 mph
Each group mean is close to the mean of the whole group, but not exact. Their average, $\frac{71 + 67.5}{2} = 69.25$, is the mean of the whole group.
e) Calculate the difference between the two means $\bar{x}_A$ and $\bar{x}_B$ for group A and group B. This is called a “Diff” value.
$\bar{x}_A - \bar{x}_B = 3.5$ (Diff value or Diff of means)
On average, group A was 3.5 mph faster than group B.

Key Ideas about Difference Values
The means of the two groups will tend to differ by chance.
A “Diff” value can be calculated by subtracting the mean for the second group from the mean for the first group: $\bar{x}_A - \bar{x}_B$. It can be positive or negative (above or below the middle).
The distribution of “Diff” values will be centered at 0.
The shape of the distribution of “Diff” values will be symmetrical and approximately normal.

Interpreting the Difference Value: Imagine that 10 tomatoes of varying shapes and sizes have been grown under similar conditions regarding soil, water, and sunlight, but 5 of the tomatoes received an additional nutrient supplement. The tomatoes have then been divided into two groups (A and B). The mean weight of each group has been calculated, and Diff = $\bar{x}_A - \bar{x}_B$.
a. 
Explain what a Diff value of 1.64 ounces would mean in terms of which group has the larger mean weight and the number of ounces by which that group’s mean weight exceeds the other group’s mean weight.
$\bar{x}_A - \bar{x}_B = 1.64$, or $\bar{x}_A > \bar{x}_B$ by 1.64 ounces
On average, group A tomatoes are 1.64 ounces heavier than group B tomatoes.
b. Explain what a Diff value of -0.4 ounces would mean in terms of which group has the larger mean weight and the number of ounces by which that group’s mean weight exceeds the other group’s mean weight.
$\bar{x}_A - \bar{x}_B = -0.4$, or $\bar{x}_B > \bar{x}_A$ by 0.4 ounces
On average, group A tomatoes are 0.4 ounces lighter than group B tomatoes.
c. Explain what a Diff value of 0 ounces would mean regarding the difference between the mean weight of the 5 tomatoes in Group A and the mean weight of the 5 tomatoes in Group B.
$\bar{x}_A - \bar{x}_B = 0$, so $\bar{x}_A = \bar{x}_B$
On average, group A tomatoes weigh the same (in ounces) as group B tomatoes.

Statistical Significance
Statistically significant means that the difference you found is unlikely to have occurred by chance alone. It means there is evidence of a real difference in the means, and you showed what you were trying to show.
Count up the number of values that are as extreme as or more extreme than the difference value.
Put that number over the total trials to find the percent of times this value occurs.
If it occurs more than 5% of the time → it is likely to occur → NOT statistically significant → this means that the difference could be due to chance.
If it occurs 5% or less of the time → it is unlikely to occur → it IS statistically significant → this means that the difference was likely due to the treatment in the experiment.

Determining Statistical Significance
1. Below is a dot plot that shows 20 difference values obtained from 20 possible randomizations.
a. Based on your dot plot, what is the probability of obtaining a Diff value of 10 or higher?
$\frac{1}{20} = 0.05 = 5\%$
b. Would a Diff value of 10 or higher be considered a difference that is likely to happen or one that is unlikely to happen? Explain.
Unlikely to happen (statistically significant) because 5% ≤ 5%.
c. Based on your dot plot, what is the probability of obtaining a Diff value of -2 or smaller?
$\frac{9}{20} = 0.45 = 45\%$
d. Would a Diff value of -2 or smaller be considered a difference that is likely to happen or one that is unlikely to happen? Explain.
Likely to occur because 45% > 5% (not statistically significant)

2. Below is a randomization distribution of the value Diff based on 100 random assignments.
a. Would a Diff value of -0.6 or less be considered a statistically significant difference? Why or why not?
$\frac{29}{100} = 29\%$
Not statistically significant because 29% > 5%
b. Would a Diff value of -1.4 or less be considered a statistically significant difference? Why or why not?
$\frac{4}{100} = 4\%$
Significant because 4% < 5%

Making Conclusions
3. One hundred students are randomly divided into two equally-sized groups. Each member of the first group (group A) is to take their test while listening to music. The second group (group B) is to take their test in a quiet setting. Students in both groups are given the same tests during the year. A summary of the two groups’ final grades is shown below:
Calculate the mean difference in the final grades (group A – group B) and explain its meaning in the context of the problem.
$\bar{x}_A - \bar{x}_B = 82.25 - 85.12 = -2.87$
On average, group A scored 2.87 points lower than group B.
A simulation was conducted in which the students’ final grades were re-randomized 500 times. The results are shown below:
Use the simulation to determine if there is a significant difference in the final grades. Explain your answer.
Diff = -2.87
$\frac{2 + 7 + 8 + 16}{500} = \frac{33}{500} = 0.066 = 6.6\%$
Not statistically significant because 6.6% > 5%

4. Ten tomatoes were randomly assigned to two groups, a control group where no nutrient treatment was given to the tomatoes over a period of ten days and an experimental group where the nutrient treatment, Grow More, was given to the tomatoes each day for ten days.
The researchers are interested in determining whether Grow More made a statistically significant difference in the weight of the tomatoes after ten days. When the tomatoes were weighed after the ten days, the following were the weights in ounces of the two groups.
a. Find $\bar{x}_A$ and $\bar{x}_B$ and the difference value for the two groups.
b. Interpret the difference value in the context of this problem.
In order to determine if this was a statistically significant difference, the researchers ran a simulation which randomly assigned the weights of the tomatoes into two groups, calculated the mean of each group, and found the difference value. This process was repeated 250 times, and the resulting “Diff” values are in the dot plot below, where each increment represents 0.04 ounces.
c. Using the distribution above, determine if the “Diff” value you found is an unusual occurrence. Support your decision using the results of the simulation above. Explain your reasoning.
Homework 16.8: Difference Values and Statistical Significance
1. Below is a randomization distribution of the value Diff based on 100 random assignments of these 20 observations into two groups of 10. Would a Diff value of 1.2 or greater be considered a statistically significant difference? Why or why not?
2. In which of the following situations is a sample mean of 55 most likely to be closest to the actual population mean?
(1) 100 students sampled and 1000 in population
(2) 100 students sampled and 500 in population
(3) 50 students sampled and 500 in population
(4) 50 students sampled and 1000 in population
3. A statistical simulation for a sample size of 100 items shows the mean to be 0.64 and the standard deviation of the sampling distribution of the sample proportions to be 0.048. What is the margin of error for this simulation for a confidence level of 95%?
(1) 0.096 (2) 0.072 (3) 0.103 (4) 0.051
4. In a recent experimental study, 40 participants were divided into two groups.
Group A maintained a regular exercise routine, while Group B maintained the same exercise routine but also drank a cup of green tea each day. At the end of the month, the weight loss of each of the 40 subjects was recorded, and the means for each group were as follows:
Group A: $\bar{x}_A = 5.3$ pounds   Group B: $\bar{x}_B = 8.6$ pounds
a. What was the value of $\bar{x}_A - \bar{x}_B$? Using proper units, explain the meaning of this calculation in terms of this experimental study.
b. The weight losses of the 40 participants were randomly shuffled into two treatment groups 100 times, and the distribution of the differences in sample means, specifically $\bar{x}_A - \bar{x}_B$, is shown below.
c. Based on this distribution, how significant is the induced (treatment) variability to the weight loss of the participants compared to natural variability? Explain your thinking.
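The counting procedure used throughout this lesson (re-randomize the data into two groups, record each Diff value, count the values at least as extreme as the observed Diff, divide by the number of trials, and compare with 5%) could be sketched in Python roughly as follows. This is only an illustrative sketch and not part of the original lesson: the function names, the seed, and the tomato weights are hypothetical, and it assumes "as extreme or more extreme" is counted in the same direction as the observed Diff, as in the worked examples.

```python
import random


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def randomization_diffs(group_a, group_b, trials=250, seed=0):
    """Shuffle the pooled values into two groups `trials` times and
    return the list of Diff values, mean(A) - mean(B)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    diffs = []
    for _ in range(trials):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        diffs.append(round(diff, 10))  # rounding avoids float noise in comparisons
    return diffs


def tail_proportion(diffs, observed):
    """Fraction of simulated Diff values as extreme as or more extreme than
    `observed`, counted in the same direction as `observed`
    (an observed value of 0 is treated as the upper tail)."""
    if observed >= 0:
        count = sum(1 for d in diffs if d >= observed)
    else:
        count = sum(1 for d in diffs if d <= observed)
    return count / len(diffs)


def is_significant(proportion, cutoff=0.05):
    """The lesson's rule: 5% or less of the time counts as statistically significant."""
    return proportion <= cutoff


if __name__ == "__main__":
    # Hypothetical weights in ounces, made up for illustration only.
    group_a = [9.1, 8.4, 8.0, 7.3, 5.9]
    group_b = [7.7, 6.4, 5.2, 5.0, 4.8]
    observed = round(mean(group_a) - mean(group_b), 10)
    diffs = randomization_diffs(group_a, group_b, trials=250)
    share = tail_proportion(diffs, observed)
    print(f"Diff = {observed}, share as extreme = {share:.1%}, "
          f"significant = {is_significant(share)}")
```

Applied to the worked answers above, `is_significant(1 / 20)` is `True` (5% meets the cutoff), while `is_significant(9 / 20)` and `is_significant(33 / 500)` are both `False`.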