Annalyn.weebly.com



Annalyn Hegemann
Math 1040-001
Term Project

PART ONE

The data set our group chose to serve as our population for the entire project was the Exhale study. It consists of 654 lines of data in its entirety.

PART TWO

We first took the categorical variable, age range, and created a pie chart to show the population proportion in each category.

For the next portion I chose to create a Simple Random Sample (SRS) with a sample size of 35. To create it I put the full list of data in Excel and added a new column to hold a set of random numbers. In the first row I typed in =RAND() and hit Enter. This created a random number for each of the rows of data. I did a Sort seven times, which continued to mix up the randomized numbers. I then removed the randomization function (paste/values), sorted from smallest to largest, and took my sample of 35. I plugged this data into StatCrunch, put it in a Pareto Chart layout, and labeled the X and Y axes.

Next I chose to create a Systematic Sample with a sample size of 35. To create this I first determined that if I wanted 35 lines of data out of 654 I would divide 654 by 35, which gives about 18.7, and round down to 18. That gave me my ‘K’ number. I then repeated the =RAND() procedure from the SRS and took every 18th line of data. I again put that into StatCrunch, created a Pareto Chart, and labeled the X and Y axes.

Changing the way in which I drew my samples (simple random vs. systematic) barely altered the results. In fact, Youth had the same count in both samples, while Elementary differed by only one. The biggest differences were in the smaller categories. When compared with the population, the Systematic sample more closely matched the large-to-small ordering of the categories: Youth, Elementary, Junior, Adolescent, Toddler (the last two were reversed but the rest matched up). As such I feel that we can get a good representation of the population from these two sampling methods.
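The two sampling procedures described in Part Two can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet workflow itself: the function names and the stand-in row numbers are hypothetical, and the systematic version uses a random starting row followed by every k-th row, which is a common textbook variant rather than the reshuffle-then-count approach described above.

```python
import random


def simple_random_sample(rows, n, seed=None):
    """Mimic the spreadsheet method: attach a random key to every row,
    sort by that key, and keep the first n rows."""
    rng = random.Random(seed)
    keyed = [(rng.random(), row) for row in rows]
    keyed.sort(key=lambda pair: pair[0])  # sort on the random key only
    return [row for _, row in keyed[:n]]


def systematic_sample(rows, n, seed=None):
    """Take every k-th row after a random start, where k = len(rows) // n.
    (Assumption: random start instead of reshuffling the rows first.)"""
    k = len(rows) // n
    if k < 1:
        raise ValueError("sample size is larger than the data set")
    start = random.Random(seed).randrange(k)  # start somewhere in the first k rows
    return rows[start::k][:n]


# Stand-in for the 654 lines of data: just the row numbers 1..654.
population_rows = list(range(1, 655))

k = len(population_rows) // 35  # 654 // 35 gives 18
srs = simple_random_sample(population_rows, 35, seed=2024)
systematic = systematic_sample(population_rows, 35, seed=2024)
print(k, len(srs), len(systematic))
```

Passing a seed makes a run repeatable, which the =RAND() column does not offer until its values are pasted in place.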
PART THREE

We chose Forced Expiratory Volume as our quantitative variable for this portion of the project. We computed the following information using StatCrunch: Population Mean = 2.637, Population Standard Deviation = 0.8671.

For the samples I chose Simple Random Sample and Systematic again (using the same methodology for getting the samples as described in the Part Two reflection). When comparing the two different bar graphs and box plots, I was surprised to see the Systematic sample mean and standard deviation so far away from the population mean and standard deviation. The SRS standard deviation was about .03 away from the population standard deviation, which already seemed a touch large, but the Systematic sample's gap (about .06) was roughly double that. Additionally, the histograms for the Systematic sample and the SRS showed only a slight resemblance to each other. Both would fall under the Normal Distribution, but they still told slightly different stories. Included is a comparison of the numbers for each set of data.

Population Mean: 2.637
Population Standard Deviation: 0.8671
Simple Random Sample Mean: 2.598
SRS Standard Deviation: 0.8995
Systematic Sample Mean: 2.828
Systematic Standard Deviation: 0.8052

Simple Random
Outlier: 5.083 (Min: 1.338, Q1: 1.751, Q2: 2.384, Q3: 3.004, Max: 4.22)

Systematic
Outlier: none (Min: 1.624, Q1: 2.069, Q2: 2.798, Q3: 3.32, Max: 4.393)

PART FOUR

Categorical Variable: Age Range (calculations)
P: 11% (pop. proportion 35/322 youth)
n: 322 (Youth)
x: 35 (sample size)
p-hat: .11 (35/322)
95% Confidence Interval
α = .05, α/2 = .025, z-score = 1.96

Quantitative Variable: Forced Expiratory Volumes (calculations)
μ: 2.6368
σ: 0.8671
90% Confidence Interval
α = .10, α/2 = .05, critical value = 2.032

Sample 1 (SRS): n: 35, x-bar: 2.8329, s: 0.7678
I am 90% confident that our population mean is between 2.60 and 3.12.

Sample 2 (systematic): n: 35, x-bar: 2.8291, s: 0.7766
I am 90% confident that our population mean is between 2.56 and 3.10.

When we look at the Quantitative variable and do a 90% confidence interval, we can see that in the first sample (SRS) it ranges from 2.60 to 3.12, and in the second sample (systematic) it ranges from 2.56 to 3.10. These are very close to each other, so it’s no surprise that the population mean of 2.64 falls within both confidence intervals. As such, both intervals did capture the population parameter.

PART FIVE

The mean Forced Expiratory Volume in liters is 2.637. My claim, based on 130 simple random samples from the Youth category, is tested with the following hypotheses:

H0: p = .5
H1: p ≠ .5

Requirements
Simple Random Sample
Fixed number of independent trials with two categories
np is greater than or equal to 5, and nq is greater than or equal to 5

N: 322
p-hat: 130 / 322 = .404
p: .50
q: 1 - .50 = .50

This is a two-tailed test because the alternative hypothesis was p ≠ .5. The test statistic of -3.455 corresponds to a tail area of .0003, so we need to double this area (.0006). I used α = .05 as my level of significance. Since .0006 is smaller than .05, I reject the null hypothesis. If the null hypothesis were actually true, this rejection would be a Type I error.
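The arithmetic behind the Part Five test statistic and the Part Four interval for a mean can be checked with a short Python sketch. The function names are hypothetical, and the critical value is passed in as quoted in Part Four rather than looked up. As a side note, 2.032 matches the t critical value usually tabulated for 95% confidence with 34 degrees of freedom, so the stated confidence level may be worth double-checking.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0):
    """Two-tailed one-proportion z-test. Returns (z, p_value)."""
    p_hat = x / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # uses the hypothesized p0
    z = (p_hat - p0) / standard_error
    p_value = 2 * NormalDist().cdf(-abs(z))  # double the single-tail area
    return z, p_value


def mean_interval(xbar, s, n, critical_value):
    """Confidence interval for a mean: xbar +/- critical_value * s / sqrt(n)."""
    margin = critical_value * s / sqrt(n)
    return xbar - margin, xbar + margin


# Part Five: 130 out of 322, tested against p = .5
z, p_value = proportion_z_test(130, 322, 0.5)
print(f"z = {z:.3f}, reject H0 at alpha = .05: {p_value < 0.05}")

# Part Four, Sample 2 (systematic), with the critical value quoted in the text
low, high = mean_interval(2.8291, 0.7766, 35, 2.032)
print(f"interval: {low:.2f} to {high:.2f}")
```

The z value comes out at about -3.455, in line with the figure reported above, and the Sample 2 interval rounds to 2.56 and 3.10.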
PART SIX

In this project I attempted to take data regarding the impacts of smoking on individuals ranging in age from 3 to 19, based on their forced expiratory volumes in liters, and draw conclusions by taking samples and using statistics to tell a story about how well the samples chosen mirrored the full data set. I have learned that doing something as comprehensive as a census is next to impossible and would require more manpower and money than most groups can provide, yet understanding aspects of the population is important in almost all facets of our lives. As such, statistics has taught me how samples can be pulled (and which methods are effective and which are ineffective or flawed) and how the information drawn from a sample of people can be tested to see whether it truly represents the entire population. I’ve already found myself reading articles that throw out statistics and picking out the flaws in their data. That has been fantastic!

As for the mathematical calculations and their impact on my future, I don’t see that they will make a huge difference at this juncture in my life. I’m already 16 years into a career that is heavy on writing and needs almost zero mathematics. But the logic used in statistics, and learning not to just take something at face value, is invaluable in our electronic age, when we’re bombarded with ‘facts’ about the world around us yet rarely taught how to dissect them to get to the truth in the story. That is how this project changed my way of thinking about real-world math applications. Math is around us more than we know, but it’s digging at the data and understanding how to identify good math and good statistics that can make a big difference in our view of things.
I have been pleasantly surprised to find that although I struggle with math, I actually began to understand much of what was taught in this class and on this project, though admittedly I still have a long way to go to feel proficient in the topic, and there were areas in which I struggled a lot. I do look forward to applying my knowledge on a personal level and continuing to expand my understanding of the topic.