Statistical Tests - Loudoun County Public Schools



APPENDIX B
Statistics for Science Research: Hints, Tips, Examples

Scientists analyze data collected in an experiment to look for patterns or relationships among variables. If we think we see a pattern or a relationship, we must complete one more step before we can be confident in the results. To determine that the patterns we observe are real, and not due to chance or our own preconceived notions, we must test the perceived pattern for significance. Statistical analysis provides that test. We can never be 100% sure, but we can set a level of certainty for our observations. A level of certainty accepted by most scientists is 95%. We will be using tests that allow us to say we are 95% confident in our results.

STEP ONE
Types of Data

1. Qualitative - data using non-standard scales (descriptions of leaf quality). Qualitative data are placed into discrete categories represented by word or number labels, or are measurements made with a nonstandard scale with unequal intervals.

Levels of Measurement
NOMINAL DATA - objects are placed into categories that cannot be ranked (male/female or brown, black, red hair)
ORDINAL DATA - objects are placed into categories that can be ranked (Mohs hardness scale or animal activity ranked 1-5)

2. Quantitative - measurements made using a scale with equal intervals (temperature of water in degrees Celsius).
Quantitative data consist of numbers representing counts or measurements.

Levels of Measurement
RATIO DATA - data collected using a scale with equal intervals and an absolute zero (temperature in kelvins, velocity)
INTERVAL DATA - data collected using a scale with equal intervals but no absolute zero (temperature in degrees Celsius, pH)

Decide which of the above types of data you have collected and record here: ____________________________

STEP TWO
Descriptive Statistics

Type of Descriptive Statistic                Quantitative (Interval, Ratio)        Qualitative (Nominal, Ordinal)
Central Tendency - the most typical value    Mean                                  Mode, Median
Variation - spread of data                   Range, Variance, Standard Deviation   Frequency Distribution

Mode - value that occurs most often (in a tie, use both)
Median - middle value when the data are ranked from highest to lowest
Mean (x̄) - mathematical average
Range - difference between the smallest and largest values
Variance - average squared distance from the mean (how spread out the values in a set of data are)
Standard Deviation (Sx) - a measure of how closely the individual data points cluster around the mean
Frequency Distribution - number of cases falling into each category of the variable
Number (n) - number of data points

Use the table above to decide which type of descriptive statistics you will do and list them here:

Using the TI84 Plus to Find Descriptive Statistic Values
The buttons on the calculator are indicated in bold.
Push the ON button.
Push 2nd (blue key) then LIST.
Push >> until MATH is highlighted.
Arrow down until MEAN is selected. Hit ENTER.
Enter your list of data points according to this format: ({5,6,7,3,10}) then hit ENTER.
The answer should be displayed to the right.
Push 2nd then LIST.
Push >> until MATH is highlighted.
Arrow down until MEDIAN is selected.
Hit ENTER.
Enter your list of data points according to this format: ({5,6,7,3,10}) then hit ENTER.
The answer should be displayed to the right.
Repeat the above steps for standard deviation and variance.
__________________________________________________________________

STEP THREE
For Quantitative: Follow the directions above for using the TI84 Plus and record these values here:
Mean ______________ Range _______________ Variance _____________ Standard Deviation ___________
For Qualitative: Determine the mode, median and frequency distribution and record here:
Mode _____________ Median __________________ Frequency Distribution ___________________________

STEP FOUR
Inferential Statistics
Inferential statistics are used to determine whether differences in the data are statistically significant. They limit the possibility that the differences occurred by random chance or because of some unknown, uncontrolled variable. If the data are shown to be statistically significant, then the differences can be explained by changes in the independent variable.

Statistical Tests
The t-test (or Analysis of Variance): An analysis of variance is used when you have two or more groups and you want to compare measurements of each group. The t-test analyzes the relationship between two groups.
The Chi-square test: This test is used when you have counts that can be placed into yes or no categories, or other simple categories such as quadrats.
The Pearson R Correlation: The Pearson R Correlation allows you to test how the values of one event or object relate to the values of another event or object.

                         Quantitative                             Qualitative
Inferential Statistics   t-Test (t), Pearson R Correlation        Chi-Square (χ²)

Decide which of the inferential statistics you will be doing; calculate your Degrees of Freedom.
Record here: Stats ____________________ Degrees of Freedom ______________

Level of Significance - We will use 0.05, which means that the probability of error in the research is 5/100 (95% confidence).

Degrees of Freedom (df) - The number of values in a calculation that are free to vary; it is based on the number of observations in a sample. To calculate:
For t-test: df = (n1 - 1) + (n2 - 1)
For Chi-square test: df = (#rows - 1)(#columns - 1)
For Pearson R correlation: df = n - 2 (subtract 2 from the number of comparisons made)

Null Hypothesis (μ) - Basically states that there is no difference between the mean of your control group and the mean of your experimental group. Therefore any observed difference between the two sample means occurred by chance and is not significant. If you can disprove your null hypothesis, then there is a significant difference between your control and experimental groups.

STEP FIVE
Three options for your null hypothesis

μ1 = μ2 - This states that the two means are equal (experimental {1} and control {2}). To use this to reject your null hypothesis, the absolute value of your t-value must be > table value or your χ² calculated > χ² table.

μ1 < μ2 - This states that the mean of your experimental group is lower than the mean of the control group. For example, in golf, the lower score is the better score. To use this to reject your null hypothesis, your t-value must be < the negative of the table value or your χ² calculated > χ² table.

μ1 > μ2 - This states that the mean of your experimental group is higher than the mean of the control group. For example, plants with fertilizer grow higher than those without. To reject your null hypothesis, your t-value must be > table value or your χ² calculated > χ² table.

Write your null hypothesis here: ________________________________________________________________________

Graphing calculators are helpful in determining T-TEST and CHI-SQUARE.
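
The degrees-of-freedom formulas above can be sketched in Python as a quick check on hand calculations. This is only an illustration: the function names are made up for this sketch and are not part of the handout, and the sample sizes passed in are example inputs.

```python
def df_t_test(n1, n2):
    """Two-sample t-test: df = (n1 - 1) + (n2 - 1)."""
    return (n1 - 1) + (n2 - 1)


def df_chi_square(rows, columns):
    """Chi-square test on a rows x columns table: df = (rows - 1)(columns - 1)."""
    return (rows - 1) * (columns - 1)


def df_pearson(n_pairs):
    """Pearson R correlation: df = n - 2, where n is the number of paired values."""
    return n_pairs - 2


# Example inputs (assumed for illustration only).
print(df_t_test(10, 10))    # two groups of 10 measurements each -> 18
print(df_chi_square(3, 3))  # a 3 x 3 table of counts -> 4
print(df_pearson(15))       # 15 paired values -> 13
```

Note that the chi-square formula assumes a table with at least two rows and two columns.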
Using the TI84 Plus to Find Inferential Statistic Values

T-TEST
The buttons on the calculator are indicated in bold.
Push the ON button.
Push 2nd MEM.
Arrow down to 4:ClrAllLists and hit ENTER.
Hit ENTER again.
The screen should say DONE.
Push STAT.
Select 1:Edit by hitting ENTER.
Under L1, type in the data from your experimental group. Type in the numbers and hit ENTER in between each.
Arrow over to L2 and type in the data from your control group.
When done hit STAT again.
Push >> to get to TESTS.
Arrow down to option 4:2-SampTTest and hit ENTER.
Make sure that Data is highlighted.
Arrow down and select the correct hypothesis: μ1 ≠ μ2, μ1 < μ2, μ1 > μ2.
Make sure Pooled is set to NO.
Arrow down to CALCULATE and hit ENTER.
Your t-value is indicated by t =

CHI-SQUARE
Push 2nd MEM.
Arrow down to 2:Delete and hit ENTER.
Arrow down to 5:Matrix and hit ENTER.
Hit ENTER for each matrix entry ([A], [B], and so on) that is listed.

Example: A researcher tests the hypothesis that there was no significant difference in the amount of graphing calculator use demanded by the different tests given to the three senior classes at Roosevelt High. She analyzed each of the three 50-item tests and classified each item as inactive, neutral, or active depending on the extent of calculator use required. Use the tallies shown in the 3 x 3 matrix to test the hypothesis.
            Test A   Test B   Test C
Inactive      16       19       13
Neutral       14       10       26
Active        20       21       11

To enter the data in your matrix
Note: Your matrix must be at least 2 x 2; if you have a 1 x 2, please ask your teacher for additional instructions.
Push 2nd then push MATRIX.
Push >> to get to EDIT (you must set up a matrix to record the data for the χ² test) and hit ENTER.
Set up the values for your matrix (rows x columns); the matrix for the example is 3 x 3. Select 1:[A] by hitting ENTER.
Begin to enter the data for the columns and rows exactly as it is in your matrix table.
Push STAT and push >> to get to TESTS.
Arrow down to C:χ²-Test and hit ENTER.
Arrow down to Calculate and hit ENTER.
Your CHI-SQUARE value is indicated by χ² =

To view your expected values:
Push MATRIX.
Arrow over to EDIT and select 2:[B].
Hit ENTER and your expected values will be listed in the B matrix.

To Calculate Chi-square Manually
Use the formula:

\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]

χ² = Chi-square
Σ = Sum of the values
O = Observed Frequency Distribution
E = Expected Frequency Distribution

Example: Mary read that bees were attracted to the color yellow as opposed to red, blue, or white. She wondered if crickets would show a color preference. To test her hypothesis that crickets would be differentially attracted to colors, she placed 100 crickets in a container. The bottom of the container was divided into four equal sections covered by red, blue, yellow, or white paper. She observed the number of crickets on each color one hour after placing them in the container. The distribution of crickets was: 30 red, 40 blue, 12 yellow, 18 white. By chance alone, an equal number of crickets on each color of paper would be expected.

Determine the Observed Frequency Distribution:
Red   Blue   Yellow   White
30    40     12       18

Determine the Expected Frequency Distribution:
Red   Blue   Yellow   White
25    25     25       25

Use the formula to calculate χ².

PEARSON R CORRELATION COEFFICIENT
To calculate the Pearson R value in this handout, use the Microsoft Excel program on the computer. The TI calculator steps above do not cover it.
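
The manual chi-square formula can be sketched in Python using the two examples from this section. This is a minimal illustration rather than a replacement for the calculator steps: the function names are invented for this sketch, and the expected counts for the 3 x 3 table are computed with the standard rule (row total times column total, divided by the grand total), which the handout leaves to the calculator.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Cricket example: 100 crickets, equal numbers expected on each color.
cricket_observed = [30, 40, 12, 18]   # red, blue, yellow, white
cricket_expected = [25, 25, 25, 25]
cricket_chi2 = chi_square(cricket_observed, cricket_expected)
print(round(cricket_chi2, 2))  # 18.72

# Calculator-use example: rows are inactive, neutral, active;
# columns are Test A, Test B, Test C.
test_table = [[16, 19, 13],
              [14, 10, 26],
              [20, 21, 11]]
test_expected = expected_counts(test_table)
test_chi2 = sum(chi_square(obs_row, exp_row)
                for obs_row, exp_row in zip(test_table, test_expected))
print(test_chi2)
```

Without rounding the individual terms, the cricket example gives 18.72; the handout's value of 18.6 comes from adding the four terms after cutting each to one decimal place (1.0 + 9.0 + 6.7 + 1.9). Either value is well above the table value of 7.815 for df = 3. The 3 x 3 example works out to about 12.945 with df = 4, which can be compared with the table value for 4 degrees of freedom (9.488 at the 0.05 level in standard chi-square tables).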
Calculate your t-value, Chi-Square, or Pearson R and record here:
(Note: you will have different values for each of your experimental groups)

STEP SIX
Deciding to Accept or Reject the Null Hypothesis
Use the tables for the t-test and the Chi-square test to find the table value. Use your calculated degrees of freedom and the Level of Significance of 0.05 (95%) to find the correct value.
Determine if the calculated value is greater or less than the table value.
For t-test: Refer to the null hypothesis descriptions for the decision to accept or reject the null hypothesis.
For Chi-square: If χ² calculated > χ² table, then the null hypothesis is rejected.
For Pearson R Correlation: If the calculated value is greater than the table value, reject the null hypothesis.
If r = 0.00 there is zero correlation.
If r = 1.00 there is a perfect correlation.
Values can be + or -. Positive values indicate that an increase in X corresponds to an increase in Y. Negative values indicate that increases in one value are associated with decreases in the other.

Decide whether to accept or reject your null hypothesis. Accept _________ Reject ________

STEP SEVEN
What Does it Mean to Accept or Reject the Null Hypothesis?
The null hypothesis generally states that there is no significant difference between your two sets of data. If it is accepted, it means that any differences in your data are not significant and are probably due to random chance. If the null hypothesis is rejected, it means that there is a significant difference between your two sets of data and these differences are due to the factors (independent variable) that you changed.

Make a statement regarding your null hypothesis. For example (from above):
At df = 3, α = 0.05, χ² = 7.815 for significance; the calculated χ² of 18.6 > 7.815 and is significant.
The null hypothesis is rejected and the research hypothesis that crickets would be differentially attracted to colors was supported.

Your statement: ________________________________________________________________________________________________________________________________________________

ANOVA Statistical Tests (to compare 3 or more groups)
Websites for free calculators online: for an explanation of ANOVA ("Analysis of Variance"), see Wikipedia, the paragraphs below, or the websites below.

Like the two-sample t-test, ANOVA lets us test hypotheses about the mean (average) of a dependent variable across different groups. While the t-test is used to compare the means between two groups, ANOVA is used to compare means between 3 or more groups. There are several varieties of ANOVA, such as one-factor (or one-way) ANOVA, two-factor (or two-way) ANOVA, and so on, and also repeated measures ANOVA. The factors are the independent variables, each of which must be measured on a categorical scale; that is, levels of the independent variable must define separate groups.

One-Way ANOVA Example
One-factor ANOVA, also called one-way ANOVA, is used when the study involves 3 or more levels of a single independent variable. For example, we might look at average test scores for students exposed to one of three different teaching techniques (three levels of a single independent variable).

ANOVA Statistics
The null hypothesis for ANOVA is that the mean (average value of the dependent variable) is the same for all groups. The alternative or research hypothesis is that the average is not the same for all groups. The ANOVA test procedure produces an F-statistic, which is used to calculate the p-value. As described in the topic on Statistical Data Analysis, if p < .05, we reject the null hypothesis. We can then conclude that the average of the dependent variable is not the same for all groups. With ANOVA, if the null hypothesis is rejected, then all we know is that at least 2 groups are different from each other.
In order to determine which groups differ from which, post-hoc t-tests are performed using some form of correction (such as the Bonferroni correction) to adjust for an inflated probability of a Type I error.

Examples of Statistical Data Tables

Quantitative
TABLE 10.5 Effect of Fertilizer on the Mean Height (cm) of Bean Plants

Descriptive Information   Commercial   Compost   Control
Mean                      7.0          5.0       4.0
Variance                  3.6          2.2       2.0
Standard Deviation        1.9          1.5       1.4
Number                    10           10        10

Results of t-test
Commercial vs. Compost: t = 2.6, 0.01 < p < 0.05
Compost vs. Control: t = 1.5, p > 0.01
Commercial vs. Control: t = 4.0, p < 0.00
At df 18; α of 0.01; t = 2.878 for significance

Qualitative
TABLE 10.7 Attraction of Crickets to Various Colors

Information                      Red    Blue   Yellow   White   Number
Observed Distribution            30     40     12       18      100
Expected Distribution (Chance)   25     25     25       25      100
Calculated χ²                    1.0    9.0    6.7      1.9
Mode: Blue

Results of the Chi-square test
χ² = 18.6 at df = 3
χ² of 18.6 > 7.815
p < 0.001

Tables from "Students and Research", 2nd Edition, Cothron, Julia, Giese, Ronald, Rezba, Richard. Kendall/Hunt Publishing Company. Dubuque, Iowa. 1993.

Sample Statistical Analysis for Quantitative Data
FOR EXAMPLE - A student tested the effect of different types of fertilizers on plants. Below are his data for the control and fertilizer A.

Trial Number   Control Group Height of plant (mm)   Fertilizer A Height of plant (mm)
1              45.0                                 47.4
2              46.2                                 48.5
3              51.4                                 55.2
4              43.2                                 49.1
5              44.1                                 52.3
6              42.7                                 56.2
7              41.8                                 51.9
8              42.6                                 52.9
9              41.8                                 51.6
10             42.4                                 49.8
11             43.1                                 52.7
12             44.3                                 56.1
13             43.2                                 57.3
14             42.6                                 56.2
15             43.4                                 58.2

Steps for Using Excel for Statistics
1. Enter the data above into your Excel spreadsheet. It should look like the spreadsheet below.
2. Set up a table below your data table for your descriptive statistics. You should include mean, range, variance, and standard deviation.
3. Click in the cell for the mean of the control. Click on Formulas on the toolbar. Click on fx and the Insert Function box will open. This will allow you to insert a formula into the spreadsheet.
4. The mean of a set of numbers is the average. In the select category box, select Statistical. Under select a function, select AVERAGE and then click OK.
5. A box titled Function Arguments will open.
6. Take the mouse and highlight the numbers. A dotted line will appear around the column.
7. You will see that the average has been calculated to be 43.85333. Click OK. The average will be transferred to the mean cell in the spreadsheet.
8. Repeat steps 3 - 7 to calculate the mean for the data for Fertilizer A. The mean value you calculate for Fertilizer A should be 53.02667.
9. To calculate the range, subtract the smallest number from the largest number. Enter the value into the cell for that value.
10. To calculate the variance, repeat steps 3 - 7 selecting VAR from the menu.
11. To calculate the standard deviation, repeat steps 3 - 7 selecting STDEV from the menu.
12. Your calculations should give you the following values:

                     Control    Fertilizer A
Mean                 43.8533    53.0267
Range                9.6000     10.8000
Variance             5.7627     11.5192
Standard Deviation   2.4006     3.3940

13. We are going to calculate a value for the t-test. In the area below the standard deviation value, type the word T-Test.
14. Click on the cell next to the T-Test cell.
15. Click on Formulas on the toolbar. Click on fx and the Insert Function box will open.
16. In the selection area, select TTEST. Your screen should look like this:
17. Click on OK. Your screen should look like this:
18. Click in the box next to Array1. Highlight the numbers in the control column.
19. Click in the box next to Array2. Highlight the numbers in the Fertilizer A column.
20. Click in the box next to Tails. If you have a one-tailed test, type 1. If you have a two-tailed test, type 2.
21. What is the meaning of a two-tailed test? If you are using a significance level of alpha = 0.05, a two-tailed test allots half of your alpha to testing the statistical significance in one direction and half of your alpha to testing statistical significance in the other direction.
This means that 0.025 is in each tail of the distribution of your test statistic. When using a two-tailed test, regardless of the direction of the relationship you hypothesize, you are testing for the possibility of the relationship in both directions.
22. For a one-tailed test, you are testing for the possibility of the relationship in only one direction: either the left-tail area or the right-tail area.
23. We are doing a two-tailed test, so you need to enter 2 next to Tails.
24. Click in the box next to Type. If you are doing a paired test, enter 1. If you are doing a t-test in which the two samples have equal variances, type 2. If the two samples have unequal variances, type 3. Our variances are not equal, so type 3.
25. Your screen should look like this:
26. Click on OK.
27. You get a value of 6.46129E-09. This is the p-value: the probability of getting a difference at least this large by chance if the null hypothesis were true. Since the p-value is far below 0.05, you would reject the null hypothesis.

Making a graph of your data.
You want to graph your descriptive statistics. Highlight your descriptive statistics.
1. Click on Insert on the toolbar.
2. Click on the type of graph you want. Click on the columns.
3. Click on 2-D columns.
4. If your graph covers your data, you can click on the graph and move the graph.

Sample Statistical Analysis for Qualitative Data
FOR EXAMPLE - A student tested the effect of different types of fertilizers on plants. The student developed a rubric for the health of the plants. A 1 was not very healthy and a 5 was very healthy. Below are his data for the control and the different strengths of fertilizer A.

Trial Number   Control Group     Fertilizer A 2%   Fertilizer A 4%   Fertilizer A 6%
               Health of plant   Health of plant   Health of plant   Health of plant
1              3                 4                 4                 5
2              4                 4                 4                 4
3              3                 4                 4                 5
4              3                 4                 5                 5
5              5                 4                 4                 5
6              3                 4                 5                 5
7              3                 4                 4                 4
8              3                 4                 5                 5
9              4                 4                 5                 5
10             3                 4                 4                 5
11             3                 4                 5                 5
12             4                 4                 4                 5
13             4                 4                 5                 4
14             3                 3                 4                 5
15             3                 3                 5                 5

Steps for Using Excel for Statistics
1. Enter the data above into your Excel spreadsheet. It should look like the spreadsheet below.
2. Set up a table below your data table for your descriptive statistics. You should include the mode and the median.
3. Click in the cell for the mode of the control.
4. Click on Formulas on the toolbar. Click on fx and the Insert Function box will open. This will allow you to insert a formula into the spreadsheet.
5. In the select category box, select Statistical. Under select a function, select MODE and then click OK.
6. A box titled Function Arguments will open.
7. Take the mouse and highlight the numbers. A dotted line will appear around the column.
8. You will see that the mode has been calculated to be 3. Click OK. The mode will be transferred to the mode cell in the spreadsheet. Your spreadsheet should look like this.
9. Repeat steps 3 - 8 to find the mode for the different percentages of Fertilizer A. The mode represents the number that appears most often. If no number appears more than once, you will get an error message and the column will not have a mode. Your results should look like this.

         Control Group     Fertilizer A 2%   Fertilizer A 4%   Fertilizer A 6%
         Health of plant   Health of plant   Health of plant   Health of plant
Mode     3                 4                 5                 5
Median

10. To calculate the median, repeat steps 3 - 8 selecting MEDIAN from the function list. Your results should look like this.

         Control Group     Fertilizer A 2%   Fertilizer A 4%   Fertilizer A 6%
         Health of plant   Health of plant   Health of plant   Health of plant
Mode     3                 4                 5                 5
Median   3                 4                 5                 5

Making a graph of your data.
You want to graph your descriptive statistics. Highlight your descriptive statistics.
1. Click on Insert on the toolbar.
2. Click on the type of graph you want. Click on the columns.
3. Click on 2-D columns.
4. If your graph covers your data, you can click on the graph and move the graph.

Doing Chi-Square in EXCEL
There is a function in EXCEL called CHITEST. CHITEST does not return a value for Chi-Square.
It skips that step and returns the probability that you will get a Chi-Square at least as high as the one you calculate from the observed values and predicted values. The problem is that CHITEST's degrees of freedom are not always calculated correctly. Depending on the case, you can lose one or two degrees of freedom using CHITEST. Because CHITEST is basing its answer on less than the correct degrees of freedom, it gives you an inappropriately large value for the probability.

After Chi-Square has been calculated by hand, you can use the CHIDIST worksheet function to make a judgment about the Chi-Square value.
1. Select a cell to store the result.
2. From the Statistical Functions menu, select CHIDIST to open the Function Arguments dialog box for CHIDIST.
3. In the Function Arguments dialog box, type the values asked for in the box.
4. In the X box, type the calculated Chi-Square value. For an example, put 36 in the X box.
5. In the Deg_freedom box, type the degrees of freedom. After typing the degrees of freedom, the dialog box shows the one-tailed probability of obtaining at least this value of Chi-Square. For the example we are doing, type 25 for the degrees of freedom.
6. The Function Arguments dialog box should look like this:
7. Click OK to close the dialog box and put the answer in the selected cell.
8. The value in the dialog box is greater than .05, so the decision is not to reject the null hypothesis.

Pearson Correlation
This is a data analysis for a t-test for a paired two sample for means.
1. Enter the data for each sample into a separate data array. For example, we have the before data in column B and the after data in column C.
2. Select Data, then Data Analysis to open the Data Analysis dialog box. The Data Analysis ToolPak must be loaded as an add-in.
3. In the Data Analysis dialog box, scroll down the Analysis Tools list and select t-Test: Paired Two Sample for Means.
4. Click OK to open this tool's dialog box.
5. In the Variable 1 Range box, enter the cell range that holds the data for one of the samples. Click in the Variable 1 Range box, then highlight the data in the B column. The range will appear in the box.
6. In the Variable 2 Range box, enter the cell range that holds the data for the other sample. Click in the Variable 2 Range box, then highlight the data in the C column. The range will appear in the box.
7. In the Hypothesized Mean Difference box, type the difference between μ1 and μ2 that Ho specifies. In this example, the difference is 0.
8. If the cell ranges include column headings, check the Labels checkbox. These were included, so the box needs to be checked.
9. The Alpha box has 0.05 as a default. Change that value if you want to use a different α.
10. In the Output Options, select a radio button to indicate where you want the results. For this example, New Worksheet Ply was selected to put the results on a new page in the worksheet.
11. Click OK. Because New Worksheet Ply was selected, a new page opens with the results.
12. After the new page opens with the results, you need to expand the columns to read the results.
13. Cell B7 shows a value for the Pearson Correlation Coefficient. The coefficient will be a number between -1 and +1. It shows the strength of the relationship between the data in the first sample and the data in the second sample.
14. If this number is close to 1, high scores in one sample are associated with high scores in the other sample and low scores in one are associated with low scores in the other. If this number is close to -1, high scores in the first sample are associated with low scores in the second and low scores in the first are associated with high scores in the second.
15. If the number is close to zero, the scores in the first sample are not related to scores in the second sample. Our example gives us a value close to one.
16. Cell B9 shows the degrees of freedom.
17. Cell B8 shows the Ho specified difference between the population means.
18. Cell B10 gives the calculated value of the test statistic.
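
The three output values described above (the Pearson correlation coefficient, the degrees of freedom, and the test statistic) can be sketched in Python to show what the Excel tool computes. This is an illustration under stated assumptions: the before and after numbers are hypothetical, because the handout does not list its example data, and the function names are invented for this sketch.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y, hypothesized_difference=0.0):
    """Paired t statistic and its degrees of freedom (number of pairs - 1)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t = (mean_d - hypothesized_difference) / math.sqrt(var_d / n)
    return t, n - 1


# Hypothetical before/after measurements, for illustration only.
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 17, 13]

r = pearson_r(before, after)
t, df = paired_t(before, after)
print("r =", round(r, 3))
print("t =", round(t, 3), "df =", df)
```

With these made-up numbers the coefficient is close to 1, so high before values go with high after values. The sign of t depends on which sample is entered first: here the before values are smaller, so t is negative.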