Contingency Tables With Ordinal Variables



The (N-1) Chi-Square: Contingency Tables With Ordinal Variables and 2 x 2 Tables

Contingency Tables with Ordinal Variables

David Howell presents a nice example of how to modify the usual Pearson χ² analysis if you wish to take into account the fact that one (or both) of your classification variables can reasonably be considered to be ordinal (Statistical Methods for Psychology, 8th ed., 2013, pages 317-319). Here I present another example.

The data are from the article "Stairs, Escalators, and Obesity," by Meyers et al. (Behavior Modification 4: 355-359). The researchers observed people using stairs and escalators. For each person observed, the following data were recorded: whether the person was obese, overweight, or neither; whether the person was going up or going down; and whether the person used the stairs or the escalator. The weight classification can reasonably be considered ordinal. The data are in Escalate.sav on my SPSS Data Page.

Before testing any hypotheses, let me present the results graphically:

[Figure: Percentage Use of Staircase Rather than Escalator Among Three Weight Groups]

Initially I am going to ignore whether the shoppers were going up or going down and test to see if there is a relationship between weight and choice of device. Here is the SPSS output:

[SPSS output not shown]

Notice that the Pearson Chi-Square is significant. Now we ask, “Is there a linear relationship between our weight categories and choice of device?” The easy way to do this is just to use a linear regression to predict device from weight category.

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .029a   0.000845   .001                .345
a. Predictors: (Constant), weight

As you can see, the linear relationship is not significant. If you look back at the contingency table you will see that the relationship is not even monotonic.
As you move from obese to overweight, the percentage use of the stairs rises dramatically, but then as you move from overweight to normal weight it drops a bit.

A chi-square for the linear effect can be computed as

χ² = (N − 1)r² = 3215(0.000845) = 2.717,

within rounding error of the “Linear by Linear Association” reported by SPSS.

We could also test the deviation from linearity by subtracting the linear χ² from the overall χ²: 11.752 − 2.717 = 9.035. The df are also obtained by subtraction, overall less linear = 2 − 1 = 1. P(χ² > 9.035 | df = 1) = .0026. There is a significant deviation from linearity.

Now let us split the file by the direction of travel. If we consider only those going down, there is a significant overall effect of weight category but not a significant linear effect:

[SPSS output not shown]

If we consider only those going up, there is a significant linear effect, and the deviation from linearity is not significant, χ²(1, N = 1362) = 2.626, p = .105.

2 x 2 Chi-Square

Campbell (2007) has shown that the (N-1) χ² is the best procedure to use when conducting 2 x 2 contingency table analysis, and Busing, Weaver, and Dubois (2015) presented code for computing it with several stats packs, including R, SAS, and SPSS. Here is an example using SAS and SPSS with the Howell data set.

Correlations (socprob with dropout)
Pearson Correlation   0.435897**
Sig. (2-tailed)       .000
N                     88
**. Correlation is significant at the 0.01 level (2-tailed).

The phi coefficient, .435897, differs significantly from zero. The (N-1) χ² is 87(.435897)² = 16.531. Here is some SPSS crosstabs output:

socprob * dropout Crosstabulation
                                        dropout
                                        graduated   dropped   Total
socprob   no_probs   Count              73          5         78
                     Expected Count     69.1        8.9       78.0
                     % within socprob   93.6%       6.4%      100.0%
          problems   Count              5           5         10
                     Expected Count     8.9         1.1       10.0
                     % within socprob   50.0%       50.0%     100.0%
Total                Count              78          10        88
                     Expected Count     78.0        10.0      88.0
                     % within socprob   88.6%       11.4%     100.0%

While only 6.4% of the students with no social problems dropped out, 50% of those with social problems dropped out.
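The 2 x 2 statistics discussed in this section can be checked by hand from the four cell counts in the crosstabulation. What follows is only a minimal sketch in Python, which is not one of the packages used in the original analysis; the function name summarize_2x2 and the dictionary keys are made up for illustration. It assumes the 1-df chi-square tail probability can be obtained from the complementary error function, P(χ² > x) = erfc(√(x/2)).

```python
import math

# Observed counts taken from the socprob * dropout crosstabulation.
# Rows: no_probs, problems. Columns: graduated, dropped.
observed = [[73, 5],
            [5, 5]]


def summarize_2x2(table):
    """Hypothetical helper: chi-square statistics for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    # Expected count for each cell = row total * column total / N.
    expected = [[r * k / n for k in col_totals] for r in row_totals]

    # Phi is the Pearson correlation between the two dichotomous variables.
    margins_product = (row_totals[0] * row_totals[1]
                       * col_totals[0] * col_totals[1])
    phi = (a * d - b * c) / math.sqrt(margins_product)

    pearson_chi2 = n * phi ** 2           # usual Pearson chi-square
    n_minus_1_chi2 = (n - 1) * phi ** 2   # the (N-1) chi-square

    # Upper-tail probability for a chi-square on 1 df.
    p_value = math.erfc(math.sqrt(n_minus_1_chi2 / 2))

    return {
        "n": n,
        "min_expected": min(min(row) for row in expected),
        "phi": phi,
        "pearson_chi2": pearson_chi2,
        "n_minus_1_chi2": n_minus_1_chi2,
        "p_value": p_value,
        "odds_ratio": (a * d) / (b * c),
    }


result = summarize_2x2(observed)
for name, value in result.items():
    print(f"{name}: {value:.6g}")
```

Run on the counts above, the sketch should give a phi of .435897, a Pearson χ² of 16.721, an (N-1) χ² of 16.531, an odds ratio of 14.6, and a minimum expected count of 1.14, agreeing with the SPSS and SAS output in this section.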
Notice the troublesomely low expected frequency in one cell.

Chi-Square Tests
                               Value    df   Asymptotic Significance (2-sided)
Pearson Chi-Square             16.721   1    .000
Linear-by-Linear Association   16.531   1    .000
N of Valid Cases               88
a. 1 cells (25.0%) have expected count less than 5. The minimum expected count is 1.14.

SPSS calls the (N-1) χ² the “Linear-by-Linear Association.”

Symmetric Measures
                           Value   Approximate Significance
Nominal by Nominal   Phi   .436    .000

The phi here is simply the correlation between the two dichotomous variables.

Risk Estimate
                                                        95% Confidence Interval
                                               Value    Lower    Upper
Odds Ratio for socprob (no_probs / problems)   14.600   3.144    67.791

The odds of dropping out are 14.6 times as high for students with social problems as for those without social problems.

Now, using SAS:

    PROC FREQ;
      TABLES socprob*dropout / CMH;
    run;

Cochran-Mantel-Haenszel Statistics (Based on Table Scores)
Statistic   Alternative Hypothesis    DF   Value     Prob
1           Nonzero Correlation       1    16.5306   <.0001
2           Row Mean Scores Differ    1    16.5306   <.0001
3           General Association       1    16.5306   <.0001

The CMH statistic produced by SAS is identical to the (N-1) χ².

References

Busing, F. M. T. A., Weaver, B., & Dubois, S. (2015). 2 × 2 tables: A note on Campbell’s recommendation. Statistics in Medicine. doi: 10.1002/sim.6808

Campbell, I. (2007). Chi-squared and Fisher–Irwin tests of two-by-two tables with small sample recommendations. Statistics in Medicine. doi: 10.1002/sim.2832

Equivalence of the Linear-by-Linear Chi-Square and the N-1 Chi-Square for 2×2 Tables

Karl L. Wuensch, East Carolina University, March, 2019.