Pdixon.stat.iastate.edu



Lab 4: analysis of paired data, non-parametric tests

Goals: In this lab, we will:
- explore two ways to run a paired t-test
- look at the Wilcoxon signed rank test
- see how to read an Excel workbook (.xlsx file)
- show how to run a Wilcoxon rank sum test

We will use the case0202.txt data set for the first 3 parts. This is the hippocampus volume in twins, one with and one without schizophrenia, used as case study 2.2 in chapter 2. We will use the hamburger.csv data set for the last part.

Download case0202.txt, hamburger.csv, and hamburger.xlsx from the datasets page on the class web site.

Paired t-test:

We will look at two ways to get a paired t-test from JMP. The first provides all the summary statistics and tests but does not allow you to examine the differences. The second is more involved but provides everything.

Use File / Open / Data using best guess (or Data with Preview) to read the file. If you use the default (Text import preferences), you get one column of information, not two.

To calculate summary statistics, the paired t-test, and a confidence interval for the difference, choose Analyze / Specialized Modeling / Matched Pairs from the main menu. Put both variables (unaff and aff) into the Y, Paired Response box. This box should contain two variable names, one for each variable measured in the pair. You do not need to indicate the pair; each line of data is assumed to be from one pair. The dialog should look like:

[Screenshot: Matched Pairs dialog]

Then click OK. The output looks like:

[Screenshot: Matched Pairs output]

The plot shows the difference of the scores (on the Y axis) against the average score (on the X axis). This can help diagnose problems with the analysis, but we won't cover that here. Optional note: In some fields, this plot is called a Bland-Altman plot. If you plan on doing a lot of analysis of paired data, chat with me about how to interpret these plots.

The black diamond provides an outline of possible outcomes. It is mathematically impossible for an observation to fall outside those black lines.
I find them a distraction. If you want to get rid of them, click the red triangle by Matched Pairs and uncheck Reference Frame. Then all you see is the data and three horizontal lines. These are the mean difference and the limits of the 95% confidence interval for the mean difference.

The numeric results are, in order down the first column: the mean for each response, the mean difference, the standard error of the mean difference, a confidence interval for the mean difference, the number of pairs, and the correlation between the two responses (which we're skipping for now). Down the 2nd column, you have the T statistic testing H0: mean difference = 0, the df for that T statistic, then the two-sided and two one-sided p-values.

If you want to change the coverage of the confidence interval, click the red triangle and select "Set α level".

The rest of the red triangle options are other plots or other tests that we won't discuss or won't discuss right now.

Important point for interpreting the results: Make sure you're aware of the direction of the subtraction: is it unaff - aff or aff - unaff? Interpreting a difference, e.g. which group has the larger mean response, depends on the direction of the subtraction. Two pieces of JMP output indicate the direction of the difference:
- The label of the box: these results are for aff - unaff
- The relationship between the means for each group (first two pieces of numeric output)

To change the direction of the difference, swap the order of the variables in the Y, Paired Response box of the Matched Pairs dialog.

Paired t-test (2nd approach):

The book's presentation starts by calculating the difference within each pair. When you use the first approach, JMP does that behind the scenes for you. If you want to examine the differences yourself, you need to calculate them explicitly. Here's how to calculate the differences.

Go back to the case0202 window (the one with the data) and right click on the empty column to the right of aff. Select 'New Column'.
Since this column will contain the difference, diff is an informative column name. Type in the desired column name. Then click the black triangle by Column Properties at the bottom of the window. Find Formula in the pop-up list and select it. A dialog box will appear looking like:

[Screenshot: formula dialog before entering the formula]

We will enter the formula to calculate the difference we want, using the mouse to select the appropriate pieces. We will calculate aff - unaff:
- left click on aff in the Columns box
- left click on - (the minus sign) in the bar at the top of the window
- left click on unaff in the Table Columns box

The dialog box should now look like (compare the right-hand parts of the before and after windows):

[Screenshot: formula dialog after entering the formula]

Then click OK to evaluate the formula and store the results in a column. A new column, labelled diff, will appear in the case0202 data window. If you look at the first row, you see that the value of the difference (-0.67) is the value of aff (1.27) minus the value of unaff (1.94). The same holds for the rest of the rows.

You can now use all the one-sample methods from last lab to evaluate the difference. As a reminder, these include:
- Analyze / Distribution on the diff variable to get a histogram and box plot, summary statistics, and a confidence interval
- Test Mean (option under the red triangle by diff after Analyze / Distribution) to get p-values for a one-sample test of the difference
- Confidence Interval (option under the red triangle by diff after Analyze / Distribution) to get confidence intervals with the coverage you select

Wilcoxon signed rank test (nonparametric test for paired data):

Follow the 2nd approach (calculating differences). Stop at the Test Mean dialog box. It should look like:

[Screenshot: Test Mean dialog]

Check the box labeled Wilcoxon Signed Rank, then click OK. You see the Signed Rank test results next to the t-test results. Again, JMP provides three p-values. The two-sided p-value is the row labeled Prob > |t|.

Reading Excel workbooks (.xlsx files):

Choose File / Open from the top menu.
The default is to list 'all JMP files', which includes .xlsx files. If there are too many, you can change the file option to the right of the file name box. Selecting Excel files will show only .xls, .xlsx, and .xlsm files. The buttons between the file list window and the file name window change how JMP treats row 1 of the data: does row 1 contain variable names (labels)? The default, Best Guess, should work for all class files. You can change it to Always or Never if the file doesn't read correctly. Select the desired file (you should have downloaded hamburger.xlsx) and click Open.

The next dialog box allows you to change how the workbook is read. The top right of this dialog (Worksheets) allows you to change which worksheets to read. You see a preview of what will be read, with the column names, and boxes to change where to read data from. Additional options are available by clicking Next. Since you rarely need to make any changes, click Import.

[Screenshot: Excel import dialog]

Wilcoxon rank sum test: uses hamburger.csv or hamburger.xlsx

Read in the data set. You will note that treatment is a nominal variable (categories, red bars) and cfu is a continuous variable (numbers, blue ramp) by default. If you read a data set and the treatment is a continuous variable, you need to change the modeling type for that variable (described in detail in an earlier lab).

Start the same way you would for a t-test: Analyze / Fit Y by X. Put treatment in the X, Factor box and cfu in the Y, Response box, then click OK.

Choose the analysis by clicking the red triangle and selecting Nonparametric. There are two options for the Wilcoxon test: Wilcoxon Test and Exact Wilcoxon Test. The first uses the Normal approximation to the p-value (appropriate for large samples); the second does a permutation test on ranked values (best for small samples).
To get the second, choose Wilcoxon Test, then choose Exact Test from the Nonparametric menu, then Wilcoxon Exact Test.

The Wilcoxon Test returns two p-values:
- The one labeled Chi-square approximation does not use the continuity correction. Although it is labeled a 1-way test (which will make more sense after our discussion of 1-way ANOVA), it is a two-sample test with a two-tailed p-value.
- The one labeled Normal Approximation uses the continuity correction.

If you additionally request the Wilcoxon Exact Test, the two-sided p-value is labelled Prob ≥ |S - Mean| in the 2-sample: Exact Test box.
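
Optional cross-check outside JMP: the calculations behind this lab can be sketched in a few lines of Python using only the standard library. This is a minimal sketch under stated assumptions, not a description of how JMP computes its results. The function names (midranks, paired_t, exact_signed_rank, exact_rank_sum) are hypothetical, and all data values are made up except the first pair (unaff 1.94, aff 1.27), which is quoted above. Dropping zero differences in the signed rank test is one common convention and may not match JMP's handling. The paired t function stops at the t statistic and df because the p-value needs the t distribution, which the standard library does not provide.

```python
from itertools import combinations, product
from math import sqrt
from statistics import mean, stdev


def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def paired_t(diffs):
    """Return (mean difference, standard error, t statistic, df)."""
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)
    return mean(diffs), se, mean(diffs) / se, n - 1


def exact_signed_rank(diffs):
    """Exact two-sided signed rank p-value from all 2^n sign assignments."""
    nonzero = [d for d in diffs if d != 0]  # assumption: zero differences dropped
    ranks = midranks([abs(d) for d in nonzero])
    observed = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    expected = sum(ranks) / 2  # mean of the positive-rank sum under H0
    count = 0
    for signs in product((0, 1), repeat=len(ranks)):
        s = sum(r for r, keep in zip(ranks, signs) if keep)
        if abs(s - expected) >= abs(observed - expected) - 1e-9:
            count += 1
    return observed, count / 2 ** len(ranks)


def exact_rank_sum(group1, group2):
    """Exact two-sided rank sum p-value from all relabelings of the ranks."""
    ranks = midranks(list(group1) + list(group2))
    n1 = len(group1)
    observed = sum(ranks[:n1])  # S = rank sum of the first group
    expected = n1 * (len(ranks) + 1) / 2  # mean of S under H0
    count = total = 0
    for subset in combinations(ranks, n1):
        total += 1
        if abs(sum(subset) - expected) >= abs(observed - expected) - 1e-9:
            count += 1
    return observed, count / total


if __name__ == "__main__":
    # First pair is from the handout; the remaining values are made up.
    unaff = [1.94, 1.50, 1.60, 1.80, 2.00]
    aff = [1.27, 1.60, 1.40, 1.70, 1.90]
    diffs = [round(a - u, 2) for a, u in zip(aff, unaff)]  # aff - unaff
    print("differences:", diffs)
    print("mean diff, SE, t, df:", paired_t(diffs))
    print("signed rank sum, exact p:", exact_signed_rank(diffs))
    # Made-up cfu values for two hypothetical treatment groups.
    group_a = [3.1, 4.7, 5.2, 6.0]
    group_b = [6.4, 7.3, 8.8, 9.1]
    print("rank sum S, exact p:", exact_rank_sum(group_a, group_b))
```

Full enumeration is only practical for small samples, which is also when the exact test is most useful. The small tolerance (1e-9) in the comparisons guards against floating-point rounding when sums of ranks are compared.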