University of California, Santa Cruz



Examples of R Commands for Some Distribution-free Statistical MethodsSign Test (one-group design)library(BSDA)score = c(385, 497, 376, 405, 520, 790, 480, 530, 345, 371, 468, 501, 586, 583, 460)SIGN.test(score, md = 570, alternative = "two.sided", conf.level = 0.95) One-sample Sign-Testdata: scores = 3, p-value = 0.03516alternative hypothesis: true median is not equal to 57095 percent confidence interval: 388.5634 528.2183sample estimates:median of x 480 Conf.Level L.E.pt U.E.ptLower Achieved CI 0.8815 405.0000 520.0000Interpolated CI 0.9500 388.5634 528.2183Upper Achieved CI 0.9648 385.0000 530.0000Mann-Whitney Test (two-group design)group1 = c(32, 39, 26, 35, 43, 27, 40, 37, 34, 29)group2 = c(36, 44, 47, 42, 49, 39, 46, 31, 33, 48)wilcox.test(group1, group2) Wilcoxon rank sum test with continuity correctiondata: group1 and group2W = 20.5, p-value = 0.02831alternative hypothesis: true location shift is not equal to 0Warning message:In wilcox.test.default(group1, group2) : cannot compute exact p-value with tiesKruskal-Wallis Test (single-factor between-subjects design)score = c(12, 10, 11, 14, 15, 9, 11, 12, 13, 10, 15, 10, 12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18, 14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)group = factor(c(1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3))kruskal.test(score ~ group) Kruskal-Wallis rank sum testdata: scores by groupKruskal-Wallis chi-squared = 20.051, df = 2, p-value = 4.426e-05Jonckheere Test (ordered single-factor between-subjects design)library(clinfun)score = c(12, 10, 11, 14, 15, 9, 11, 12, 13, 10, 15, 10, 12, 13, 17, 11, 16, 13, 12, 14, 17, 12, 16, 18, 14, 21, 17, 16, 17, 22, 16, 22, 19, 20, 18, 16)group = c(1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3,3)jonckheere.test(score, group, alternative = "increasing") Jonckheere-Terpstra testdata: JT = 376, p-value = 1.725e-06alternative hypothesis: increasingWilcoxon Sign-rank Test (paired-samples design)y1 = c(21, 4, 9, 12, 35, 18, 10, 22, 24, 1, 6, 8, 13, 16, 19)y2 = c(67, 28, 30, 28, 52, 40, 25, 37, 44, 10, 14, 20, 28, 40, 51)wilcox.test(y1, y2, paired = T)data: y1 and y2V = 0, p-value = 0.0007175alternative hypothesis: true location shift is not equal to 0Warning message:In wilcox.test.default(y1, y2, paired = T) : cannot compute exact p-value with tiesFriedman Test (single-factor within-subjects design)score = c(25, 22, 15, 20, 18, 17, 39, 34, 30, 29, 25, 22, 18, 16, 14, 22, 21, 19, 25, 22, 19, 30, 27, 26)time = factor(c(1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3,1,2,3))subject = factor(c(1,1,1,2,2,2,3,3,3,4,4,4,5,5,5,6,6,6,7,7,7,8,8,8))friedman.test(score ~ time|subject) Friedman rank sum testdata: scores and time and subjectFriedman chi-squared = 16, df = 2, p-value = 0.0003355Page Test (ordered single-factor within-subjects design)library(crank)data = matrix(c(.797,.873,.888,.923,.942,.956, .794,.772,.908,.982,.946,.913, .838,.801,.853,.951,.883,.837, .815,.801,.747,.859,.887,.902), nrow = 4, byrow = T)# data for four participants and a 6-level ordered within-subjects factor page.trend.test(data, ranks = F)Page test for ordered alternativesL = 342 p(table) <=.001Spearman Correlation Testy = c(21, 4, 9, 12, 35, 18, 10, 22, 24, 1, 6, 8, 13, 16, 19)x = c(67, 28, 30, 28, 52, 40, 25, 37, 44, 10, 14, 20, 28, 40, 51)cor.test(x,y, alternative = "two.sided", method = "spearman") Spearman's rank correlation rhodata: x and yS = 72.82, p-value = 2.492e-05alternative hypothesis: true rho is not equal to 0sample estimates: rho 0.8699639Theil-Sen Linear Regression (one predictor variable)library(mblm)y = c(21, 4, 9, 12, 35, 18, 10, 22, 24, 1, 6, 8, 13, 16, 19)x = c(67, 28, 30, 28, 52, 40, 25, 37, 44, 10, 14, 20, 28, 40, 51)out = mblm(y ~ x)summary(out)confint.mblm(out)Residuals: Min 1Q Median 3Q Max -10.5 -3.0 0.0 1.0 11.0 Coefficients: Estimate MAD V value Pr(>|V|) (Intercept) -2.0000 3.4214 17 0.027969 * x 0.5000 0.1483 120 0.000725 ***---Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1Residual standard error: 5.463 on 13 degrees of freedom> confint.mblm(out) 0.025 0.975(Intercept) -7.9953254 -0.6134208x 0.4250361 0.6630420[Output includes several waring messages about inability to compute exact results with tied values]R User-defined Functions for Some Distribution-free Statistical MethodsPart I. Confidence IntervalsConfidence Interval for Median (one-group design)CImedian1 <- function(alpha, y) {# Computes confidence interval for a median# Arguments:# alpha: alpha level for 1-alpha confidence# y: vector of sample data# Returns:# estimated median, standard error, and confidence intervaln <- length(y)y <- sort(y)z <- qnorm(1 - alpha/2)c1 <- round((n - z*sqrt(n))/2)if (c1 < 1) {c1 = 1}if (round(n/2) == n/2) {median = (y[n/2] + y[n/2 + 1])/2}else {median = y[floor(n/2) + 1]}L <- y[c1]U <- y[n - c1 + 1]a <- round((n + 1)/2 - sqrt(n))if (a < 1) {a = 1}L1 <- y[a]U1 <- y[n - a + 1]p <- pbinom(a - 1, size = n, prob = .5)z0 <- qnorm(1 - p)conf <- 1 - 2*pse <- (U1 - L1)/(2*z0)out <- t(c(median, se, L, U, conf))colnames(out) <- c("Median", "SE", "LL", "UL", "Conf Level")return(out)}Example:score = c(385, 497, 376, 405, 520, 790, 480, 530, 345, 371, 468, 501, 586, 583, 460)CImedian1(.05, score) Median SE LL UL Conf Level[1,] 480 34.4164 385 530 0.9648438Confidence Interval for Median Difference (two-group design)CImedian2 <- function(alpha, y1, y2) {# Computes confidence interval for a median difference # in a two-group design# Arguments:# alpha: alpha level for 1-alpha confidence# y1: vector of sample data for group 1# y2: vector of sample data for group 2 # Returns:# estimated median difference, standard error, and confidence intervalz <- qnorm(1 - alpha/2)n1 <- length(y1)y1 <- sort(y1)n2 <- length(y2)y2 <- sort(y2)if (round(n1/2) == n1/2) {median1 = (y1[n1/2] + y1[n1/2 + 1])/2}else {median1 = y1[floor(n1/2) + 1]}if (round(n2/2) == n2/2) {median2 = (y2[n2/2] + y2[n2/2 + 1])/2}else {median2 = y2[floor(n2/2) + 1]} a1 <- round((n1 + 1)/2 - sqrt(n1))if (a1 < 1) {a1 = 1}L1 <- y1[a1]U1 <- y1[n1 - a1 + 1]p = pbinom(a1 - 1, size = n1, prob = .5)z0 = qnorm(1 - p)se1 <- (U1 - L1)/(2*z0)a2 <- round((n2 + 1)/2 - sqrt(n2))if (a2 < 1) {a2 = 1}L2 <- y2[a2]U2 <- y2[n2 - a2 + 1]p = pbinom(a2 - 1, size = n2, prob = .5)z0 = qnorm(1 - p)se2 <- (U2 - L2)/(2*z0)diff = median1 - median2se = sqrt(se1^2 + se2^2)L = diff - z*seU = diff + z*seout <- t(c(diff, se, L, U))colnames(out) <- c("Median1-Median2", "SE", "LL", "UL")return(out)}Example:group1 = c(36, 44, 47, 42, 49, 39, 46, 31, 33, 48)group2 = c(32, 39, 26, 35, 43, 27, 40, 37, 34, 29)CImedian2(.05, group1, group2) Median1-Median2 SE LL UL[1,] 8.5 4.316291 0.04022524 16.95977Confidence Interval for Median Ratio (two-group design)CImedianRatio <- function(alpha, y1, y2) {# Computes confidence interval for a median ratio # in a two-group design# Arguments:# alpha: alpha level for 1-alpha confidence# y1: vector of sample data for group 1# y2: vector of sample data for group 2 # Note: all scores must be > 0# Returns:# estimated median ratio and confidence intervalz <- qnorm(1 - alpha/2)y1 <- log(y1)y2 <- log(y2)n1 <- length(y1)y1 <- sort(y1)n2 <- length(y2)y2 <- sort(y2)if (round(n1/2) == n1/2) {median1 = (y1[n1/2] + y1[n1/2 + 1])/2}else {median1 = y1[floor(n1/2) + 1]}if (round(n2/2) == n2/2) {median2 = (y2[n2/2] + y2[n2/2 + 1])/2}else {median2 = y2[floor(n2/2) + 1]} a1 <- round((n1 + 1)/2 - sqrt(n1))if (a1 < 1) {a1 = 1}L1 <- y1[a1]U1 <- y1[n1 - a1 + 1]p = pbinom(a1 - 1, size = n1, prob = .5)z0 = qnorm(1 - p)se1 <- (U1 - L1)/(2*z0)a2 <- round((n2 + 1)/2 - sqrt(n2))if (a2 < 1) {a2 = 1}L2 <- y2[a2]U2 <- y2[n2 - a2 + 1]p = pbinom(a2 - 1, size = n2, prob = .5)z0 = qnorm(1 - p)se2 <- (U2 - L2)/(2*z0)dif = median1 - median2se = sqrt(se1^2 + se2^2)L = exp(dif - z*se)U = exp(dif + z*se)out <- t(c(exp(dif), L, U))colnames(out) <- c("Median1/Median2", "LL", "UL")return(out)}Example:group1 = c(36, 44, 47, 42, 49, 39, 46, 31, 33, 48)group2 = c(32, 39, 26, 35, 43, 27, 40, 37, 34, 29)CImedianRatio(.05, group1, group2) Median1/Median2 LL UL[1,] 1.246171 0.9887032 1.570685Confidence Interval for Linear Contrast of Medians (multiple-group design)CImedianCon <- function(alpha, m, se, c) { # Computes confidence interval and test statistic for a linear contrast of means # Arguments: # alpha: alpha level for 1-alpha confidence # m: vector of sample medians # se: vector of standard errors # c: vector of contrast coefficients # Returns: # estimate of contrast, SE, z-value, p-value, and confidence interval # Note: # compute standard error for each median using the CImedian1 function z <- qnorm(1 - alpha/2) est <- t(c)%*%m k <- length(m) v <- diag(se^2) se <- sqrt(t(c)%*%v%*%c) t <- est/se p <- 2*(1 - pnorm(abs(t))) ll <- est - z*se ul <- est + z*se out <- t(c(est, se, t, p, ll, ul)) colnames(out) <- c("Estimate", "SE", "t", "p-value", "LL", "UL") return(out)}Example:m = c(33.5, 37.9, 38.0, 44.1)se = c(1.84, 1.84, 1.65, 1.98)c = c(.5, .5, -.5, -.5)CImedianCon(.05, m, se, c) Estimate SE t p-value LL UL[1,] -5.35 1.831263 -2.921481 0.00348372 -8.93921 -1.76079Confidence Interval for Spearman CorrelationCISpearman <- function(alpha, y, x) {# Computes confidence interval for Spearman correlation (3/15/2016)# Args:# alpha: alpha level for 1-alpha confidence# y: vector of y scores # x: vector of x scores # Returns:# estimated Spearman correlation, standard error, and confidence interval z <- qnorm(1 - alpha/2)n <- length(y)yr <- rank(y)xr <- rank(x)cor <- cor(yr, xr)se <- sqrt((1 + cor^2/2)*(1 - cor^2)^2/(n - 3))z.cor <- log((1 + cor)/(1 - cor))/2ll0 <- z.cor - z*se/(1 - cor^2)ul0 <- z.cor + z*se/(1 - cor^2)ll <- (exp(2*ll0) - 1)/(exp(2*ll0) + 1)ul <- (exp(2*ul0) - 1)/(exp(2*ul0) + 1)out <- cbind(cor, se, ll, ul)colnames(out) <- c("Estimate", "SE", "LL", "UL")return (out)}Example:y = c(21, 4, 9, 12, 35, 18, 10, 22, 24, 1, 6, 8, 13, 16, 19)x = c(67, 28, 30, 28, 52, 40, 25, 37, 44, 10, 14, 20, 28, 40, 51)CISpearman(.05, y, x) Estimate SE LL UL[1,] 0.8699639 0.08241326 0.5840951 0.9638297Confidence Interval for Difference and Average of Two Spearman Correlations (two-group design)CISpearman2 <- function(alpha, cor1, cor2, n1, n2) {# Computes confidence interval for difference and average# of Spearman correlations estimated from two samples# Args:# alpha: alpha level for 1-alpha confidence# cor1: Spearman correlation estimate from group 1 # cor2: Spearman correlation estimate from group 2# n1: sample size in group 1 # n2: sample size in group 2 # Returns:# estimated average and difference of Spearman correlation# and confidence intervals z <- qnorm(1 - alpha/2)se1 <- sqrt((1 + cor1^2/2)*(1 - cor1^2)^2/(n1 - 3))se2 <- sqrt((1 + cor2^2/2)*(1 - cor2^2)^2/(n2 - 3))z.cor1 <- log((1 + cor1)/(1 - cor1))/2z.cor2 <- log((1 + cor2)/(1 - cor2))/2ll01 <- z.cor1 - z*se1/(1 - cor1^2)ul01 <- z.cor1 + z*se1/(1 - cor1^2)LL1 <- (exp(2*ll01) - 1)/(exp(2*ll01) + 1)UL1 <- (exp(2*ul01) - 1)/(exp(2*ul01) + 1)ll02 <- z.cor2 - z*se2/(1 - cor2^2)ul02 <- z.cor2 + z*se2/(1 - cor2^2)LL2 <- (exp(2*ll02) - 1)/(exp(2*ll02) + 1)UL2 <- (exp(2*ul02) - 1)/(exp(2*ul02) + 1)cor.dif <- cor1 - cor2cor.ave <- (cor1 + cor2)/2LL.dif <- cor.dif - sqrt((cor1 - LL1)^2 + (UL2 - cor2)^2)UL.dif <- cor.dif + sqrt((UL1 - cor1)^2 + (cor2 - LL2)^2)se.ave <- sqrt(se1^2 + se2^2)/4z.ave <- log((1 + cor.ave)/(1 - cor.ave))/2ll03 <- z.ave - z*se.ave/(1 - cor.ave^2)ul03 <- z.ave + z*se.ave/(1 - cor.ave^2)LL.ave <- (exp(2*ll03) - 1)/(exp(2*ll03) + 1)UL.ave <- (exp(2*ul03) - 1)/(exp(2*ul03) + 1)out1 <- cbind(cor.dif, LL.dif, UL.dif)out2 <- cbind(cor.ave, LL.ave, UL.ave)out <- rbind(out1, out2)colnames(out) <- c("Estimate", "LL", "UL")rownames(out) <- c("Difference:", "Average:")return (out)}Example:CISpearman2(.05, .54, .48, 180, 200) Estimate LL ULDifference: 0.06 -0.1003977 0.2185085Average: 0.51 0.4691059 0.5487118Confidence Interval for Mann-Whitney Parameter (two-group design)CIMann <- function(alpha, y1, y2){ # Compute confidence interval for probability of # scoring higher under treatment 1 than treatment 2 # Arguments: # alpha: alpha level for 1-alpha confidence # y1: vector of scores for group 1 # y2: vector of scores for group 2 # Returns: # confidence interval z <- qnorm(1 - alpha/2) y <- c(y1,y2) n1 <- length(y1) n2 <- length(y2) n <- n1 + n2 r <- rank(y) r1 <- r[1:n1] r2 <- r[(n1 + 1):n] m1 <- mean(r1) m2 <- mean(r2) seq1 <- seq(1,n1,1) seq2 <- seq(1,n2,1) a1 <- sum((r1 - seq1)^2) a2 <- sum((r2 - seq2)^2) v1 <- (a1 - n1*(m1 - (n1 + 1)/2)^2)/((n1 - 1)*n^2) v2 <- (a2 - n2*(m2 - (n2 + 1)/2)^2)/((n2 - 1)*n^2) U <- sum(r2) - n2*(n2 + 1)/2 est <- U/(n1*n2) se <- sqrt((n2*v1 + n1*v2)/(n1*n2)) LL <- 1 - est - z*se UL <- 1 - est + z*se if (UL > 1) {UL = 1} if (LL < 0) {LL = 0} out <- c(LL, UL) return(out)}Example:group1 = c(36, 44, 47, 42, 49, 39, 46, 31, 33, 48)group2 = c(32, 39, 26, 35, 43, 27, 40, 37, 34, 29)CIMann(.05, group1, group2)[1] 0.5202456 1.0000000Confidence Interval for Proportion (one-group or paired-samples design)CIprop1 <- function(alpha, f, n) { # Computes Agresti-Coull confidence interval for a population proportion # Arguments: # alpha: alpha level for 1-alpha confidence # f: number of participants with score greater than # median (one-group design) or y1 score greater than # y2 score (paired-samples design) # n: sample size # Returns: # confidence interval z <- qnorm(1 - alpha/2) p <- (f + 2)/(n + 4) se <- sqrt(p*(1 - p)/(n + 4)) LL <- p - z*se UL <- p + z*se if (UL > 1) { UL = 1 } if (LL < 0) { LL = 0 } CI <- c(LL, UL) return(CI)}Example:CIprop1(.05, 88, 100)[1] 0.7997877 0.9309815Confidence Interval for Mean Absolute Deviation (one-group design)CImad1 <- function(alpha, y) {# Computes ADF confidence interval for a MAD# Arguments:# alpha: alpha level for 1-alpha confidence# y: vector of sample data# Returns:# estimated MAD and confidence intervaln <- length(y)z <- qnorm(1 - alpha/2)c <- n/(n - 1)y <- sort(y)if (round(n/2) == n/2) {median = (y[n/2] + y[n/2 + 1])/2}else {median = y[floor(n/2) + 1]}mad <- mean(abs(y - median))skw <- (mean(y) - median)/mad kur <- (sd(y)/mad)^2se <- sqrt((skw^2 + kur - 1)/n)L <- exp(log(c*mad) - z*se)U <- exp(log(c*mad) + z*se)out <- t(c(mad, L, U))colnames(out) <- c("MAD", "LL", "UL")return(out)}Example:score = c(30, 20, 15, 10, 10, 60, 20, 25, 20, 30, 10, 5, 50, 40, 20, 10, 0, 20, 50)CImad1(.05, score) MAD LL UL[1,] 11.84211 7.962667 19.62282Confidence Interval for Ratio of Mean Absolute Deviations (two-group design)CImadRatio <- function(alpha, y1, y2) {# Computes ADF confidence interval for a ratio of MADs from # a two-group design# Arguments:# alpha: alpha level for 1-alpha confidence# y1: vector of sample data for group 1# y2: vector of sample data for group 2# Returns:# estimated MAD1/MAD2 and confidence intervalz <- qnorm(1 - alpha/2)n1 <- length(y1)c1 <- n1/(n1 - 1)y1 <- sort(y1)n2 <- length(y2)c2 <- n2/(n2 - 1)y2 <- sort(y2)if (round(n1/2) == n1/2) {median1 = (y1[n1/2] + y1[n1/2 + 1])/2}else {median1 = y1[floor(n1/2) + 1]}mad1 <- mean(abs(y1 - median1))skw1 <- (mean(y1) - median1)/mad1 kur1 <- (sd(y1)/mad1)^2var1 <- (skw1^2 + kur1 - 1)/n1if (round(n2/2) == n2/2) {median2 = (y2[n2/2] + y2[n2/2 + 1])/2}else {median2 = y2[floor(n2/2) + 1]}mad2 <- mean(abs(y2 - median2))skw2 <- (mean(y2) - median2)/mad2 kur2 <- (sd(y2)/mad2)^2var2 <- (skw2^2 + kur2 - 1)/n2se <- sqrt(var1 + var2)c <- c1/c2est <- mad1/mad2L <- exp(log(c*est) - z*se)U <- exp(log(c*est) + z*se)out <- t(c(mad1/mad2, L, U))colnames(out) <- c("MAD1/MAD2", "LL", "UL")return(out)}Example:group1 = c(32, 39, 26, 35, 43, 27, 40, 37, 34, 29)group2 = c(36, 44, 47, 42, 49, 39, 46, 31, 33, 48)CImadRatio(.05, group1, group2) MAD1/MAD2 LL UL[1,] 0.8679245 0.4520879 1.666253Confidence Interval for Ratio of Mean Absolute Deviations (paired-samples design)CImadRatioWS <- function(alpha, y1, y2) {# Computes ADF confidence interval for a ratio of MADs# from a paired-samples (within-subjects) design# Arguments:# alpha: alpha level for 1-alpha confidence# y1: vector of measurement 1 scores# y2: vector of measurement 2 scores # Returns:# estimated MAD1/MAD2 and confidence intervalz <- qnorm(1 - alpha/2)n1 <- length(y1); n2 <- length(y2)ys1 <- sort(y1); ys2 <- sort(y2)if (round(n1/2) == n1/2) {median1 = (ys1[n1/2] + ys1[n1/2 + 1])/2}else {median1 = ys1[floor(n1/2) + 1]}mad1 <- mean(abs(y1 - median1))skw1 <- (mean(y1) - median1)/mad1 kur1 <- (sd(y1)/mad1)^2var1 <- (skw1^2 + kur1 - 1)/n1if (round(n2/2) == n2/2) {median2 = (ys2[n2/2] + ys2[n2/2 + 1])/2}else {median2 = ys2[floor(n2/2) + 1]}mad2 <- mean(abs(y2 - median2))skw2 <- (mean(y2) - median2)/mad2 kur2 <- (sd(y2)/mad2)^2var2 <- (skw2^2 + kur2 - 1)/n2d1 <- abs(y1 - median1); d2 <- abs(y2 - median2)cor <- cor(d1, d2)se <- sqrt(var1 + var2 - 2*cor*sqrt(var1*var2))est <- mad1/mad2L <- exp(log(est) - z*se)U <- exp(log(est) + z*se)out <- t(c(mad1/mad2, L, U))colnames(out) <- c("MAD1/MAD2", "LL", "UL")return(out)}Example:score2 = c(21, 4, 9, 12, 35, 18, 10, 22, 24, 1, 6, 8, 13, 16, 19)score1 = c(67, 28, 30, 28, 52, 40, 25, 37, 44, 10, 14, 20, 28, 40, 51)CImadRatioWS(.05, score1, score2) MAD1/MAD2 LL UL[1,] 1.695238 1.109176 2.590961 Confidence Interval for Coefficient of Dispersion (one-group design)CIcod1 <- function(alpha, y){ # Computes ADF confidence interval for a coefficient of # dispersion (MAD/median) # Arguments: # alpha: alpha level for 1-alpha confidence # y: vector of scores # Returns: # estimated COD and confidence interval z <- qnorm(1 - alpha/2) n <- length(y) c <- n/(n - 1) a1 <- round((n + 1)/2 - sqrt(n)) b1 <- n - a1 + 1 median <- median(y) m <- mean(y) v <- var(y) mad <- mean(abs(y - median)) skw <- (m - median)/mad kur <- v/mad^2 cod <- mad/median y <- sort(y) varlogmed <- ((log(y[a1]) - log(y[b1]))/4)^2 selogmed <- sqrt(varlogmed) varlogmad <- (kur + skw^2 - 1)/n selogmad <- sqrt(varlogmad) cov <- skw*selogmed/sqrt(n) k <- sqrt(varlogmed + varlogmad - 2*cov)/(selogmed + selogmad) a2 <- round((n + 1)/2 - k*z*sqrt(n/4)) b2 <- n - a2 + 1 L2 <- log(y[a2]) U2 <- log(y[b2]) L1 <- log(c*mad) - k*z*selogmad U1 <- log(c*mad) + k*z*selogmad L <- exp(L1 - U2) U <- exp(U1 - L2) out <- t(c(cod, L, U)) colnames (out) <- c("COD", "LL", "UL") return(out) }Example:score = c(30, 20, 15, 10, 10, 60, 20, 25, 20, 30, 10, 5, 50, 40, 20, 10, 0, 20, 50)CIcod1(.05, score) COD LL UL[1,] 0.5921053 0.3813259 1.092679Confidence Interval for Ratio of Coefficients of Dispersion (two-group design)CIcodRatio <- function(alpha, y1, y2){ # Computes ADF confidence interval for COD1/COD2 # Arguments: # alpha: alpha level for 1-alpha confidence # y1: vector of scores for group 1 # y2: vector of scores for group 2 # Returns: # estimated COD1/COD2 and confidence interval z <- qnorm(1 - alpha/2) n1 <- length(y1); c1 <- n1/(n1 - 1) n2 <- length(y2); c2 <- n2/(n2 - 1) a11 <- round((n1 + 1)/2 - sqrt(n1)); b11 <- n1 - a11 + 1 median1 <- median(y1); median2 <- median(y2) m1 <- mean(y1); m2 <- mean(y2) v1 <- var(y1); v2 <- var(y2) mad1 <- mean(abs(y1 - median1)); mad2 <- mean(abs(y2 – median2)) skw1 <- (m1 - median1)/mad1; skw2 <- (m2 - median2)/mad2 kur1 <- v1/mad1^2; kur2 <- v2/mad2^2 cod1 <- mad1/median1; cod2 <- mad2/median2 y1 <- sort(y1); y2 <- sort(y2) selogmed1 <- sqrt(((log(y1[a11]) - log(y1[b11]))/4)^2) selogmad1 <- sqrt((kur1 + skw1^2 - 1)/n1) cov1 <- skw1*selogmed1/sqrt(n1) k1 <- sqrt(selogmed1^2 + selogmad1^2 - 2*cov1)/(selogmed1 + selogmad1) a21 <- round((n1 + 1)/2 - k1*z*sqrt(n1/4)) b21 <- n1 - a21 + 1 L2 <- log(y1[a21]); U2 <- log(y1[b21]) L1 <- log(c1*mad1) - k1*z*selogmad1 U1 <- log(c1*mad1) + k1*z*selogmad1 L1 <- L1 - U2; U1 <- U1 - L2 a12 <- round((n2 + 1)/2 - sqrt(n2)); b12 <- n2 - a12 + 1 selogmed2 <- sqrt(((log(y2[a12]) - log(y2[b12]))/4)^2) selogmad2 <- sqrt((kur2 + skw2^2 - 1)/n2) cov2 <- skw2*selogmed2/sqrt(n2) k2 <- sqrt(selogmed2^2 + selogmad2^2 - 2*cov2)/(selogmed2 + selogmad2) a22 <- round((n2 + 1)/2 - k2*z*sqrt(n2/4)) b22 <- n2 - a22 + 1 L2 <- log(y2[a22]); U2 <- log(y2[b22]) L1 <- log(c2*mad2) - k2*z*selogmad2 U1 <- log(c2*mad2) + k2*z*selogmad2 L2 <- L1 - U2; U2 <- U1 - L2 L <- exp(log(cod1/cod2) - sqrt((log(cod1) - L1)^2 + (U2 - log(cod2))^2)) U <- exp(log(cod1/cod2) + sqrt((U1 - log(cod1))^2 + (log(cod2) - L2)^2)) out <- t(c(cod1/cod2, L, U)) colnames (out) <- c("COD", "LL", "UL") return(out) }Example:score1 = c(30, 20, 15, 10, 10, 60, 20, 25, 20, 30, 10, 5, 50, 40, 20, 10, 0, 20, 50)score2 = c(3, 2, 2, 1, 1, 6, 2, 3, 2, 3, 1, 1, 5, 4, 2, 1, 0, 2, 5)CIcodRatio(.05, score1, scoere2) COD LL UL[1,] 1.022727 0.1641802 2.815181Part 2. Sample Size Approximations for Desired PowerSign Test (one-group design)sizePOWsign1 <- function(alpha, p, pow) { # Computes sample size required for Sign test with desired power # in a one-group design # Arguments: # alpha: alpha level for test # p: planning value of proportion of cases with score # greater than hypothesized value of median # pow: desired power # Returns: # required sample size za <- qnorm(1 - alpha/2) zb <- qnorm(pow) es <- p - .5; n <- ceiling((sqrt(p*(1 - p))*za + .5*zb)^2/es^2) return(n)}Example:sizePOWsign1(.05, .3, .9)[1] 560Mann-Whitney Test (two-group design)sizePOWmann <- function(alpha, p, pow) { # Computes sample size required for Mann-Whitney test with desired power # Arguments: # alpha: alpha level for test # p: planning value of proportion of cases with scores that would be # higher under treatment 1 than treatment 2 # pow: desired power # Returns: # required sample size per group za <- qnorm(1 - alpha/2) zb <- qnorm(pow) es <- p - .5; n <- ceiling((za + zb)^2/(6*es^2)) return(n)}Example:sizePOWmann(.05, .3, .9)[1] 44Sign Test (paired-samples design)sizePOWsignWS <- function(alpha, p, pow) { # Computes sample size required for Sign test with desired power # in a paired-samples design # Arguments: # alpha: alpha level for test # p: planning value of proportion of cases with scores that # would be higher under treatment 1 than treatment 2 # pow: desired power # Returns: # required sample size za <- qnorm(1 - alpha/2) zb <- qnorm(pow) es <- p - .5; n <- ceiling(p*(1 - p)*(za + zb)^2/es^2) return(n)}Example:sizePOWsignWS(.05, .75, .9)[1] 32Part 3. Sample Size Approximations for Desired PrecisionMedian (one-group design)sizeCImedian1 <- function(alpha, var, shape, w) { # Computes sample size required to estimate a median # with desired precision in a 1-group design # Arguments: # alpha: alpha level for 1-alpha confidence # var: planning value of DV variance # w: desired confidence interval width # Note: # Set shape to value between 1 and 1.57; e.g. 1 for extreme # skew, 1.3 for mild skew, and 1.57 for normal distribution # Returns: # required sample size z <- qnorm(1 - alpha/2) n <- ceiling(4*shape*var*(z/w)^2 + z^2/2) return(n)}Example:sizeCImedian1(.05, 250, 1.3, 10)[1] 52Median Difference (two-group design)sizeCImedian2 <- function(alpha, var, shape, w) { # Computes sample size required to estimate a difference in # medians with desired precision in a 2-group design # Arguments: # alpha: alpha level for 1-alpha confidence # var: average planning value of DV variance # w: desired confidence interval width # Note: # Set shape to value between 1 and 1.57; e.g. 1 for extreme # skew, 1.3 for mild skew, and 1.57 for normal distribution # Returns: # required sample size z <- qnorm(1 - alpha/2) n <- ceiling(8*shape*var*(z/w)^2 + z^2/2) return(n)}Example:sizeCImedian2(.05, 144, 1.0, 8)[1] 72Spearman CorrelationsizeCISpear <- function(alpha, cor, w) { # Computes sample size required to estimate a Spearman # correlation with desired precision # Arguments: # alpha: alpha level for 1-alpha confidence # cor: correlation planning value # w: desired confidence interval width # Returns: # confidence interval z <- qnorm(1 - alpha/2) n <- ceiling(4*(1 + cor^2/2)*(1 - cor^2)^2*(z/w)^2) return(n)} Example:sizeCISpear(.05, .5, .4)[1] 61Proportion (one-group or paired-samples design)sizeCIprop <- function(alpha, p, w) { # Computes sample size required to estimate a proportion # with desired precision # Arguments: # alpha: alpha level for test # p: planning value of proportion # w: desired confidence interval width # Returns: # required sample size z <- qnorm(1 - alpha/2) n <- ceiling(4*p*(1 - p)*(z/w)^2) return(n)}Example:sizeCIprop(.05, .75, .2)[1] 73Mean Absolute Deviation (one-group design)sizeCImad <- function(alpha, kur, skw, r) { # Computes sample size required to estimate a MAD # with desired precision # Arguments: # alpha: alpha level for test # kur: planning value of var/mad^2 # skw: planning value of (mean – median)/mad # r: desired upper to lower confidence interval # endpoint ratio # Note: # Set kur to value between 1.3 and 2.0; e.g. 1.3 for # flat distribution, 1.57 for normal, and 2.0 for long-tail # distribution # Set skw to value between 0 and .5; e.g., 0 for # symmetric, .25 for mild skew and 0.5 for extreme skew # Returns: # required sample size z <- qnorm(1 - alpha/2) n <- ceiling(4*(skw^2 + kur - 1)*(z/log(r))^2) return(n)}Example:sizeCImad(.05, 2, 0, 1.5)[1] 94Mean Absolute Deviation Ratio (two-group design)sizeCImadRatio <- function(alpha, kur, skw, r) { # Computes sample size required per group to estimate MAD1/MAD2 # with desired precision # Arguments: # alpha: alpha level for test # kur: planning value of average var/mad^2 # skw: planning value of average (mean – median)/mad # r: desired upper to lower confidence interval # endpoint ratio # Note: # Set kur to value between 1.3 and 2.0; e.g. 1.3 for # flat distribution, 1.57 for normal, and 2.0 for long-tail # distribution # Set skw to value between 0 and .5; e.g., 0 for symmetric, # 25 for mild skew and .5 for extreme skew # Returns: # required sample size z <- qnorm(1 - alpha/2) n <- ceiling(8*(skw^2 + kur - 1)*(z/log(r))^2) return(n)}Example:sizeCImad(.05, 2, 0, 1.5)[1] 188Second-stage sample size requirement sizeCIsecond <- function(n0, w0, w) { # Computes second-stage sample size required to obtain desired precision # Arguments: # n0: first-stage sample size # w0: CI width in first-stage sample # w: desired confidence interval width # Returns: # required second-stage sample size to be combined with the # first-stage sample n <- ceiling(((w0/w)^2 - 1)*n0)return(n)}Example:sizeCIsecond(20, 5.3, 2.5)[1] 70Bar Chart of Samples Medians with Confidence Interval Lines [Note: Enter the group label, sample median, lower limit, and upper limit for each group]library(ggplot2)data <- read.table(text = "group median LL ULTreatment_A 7.43 5.03 9.83Treatment_B 2.72 0.62 4.82Treatment_C 5.27 2.97 7.57", header = T)ggplot(data, aes(x = group, y = median)) + geom_bar(stat = "identity", position = position_dodge()) + geom_errorbar(aes(ymin = LL, ymax = UL), width = .25)Suggested textsGibbons, J.D. (2003). Nonparametric statistical inference, 4th ed. New York: Dekker.Hollander, M. & Wolf, D.A. (1999). Nonparametric statistical methods, 2nd ed. New York: Wiley.Siegel, S. & Castellan, N.J. (1988). Nonparametric statistics for the behavioral sciences. New York: Mcgraw-Hill.References for some of the proposed methods Bonett, D.G. & Wright, T.A. (2000) Sample size requirements for estimating Pearson, Kendall, and Spearman correlations. Psychometrika, 65, 23-28.Bonett, D.G. & Price, R.M. (2002). Statistical inference for linear functions of medians: Confidence intervals, hypothesis testing, and sample size requirements. Psychological Methods, 7, 370-383.Bonett, D.G. & Seier, E. (2003). Statistical inference for a ratio of dispersions using paired-samples. Journal of Educational and Behavioral Statistics, 28, 21-30. Bonett, D.G. & Seier, E. (2003). Confidence intervals for mean absolute deviations. The American Statistician, 57, 233-236.Bonett, D.G. & Seier, E. (2006). Confidence interval for a coefficient of dispersion in nonnormal distributions. Biometrical Journal, 48, 144-148.Price, R.M. & Bonett, D.G. (2002). Distribution-free confidence intervals for difference and ratio of medians. Journal of Statistical Computation and Simulation, 72, 119-124. ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download