The Case of the Sleepy Students



The Case of the Sleepy StudentsStephen L Cowen2017The case study:We are going to look at a drug administered to some students (hypothetical data). We want to determine if the drug was effective at increasing sleep ics covered:The practice datasets in R - there is one on Titanic survivors and many more.Tests for normality and plots that help assess normality.How to make error bar plots using different CIs (the most common plot ever)3 sexy plots that are specialized for paired data.Interpreting t-tests and the output (p, t, CIs).Effect sizes for within-subject t-testsHow to write up the results.Required Reading:Navarro's chapter on Chi-square. (read it 2x)**Let's GO:**First, call some required libraries...library('knitr') # Kable for pretty tables. Not necessary for statistics or plotting.library('tidyverse') ## Loading tidyverse: ggplot2## Loading tidyverse: tibble## Loading tidyverse: tidyr## Loading tidyverse: readr## Loading tidyverse: purrr## Loading tidyverse: dplyr## Conflicts with tidy packages ----------------------------------------------## filter(): dplyr, stats## lag(): dplyr, statsrm(list = ls()) # this clears variables in case any are already loaded. Let's look for and load the dataset...data() # Shows you all of the sample databases availble for exploration...?sleep # This gets you information on the database.The Sleep Drug DatasetThe following comes from ?sleepData which show the effect of two soporific drugs (increase in hours of sleep compared to control) on 10 patients. A data frame with 20 observations on 3 variables.[, 1] extra numeric increase in hours of sleep [, 2] group factor drug given [, 3] ID factor patient IDNOTE: This is a within-subject design.This typically means multiple samples per subject under different conditions. This could be considered a random effect as well when we talk about mixed models.Goal 1: clearly state the hypothesis: H1: Drugs coded as 2 will increase the mean sleep time in subjects. H0: There will be no difference between groups.Goal 2: plot the distributions. Do some tests for normality.Tidy your data: Always the first thing after loadingRename data to have more appropriate column names/levels and convert to factors.head(sleep) # kable is from knitr library - for prettier tables. ## extra group ID## 1 0.7 1 1## 2 -1.6 1 2## 3 -0.2 1 3## 4 -1.2 1 4## 5 -0.1 1 5## 6 3.4 1 6That was easy! Just use the data function to load any of the data() databases into R.names(sleep) <- c("ExtraHours", "Treatment", "SubjectID")Rename levels.sleep$Treatment = factor(sleep$Treatment,c(1,2), labels = c('Placebo', 'Sleepomogorazam'))head(sleep)## ExtraHours Treatment SubjectID## 1 0.7 Placebo 1## 2 -1.6 Placebo 2## 3 -0.2 Placebo 3## 4 -1.2 Placebo 4## 5 -0.1 Placebo 5## 6 3.4 Placebo 6I want the histograms to look good - that they have big text and I am a little tired of the gray. The following function changes things. Try playing with different themes. The following theme has no grid lines: I like this. Clean. base_size increases FONT SIZE. Small font sizes are a big pet peeve of mine.theme_set(theme_classic(base_size = 15)) Always good to plot histograms to see if things look normal and spot outliers.ggplot(sleep,aes(x = ExtraHours, color = Treatment, fill = Treatment)) + geom_histogram(binwidth = 1, alpha = .5) + geom_rug()# I addded the geom_rug() to show off. It's actually nice because it shows the# actual data on the x axis as tic marks. Fancy! It helps you figure out if your# bin width was OK.To be super sure that data is normally distributed, let's plot the qqplots.qqnorm(sleep$ExtraHours[sleep$Treatment == "Placebo"])qqline(sleep$ExtraHours[sleep$Treatment == "Placebo"], col = "red")qqnorm(sleep$ExtraHours[sleep$Treatment == "Sleepomogorazam"])qqline(sleep$ExtraHours[sleep$Treatment == "Sleepomogorazam"], col = "red")Since we love ggplot because it makes pub-quality plots, let's try this in ggplot...ggplot(sleep[sleep$Treatment == "Sleepomogorazam",], aes(sample=ExtraHours)) + stat_qq(size = 3)Sadly, this left me unsatisfied. It works, but not as nice as qqnorm + qqline.Now let's do the test for normality.shapiro.test(sleep$ExtraHours[sleep$Treatment == "Placebo"])## ## Shapiro-Wilk normality test## ## data: sleep$ExtraHours[sleep$Treatment == "Placebo"]## W = 0.92581, p-value = 0.4079shapiro.test(sleep$ExtraHours[sleep$Treatment == "Sleepomogorazam"])## ## Shapiro-Wilk normality test## ## data: sleep$ExtraHours[sleep$Treatment == "Sleepomogorazam"]## W = 0.9193, p-value = 0.3511We are good! We don't have much data but it's always good to plot a qqplot.Awesome plots for paired tests!Before showing you how to plot paired data the 'right' way, let's plot the old-school way with SEM confidence intervals...library(Hmisc) # This has the stat_summary functions for calculating errors: mean_se and mean_boot_ciggplot(sleep, aes(x = Treatment, y = ExtraHours)) + stat_summary(fun.data=mean_se, geom = "errorbar",size = 2,width=0.2) + ggtitle("Effect of Drug on Sleep") or for bars with errorbars... (the most common plot on the planet)ggplot(sleep,aes(x = Treatment, y = ExtraHours, fill = Treatment)) + stat_summary(fun.y = mean, geom = "bar") + stat_summary(fun.data=mean_se, size = 1, geom = "errorbar",width=0.2) + ggtitle("Effect of Drug on Sleep") + theme(legend.position="none") # the theme() at the end got rid of the useless legend. Recall that SEM indicates 1sd around the sampling distribution (68% of random samples from the 'true' population should fall within this range.). If you wanted the 95% confidence interval you would use the mean_cl_normal function...ggplot(sleep,aes(x = Treatment, y = ExtraHours, fill = Treatment)) + stat_summary(fun.y = mean, geom = "bar") + stat_summary(fun.data=mean_cl_normal, size = 1, geom = "errorbar",width=0.2) + ggtitle("Effect of Drug on Sleep. 95% SEM") + theme(legend.position="none") Or, take the SEM and multiply by 1.96. Do you remember why?You should see that the plot above and below are about the same.ggplot(sleep,aes(x = Treatment, y = ExtraHours, fill = Treatment)) + stat_summary(fun.y = mean, geom = "bar") + stat_summary(fun.data=mean_se, fun.args = list(mult = 1.96), size = 1, geom = "errorbar",width=0.2) + ggtitle("Effect of Drug on Sleep. 95% SEM") + theme(legend.position="none") + geom_jitter(width = .3, size = 4)# I added the jitter at the end so that we can see all of the data - typically a great idea.# Finally, if you want bootstrap_cis (95%), use the code below. BEFORE running the code, do you know how bootstrapping works? It's a form of resampling where you pick random samples from your sample (with replacement), compute the mean from each sample, and then repeat 1000s of times. The spread of this distribtion (the 2.5 and 97.5 quantiles) is the boot confidence interval. NOTE: you do not need to divide by sqrt(n) as you do with SEM because you are actually creating a sampling distribution based on the sample size (because you are only picking samples of size n on each of the 1000s of samples.)Bootstrap CIs are nice as it is not as dependent on the assumption of normality.ggplot(sleep,aes(x = Treatment, y = ExtraHours, fill = Treatment)) + stat_summary(fun.y = mean, geom = "bar") + stat_summary(fun.data=mean_cl_boot, size = 1, geom = "errorbar",width=0.2) + ggtitle("Effect of Drug on Sleep. Bootstrapped 95% CI") + theme(legend.position="none") The hard truth: I think that researchers like SEM (1 sd around the mean) instead of the 95% CI because it makes their effects look bigger.Note: the difference does not look that impressive with 95% CIs. This is partly because the CIs are not really relevant for within subject data when the data are plotted as independent bars because the data are actually dependent.Paired Line PlotFor those of you familiar with paired tests (same as within subject or 'random effects'), you might have seen the paired line plots that connect conditions between subjects. This is a GREAT way to view paired results. If the lines shift up or down on average you have a result (probably)! This is more powerful than looking at error bars as error bars do not show within subject differences. Really, error bars are typically not the best way to show paired results, but sometimes I use error bars if I am making tons of comparisons and don't have room for the following plots on the figure.ggplot(sleep,aes(x = Treatment, y = ExtraHours)) + geom_line(aes(group = SubjectID), size = 2) + geom_point(size = 6)Paired Scatter Plot with DiagonalAnother completely valid way to plot paired results is to do a scatter plot. How? Treat Condition 1 as the x axis and Condition 2 as the y axis and plot each subject as a point. If there is an effect, points should cluster above or below the diagonal.To do this, we need to spread the dataframe out to WIDE format. You will need to know spread and gather from tidyverse. They can be confusing (described in the O'Reilly books) but suuuuper useful.sleep_wide = spread(sleep,Treatment,ExtraHours) # Create columns for Treatement that Contains ExtraHourskable(sleep_wide)SubjectIDPlaceboSleepomogorazam10.71.92-1.60.83-0.21.14-1.20.15-0.1-0.163.44.473.75.580.81.690.04.6102.03.4Honestly this is a MUCH more intuitive way to represent paired data, but ggplot and some stats tests do not work when data is in wide format. Some do though (like t.test).As an exercise, let's convert sleep_wide back to long - just to show you the gather() function.long_dataframe = gather(sleep_wide,ExtraHours, ExtraHours, -SubjectID) # You need the -SubjectID so that it does NOT gather this column as we want to keep this column.Here is the PLOT, finallylims <- c(min(sleep$ExtraHours, na.rm = T), max(sleep$ExtraHours, na.rm = T))ggplot(sleep_wide,aes(x = Placebo, y = Sleepomogorazam)) + geom_point(size = 4) + geom_abline() + theme(aspect.ratio = 1) + scale_x_continuous("Placebo", limits = lims) + scale_y_continuous("Sleepomogorazam", limits = lims) lims was required to ensure the x and y axis had the same length. abline() made the diagonal line. theme() ensured that the box was square.Look: All the points are above the diagonal. More evidence that our result is significant.Histogram of differences.qplot(sleep_wide$Sleepomogorazam - sleep_wide$Placebo, geom = "histogram", binwidth = 1, fill = "purple") + xlab("difference") + theme(legend.position="none") + geom_vline(xintercept=0)The last part gets rid of the legend and plots a vertical line at zero.Hypothesis testing:Congrats. You have learned every major way of plotting data for a 2-column paired test. Your choice dependes on your preference and the data. Now that we have visualized the data, let's do the tests to determine if the drug works.We learned that the data was normal enough so we'll use a paired t-test.t = t.test(sleep_wide$Placebo ,sleep_wide$Sleepomogorazam, paired = TRUE )We could have done the same with the original data frame...t = t.test(sleep$ExtraHours[sleep$Treatment == "Placebo"] ,sleep$ExtraHours[sleep$Treatment == "Sleepomogorazam"], paired = TRUE )POWER TIP: A paired test is the same as a one-sample test on the diffrences.t = t.test(sleep_wide$Placebo-sleep_wide$Sleepomogorazam,paired = FALSE)Q: LOOK at the p values and t vales in all of these versions. Are they the same?Q: Do the same t.test above but set paired to FALSE. Are they different? Why?Q: Do you understand why the one-sample test on differences works the same as the 2-sample paired test?Now let's unpack the information in t.pvalue = t$p.value # that's the p value. nuff said. Default is 2 sided.t$statistic # This is the t statistic. What does this mean? This is the test statistic. let's plot it on a distribution.## t ## -4.062128Q: What is the null distribution here?Recall: the t value is a TEST STATISTIC to test against your NULL SAMPLING distribution of t values. You can get a null sampling distribution from math (as Gossett from Guinness Brewery did back in the day - he found the equation for the t distriubution for any given df) OR from resampling (e.g., permutation testing).Q: What is the equation for calculating t?In words it is the difference in the means (or the mean and some fixed value (like zero)) divided by the standard error of the mean. For a one-sample t-test and paired t-test it's (sample_mean - 0)/(sd(sample)/sqrt(n)).Don't trust Cowen. Trust but verify... Use the equation to make your own calculation of t.diffs = sleep_wide$Sleepomogorazam - sleep_wide$Placebomy_t = mean(diffs)/(sd(diffs)/sqrt(length(sleep_wide$Sleepomogorazam)))Compare the above my_t to t$statistic. Why are the signs different?Let's calculate the df ourselves...deg_freedom = length(sleep_wide$Placebo)-1 # could have also gotten this from the output of the t.test.Plot the distribution.x = seq(-5,5,.01);qplot(x = x, y = dt(x,deg_freedom)) + geom_vline(xintercept=t$statistic)How much area is to the left of the line?pt(t$statistic,deg_freedom)## t ## 0.001416445pvalue## [1] 0.00283289Q: Tell the tail of why might the output of pt be smaller that pvalue?pt(t$statistic,deg_freedom)*2## t ## 0.0028328995% Confidence intervals.t$conf.int## [1] -2.4598858 -0.7001142## attr(,"conf.level")## [1] 0.95t$statistic## t ## -4.062128Q: What does this confidence interval tell you? Is it prospective or retrospective or Bayesian?Effect size.Let's compute the effect size using Cohen's d in 2 ways. Which one is correct and why?Cohen's d version 1:Cd1 = (mean(sleep_wide$Sleepomogorazam) - mean(sleep_wide$Placebo))/mean(c(sd(sleep_wide$Sleepomogorazam),sd(sleep_wide$Placebo)))diffs = sleep_wide$Sleepomogorazam - sleep_wide$PlaceboCd2 = mean(diffs)/sd(diffs)Q: Which one is better? Cd1 or Cd2Of course Cohen's d can also be calculated in R. The effsize library has a popular cohen.d function. NOTE: the effsize library also has Cliff's Delta which is an effect size measure for non-parametric distributions (like ranked data.).library("effsize")Cd1a = cohen.d(sleep_wide$Sleepomogorazam, sleep_wide$Placebo)Cd1b = cohen.d(sleep_wide$Sleepomogorazam, sleep_wide$Placebo,paired = TRUE)Check these with the ones we calculated by hand above. Close enough?Note: Cohen's d over estimates effect sizes for small sample sizes (say < 30). It's still used, but there is a correction: see AND WRITEUP:Average sleep time increased by 1.58 hours in patients receiving Sleepomogorazam relative to Placebo (Paired t-test, t(4.06), p = 0.003, d = 1.28). This is a strong indication that this drug is effective and merits further study.Assignment: (don't worry, it's simple.)Find a new dataset from the sets listed in data() or use your own data and repeat all of the above steps with this data. Save this as an R file and call it HW7.R and store it in your Box folder. ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download