Two-Sample Hypothesis Testing: t-Tests

Conceptual Background, Assumptions, Paired versus Unpaired Data, Performing t-tests in RStudio, Interpreting and Communicating Results


Prepared by Allison Horst for ESM 206AB

Bren School of Environmental Science & Management (UCSB)

Two-sample t-tests are among the most common statistical analyses performed to compare, as the name implies, two samples composed of continuous data satisfying parametric assumptions. What are you actually doing, though, when you perform a t-test to compare sample means?

Usually, you want to know if the samples are from the same population (i.e. are the means statistically equal, based on your confidence level?) or from different populations (i.e. are the means significantly different, based on your confidence level?). You perform a two-sample t-test to help you make that decision.

Conceptual Background

When you perform a two-sample t-test, you are actually testing whether the difference between the sample means is equal to zero. How does that work conceptually? Well, if two samples were exactly the same, the difference in means would be equal to zero and the distributions would be perfectly overlapping. For samples with identical means,

$$\bar{x}_1 - \bar{x}_2 = 0$$

However, what happens as the samples begin to differ? Then, the difference in means is not equal to zero, and you start to see a separation of means.

As that difference becomes greater, it becomes less and less likely that the two samples were taken from the same population. At a large enough difference (and also depending on the sample spread, of course), you may decide that it is unlikely that the two samples are from the same population and you could conclude that the samples are significantly different (or that the difference in means is significantly different from zero).

For two-sample t-tests, the null and alternative hypotheses are as follows:

H0: Difference between sample means is equal to zero (i.e., sample means do not differ significantly)

H1: Difference between sample means is not equal to zero (i.e., sample means are significantly different)

Mathematically, how is the test statistic calculated (i.e., what is calculated to help you determine whether the difference in means is sufficiently different from zero to make you think the samples are from different populations)? Generally, the test statistic (t) is found as follows (with some variation based on equal sample variances, paired versus unpaired data, etc.):



$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}}$$


Where $\bar{x}_1$ and $\bar{x}_2$ are the sample means, SD1 and SD2 are the corresponding sample standard deviations, and n1 and n2 are the corresponding sample sizes. Calculating the test statistic (t) allows you to find the p-value, or probability, of finding a difference that large or a more extreme difference given that the null hypothesis is true.
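As a rough sketch of the calculation described above, the test statistic can be computed by hand in R and compared with what t.test() reports. The vectors 'a' and 'b' hold made-up values for illustration only (they are not data from this handout), and the calculation uses the unequal-variance form: the difference in sample means divided by the square root of SD1^2/n1 + SD2^2/n2.

```r
# Hypothetical samples (made-up values, for illustration only)
a <- c(5.1, 4.8, 6.0, 5.5, 5.9, 4.7)
b <- c(6.8, 7.1, 6.2, 7.5, 6.9, 7.3, 6.6)

# Standard error of the difference in sample means
se_diff <- sqrt(sd(a)^2 / length(a) + sd(b)^2 / length(b))

# Test statistic: difference in sample means divided by its standard error
t_manual <- (mean(a) - mean(b)) / se_diff

# With its default settings (var.equal = FALSE), t.test() reports the same statistic
t_builtin <- unname(t.test(a, b)$statistic)

t_manual
t_builtin
```

The two printed values should agree, which is a quick way to convince yourself of what t.test() is doing behind the scenes.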

When to Use Two-Sample t-Tests

We use two-sample t-tests to determine whether two samples are likely or unlikely to have been selected from the same population (i.e., to decide whether two samples differ significantly based on the selected confidence level).

Two-sample t-tests are, at times, overused in place of other more appropriate statistical tests because they are very familiar to many researchers. However, be careful: there are specific types of data for which t-tests are appropriate, and many for which they aren't.

Two-sample t-tests are appropriate when:

• You are only comparing TWO samples (if > 2, consider ANOVA or another multi-sample test to avoid an increased likelihood of a Type I error)

• Samples are from normally distributed populations (see document re: tests for normality)

• Samples are randomly selected and independent

• Sample data are at least interval or ratio level data (differences between values are meaningful)
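One informal way to look at the normality assumption in R is sketched below, using a made-up sample 'x' (hypothetical values, not data from this handout). shapiro.test(), qqnorm() and qqline() are base R functions; the separate normality document may recommend a different approach, so treat this only as an illustration.

```r
# Hypothetical sample (made-up values, for illustration only)
x <- c(12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 13.4, 12.0, 11.6, 12.8)

# Shapiro-Wilk test: the null hypothesis is that the data come from a normal distribution
sw <- shapiro.test(x)
sw$p.value   # a small p-value would be evidence against normality

# Visual check: points falling close to the line suggest approximate normality
qqnorm(x)
qqline(x)
```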

To perform two-sample t-tests, your samples do not need to:

• Have equal variances (Welch approximation)

• Have equal sample size

• Have a minimum sample size, so long as the assumptions hold (this is one of the greatest things about a


Paired versus Unpaired Data

When performing two-sample t-tests, it is important to know whether your data is paired or unpaired.

In paired data, each data point in one sample is associated with (or related to) a single data point in the second sample. The data sets cannot be analyzed as completely independent data sets, because of the point-to-point associations between the sample data.

Examples of experiments with paired data are:

• Sample 1 taken at Time 0 days for 20 people (before treatment) to measure cholesterol, and Sample 2 taken at Time 30 days for the same 20 people (following treatment)

• An experiment with 30 sets of twins (one male, one female) to investigate sex-dependent brain development

Alternatively, unpaired data are data from samples where there is no relation or association between a data point in one sample set and any data point in another sample set. Examples of experiments with unpaired data are:

• A comparison of blood sugar levels in diabetic patients versus non-diabetic patients

• An experiment in which 15 petri-plates of bacteria are treated with chlorine, and 15 are treated with

• Measuring zinc oxide concentrations in mussels from a shoreline in California and from a shoreline in Oregon
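How the data are stored in R often reflects whether they are paired. The sketch below uses made-up values and hypothetical object names ('cholesterol', 'diabetic', 'non_diabetic'): paired measurements sit side by side in the rows of a data frame, while unpaired samples can be separate vectors, even of different lengths.

```r
# Paired (hypothetical values): each row is one person measured twice
cholesterol <- data.frame(
  person = 1:5,
  day0   = c(212, 198, 240, 225, 205),
  day30  = c(201, 195, 228, 219, 207)
)

# Because the rows are linked, the per-person differences are meaningful
cholesterol$day30 - cholesterol$day0

# Unpaired (hypothetical values): no row-by-row link, and lengths may differ
diabetic     <- c(148, 162, 155, 171, 159, 166)
non_diabetic <- c(92, 101, 88, 97)
length(diabetic) == length(non_diabetic)
```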

Performing Two-Sample t-Tests in RStudio

Lucky for us, RStudio is happy to do almost all of the work for us when we want to perform a two-sample t-test. The most important thing is to understand what type of data you're working with, and to organize it in a way that is easy to deal with in your RStudio workspace.

To perform t-tests in RStudio, you will be using the t.test() function. Let's explore the various arguments in the function.

> t.test(Sample1, Sample2, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)

• The first two arguments (Sample1, Sample2) are the samples you are comparing, which should be stored (usually in columns) in a loaded data frame in your workspace

• The third argument is where you tell R whether you are performing a two-sided or one-sided t-test. What does that mean? If you are just asking the question "Do the samples differ significantly?" you are performing a two-sided test because there is no implied directionality. However, in some cases you may ask something like "Is sample A greater than sample B?" In that case, there IS directionality implied. See examples below.

• The fourth argument ("mu = 0") is usually ignored if performing a two-sample test: the difference in means is assumed to be 0 under the null hypothesis, so leave it alone for a two-sample test. The only time you would likely change this default setting is if you are comparing one sample to a claim (a one-sample t-test), in which case you would input the claim value in that argument instead

• Is your data paired? The default setting for the t.test() function is that data are UNPAIRED, and this can make a huge difference in the outcome of your analysis! Make sure that if your data is PAIRED, you change this argument to 'paired = TRUE'

• Do you know that sample variances are equal (did you perform an F-test to check)? If not, you should leave this argument as the default setting (var.equal = FALSE). This applies a Welch approximation to adjust for unequal variances

• The default confidence level is 0.95, which means your significance level (α) is 0.05. While this is the most common confidence level, you can change it
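The calls below sketch how these arguments change the test. 'A' and 'B' are hypothetical vectors of made-up values (not data from this handout); var.test() is the base R F-test for comparing two variances.

```r
# Hypothetical samples (made-up values, for illustration only)
A <- c(8.2, 7.9, 9.1, 8.8, 8.5, 9.4, 8.0, 8.7)
B <- c(7.1, 7.6, 6.9, 7.8, 7.3, 7.0, 7.7, 7.4)

# Direction of the question is set with 'alternative'
two_sided <- t.test(A, B)                           # "Do A and B differ?"
greater   <- t.test(A, B, alternative = "greater")  # "Is the mean of A greater than the mean of B?"
less      <- t.test(A, B, alternative = "less")     # "Is the mean of A less than the mean of B?"

# F-test comparing the two sample variances
f_check <- var.test(A, B)
f_check$p.value

# Set var.equal = TRUE only if you have reason to believe the variances are equal
pooled <- t.test(A, B, var.equal = TRUE)

# A 99% confidence level instead of the default 95%
ci99 <- t.test(A, B, conf.level = 0.99)
ci99$conf.int
```

Leaving an argument out keeps its default, so t.test(A, B) is the two-sided, unpaired, unequal-variance test at the 0.95 confidence level.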

Let's look at some examples. For each of these, we will assume that the assumptions (ha) are satisfied, but you should always convince yourself that the assumptions are satisfied before jumping into a t-test.

Example 1. Two-Sample t-Test for Unpaired Data (Two-Tailed)

For a first example, let's consider water samples taken from two different lakes: one that is considered pristine (called 'Control') and one that is heavily impacted by agricultural runoff (called 'Runoff'). Create two vectors in RStudio containing the sample values for nitrate concentrations (mg/L) measured in the two lakes as follows:

Note that you could also have your data stored in a data frame, and instead reference the appropriate columns. Now we have data for our two samples stored in RStudio, and we ask the following question:

Do nitrate concentrations in Control and Runoff lakes differ significantly?

Notice that there is NO directionality implied in the question: we are not asking whether the Runoff concentrations are significantly greater than the Control concentrations, only whether they are different. When the question is asked as above, a two-tailed t-test is appropriate. Also, there is no reason to think that the data are paired: one sample from the Control lake should not be associated with, or influence, one sample from the Runoff lake. Thus, we'll retain the default 'paired = FALSE' argument. If you do not include an optional argument in the function, the default is automatically retained. A two-sided two-sample t-test, not assuming equal variances, for unpaired data can be input as follows:
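As a sketch only, assuming two numeric vectors named 'Control' and 'Runoff' (the values below are made up for illustration, so the output will not match the report discussed in this example), the call might look like this:

```r
# Made-up nitrate concentrations (mg/L); NOT the handout's original data
Control <- c(1.2, 0.9, 1.5, 1.1, 1.4, 1.0, 1.3)
Runoff  <- c(2.1, 1.8, 2.6, 1.6, 2.4, 2.0, 2.9, 2.2)

# Two-sided, unpaired, equal variances not assumed: all of these are defaults
lake_test <- t.test(Control, Runoff)
lake_test

# Equivalent call with the defaults written out explicitly
t.test(Control, Runoff, alternative = "two.sided", paired = FALSE, var.equal = FALSE)
```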

Which yields the following report:

That's a lot of information. What does it all mean?

• The first line ('Welch Two Sample t-test') tells you what you did: a two-sample t-test with a correction for unequal variances

• The second line ('data') tells you which two samples you are comparing

• The third line provides quantitative values for the test statistic (t), the degrees of freedom (df), and the resulting p-value. Most importantly, you'll need to be able to make sense of the p-value. Here, the p-value of 0.0026 means that if the null hypothesis is true (the difference in means = 0), there is a 0.26% chance of collecting, by random chance, two samples with a difference in means at least as large as the one between your samples. That's a pretty small likelihood! Since it's less than 5%, you would reject the null hypothesis and conclude that the difference in means is not equal to 0 (i.e., you would conclude that the samples are significantly different)

• The alternative hypothesis reminds you of what your hypotheses are

• The 95 percent confidence interval is for the difference in means (notice that the difference is calculated for 'Control - Runoff', hence the negative values)

• The 'sample estimates' show the actual sample means
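Each piece of the report can also be pulled out of the object that t.test() returns, which is handy when writing up results. The sketch below uses made-up vectors 'g1' and 'g2' (hypothetical names and values, not data from this handout); the component names (statistic, parameter, p.value, conf.int, estimate) are those of the object returned by base R's t.test().

```r
# Made-up samples, for illustration only
g1 <- c(3.1, 2.8, 3.6, 3.3, 2.9, 3.4)
g2 <- c(4.0, 4.4, 3.8, 4.6, 4.1, 4.3)

res <- t.test(g1, g2)

res$statistic   # test statistic (t)
res$parameter   # degrees of freedom (df)
res$p.value     # p-value
res$conf.int    # confidence interval for the difference in means (g1 - g2)
res$estimate    # the two sample means

# Decision at a significance level of 0.05
res$p.value < 0.05
```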

Example 2. Two-Sample t-Tests for Paired Data (One-Tailed)

In Example 1, we used a two-sided t-test to compare unpaired sample data. The approach (in R) is similar for paired data. Here, we will perform a one-sided t-test for paired data.

A drug company is performing a clinical trial of a blood pressure medicine. Systolic blood pressures of 15 patients are measured prior to taking the drug, and again after 6 months on the drug. The data (loaded into R) appear as follows:

Patient  SBPBefore  SBPAfter
1        141        121
2        163        158
3        122        124
4        138        130
5        112        110
6        119        121
7        124        120
8        137        131
9        142        138
10       133        134
11       128        120
12       115        111
13       139        135
14       140        133
15       126        124

Here, the 'Patient' column is the patient number, 'SBPBefore' is the systolic blood pressure (mm Hg) before drug administration, and 'SBPAfter' is the systolic blood pressure (mm Hg) after 6 months of drug treatment. The entire data frame is saved as 'SBP' in the RStudio workspace.

A researcher hypothesizes that the blood pressure drug REDUCES blood pressure after 6 months of treatment (i.e., the SBPAfter is lower than the SBPBefore). Here, convince yourself that a two-sample t-test for PAIRED data is appropriate.
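One way this test might be set up is sketched below, assuming the SBP data frame is rebuilt by hand from the values listed above (how the data were actually loaded is an assumption). With SBPAfter given first and SBPBefore second, alternative = "less" asks whether the mean of the per-patient differences (After minus Before) is below zero.

```r
# Rebuild the SBP data frame from the values listed above
SBP <- data.frame(
  Patient   = 1:15,
  SBPBefore = c(141, 163, 122, 138, 112, 119, 124, 137, 142, 133, 128, 115, 139, 140, 126),
  SBPAfter  = c(121, 158, 124, 130, 110, 121, 120, 131, 138, 134, 120, 111, 135, 133, 124)
)

# Paired, one-sided test: is SBPAfter lower than SBPBefore?
sbp_test <- t.test(SBP$SBPAfter, SBP$SBPBefore,
                   alternative = "less", paired = TRUE)
sbp_test

# The paired test works on the per-patient differences (After minus Before)
mean(SBP$SBPAfter - SBP$SBPBefore)
```

Swapping the order of the two samples would require alternative = "greater" to ask the same question, so it is worth checking which sample is listed first before interpreting the result.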


