Statistics 502 Lecture Notes - Duke University

Peter D. Hoff

© December 9, 2009

Contents

1 Principles of experimental design . . . . . . . . . . . . . . . . 1

1.1 Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Model of a process or system . . . . . . . . . . . . . . . . . . . 2

1.3 Experiments and observational studies . . . . . . . . . . . . . 2

1.4 Steps in designing an experiment . . . . . . . . . . . . . . . . 6

2 Test statistics and randomization distributions . . . . . . . . . 9

2.1 Summaries of sample populations . . . . . . . . . . . . . . . . 10

2.2 Hypothesis testing via randomization . . . . . . . . . . . . . . 13

2.3 Essential nature of a hypothesis test . . . . . . . . . . . . . . 17

2.4 Sensitivity to the alternative hypothesis . . . . . . . . . . . . . 18

2.5 Basic decision theory . . . . . . . . . . . . . . . . . . . . . . . 23

3 Tests based on population models . . . . . . . . . . . . . . . . 25

3.1 Relating samples to populations . . . . . . . . . . . . . . . . . 25

3.2 The normal distribution . . . . . . . . . . . . . . . . . . . . . 29

3.3 Introduction to the t-test . . . . . . . . . . . . . . . . . . . . . 30

3.4 Two sample tests . . . . . . . . . . . . . . . . . . . . . . . . . 36

3.5 Checking assumptions . . . . . . . . . . . . . . . . . . . . . . 42

3.5.1 Checking normality . . . . . . . . . . . . . . . . . . . . 43

3.5.2 Unequal variances . . . . . . . . . . . . . . . . . . . . . 43

4 Confidence intervals and power . . . . . . . . . . . . . . . . . 47

4.1 Confidence intervals via hypothesis tests . . . . . . . . . . . . 47

4.2 Power and Sample Size Determination . . . . . . . . . . . . . 49

4.2.1 The non-central t-distribution . . . . . . . . . . . . . . 52

4.2.2 Computing the Power of a test . . . . . . . . . . . . . 54

5 Introduction to ANOVA . . . . . . . . . . . . . . . . . . . . . . 60

5.1 A model for treatment variation . . . . . . . . . . . . . . . . . 62

5.1.1 Model Fitting . . . . . . . . . . . . . . . . . . . . . . . 63

5.1.2 Testing hypothesis with MSE and MST . . . . . . . . . 66

5.2 Partitioning sums of squares . . . . . . . . . . . . . . . . . . . 70

5.2.1 The ANOVA table . . . . . . . . . . . . . . . . . . . . 72

5.2.2 Understanding Degrees of Freedom: . . . . . . . . . . . 73

5.2.3 More sums of squares geometry . . . . . . . . . . . . . 76

5.3 Unbalanced Designs . . . . . . . . . . . . . . . . . . . . . . . . 78

5.3.1 Sums of squares and degrees of freedom . . . . . . . . . 79

5.3.2 ANOVA table for unbalanced data: . . . . . . . . . . . 81

5.4 Normal sampling theory for ANOVA . . . . . . . . . . . . . . 83

5.4.1 Sampling distribution of the F -statistic . . . . . . . . . 85

5.4.2 Comparing group means . . . . . . . . . . . . . . . . . 88

5.4.3 Power calculations for the F-test . . . . . . . . . . . . 90

5.5 Model diagnostics . . . . . . . . . . . . . . . . . . . . . . . . . 92

5.5.1 Detecting violations with residuals . . . . . . . . . . . 93

5.5.2 Checking normality assumptions: . . . . . . . . . . . . 94

5.5.3 Checking variance assumptions . . . . . . . . . . . . . 96

5.5.4 Variance stabilizing transformations . . . . . . . . . . . 100

5.6 Treatment Comparisons . . . . . . . . . . . . . . . . . . . . . 106

5.6.1 Contrasts . . . . . . . . . . . . . . . . . . . . . . . . . 107

5.6.2 Orthogonal Contrasts . . . . . . . . . . . . . . . . . . . 110

5.6.3 Multiple Comparisons . . . . . . . . . . . . . . . . . . 112

5.6.4 False Discovery Rate procedures . . . . . . . . . . . . . 115

5.6.5 Nonparametric tests . . . . . . . . . . . . . . . . . . . 115

6 Factorial Designs . . . . . . . . . . . . . . . . . . . . . . . . 116

6.1 Data analysis: . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

6.2 Additive effects model . . . . . . . . . . . . . . . . . . . . . . 123

6.3 Evaluating additivity: . . . . . . . . . . . . . . . . . . . . . . . 126

6.4 Inference for additive treatment effects . . . . . . . . . . . . . 130

6.5 Randomized complete block designs . . . . . . . . . . . . . . . 140

6.6 Unbalanced designs . . . . . . . . . . . . . . . . . . . . . . . . 146

6.7 Non-orthogonal sums of squares: . . . . . . . . . . . . . . . . . 153

6.8 Analysis of covariance . . . . . . . . . . . . . . . . . . . . . . 155

6.9 Types of sums of squares . . . . . . . . . . . . . . . . . . . . . 159

7 Nested Designs . . . . . . . . . . . . . . . . . . . . . . . . . . 163

7.1 Mixed-effects approach . . . . . . . . . . . . . . . . . . . . . . 171

7.2 Repeated measures analysis . . . . . . . . . . . . . . . . . . . 174

List of Figures

1.1 Model of a variable process . . . . . . . . . . . . . . . . . . . . 2

2.1 Wheat yield distributions . . . . . . . . . . . . . . . . . . . . . 12

2.2 Approximate randomization distribution for the wheat example . . . 16

2.3 Histograms and empirical CDFs of the first two hypothetical samples. . . . 20

2.4 Randomization distributions for the t and KS statistics for the first example. . . . 21

2.5 Histograms and empirical CDFs of the second two hypothetical samples. . . . 22

2.6 Randomization distributions for the t and KS statistics for the second example. . . . 22

3.1 The population model . . . . . . . . . . . . . . . . . . . . . . 27

3.2 χ² distributions . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.3 t-distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.4 The t-distribution under H0 for the wheat example . . . . . . 39

3.5 Randomization and t-distributions for the t-statistic under H0 . . . 40

3.6 Normal scores plots. . . . . . . . . . . . . . . . . . . . . . . . 44

4.1 A t10 distribution and two non-central t10-distributions. . . . . 52

4.2 Critical regions and the non-central t-distribution . . . . . . . 55

4.3 and power versus sample size, and the normal approximation to the power. . . . 57

4.4 Null and alternative distributions for another wheat example, and power versus sample size. . . . 59

5.1 Response time data . . . . . . . . . . . . . . . . . . . . . . . . 61

5.2 Randomization distribution of the F -statistic . . . . . . . . . . 70
