AMS572.01 Midterm Exam Fall, 2005
Instructions: This is a closed-book exam. Anyone who cheats on the exam will receive a grade of F. Please provide complete solutions for full credit. Good luck!
1. In a study of hypnotic suggestion, 10 male volunteers were randomly allocated to an experimental group and a control group. Each subject participated in a two-phase experimental session. In the first phase, respiration was measured while the subject was awake and at rest. In the second phase, the subject was told to imagine that he was performing muscular work, and respiration was measured again. For subjects in the experimental group, hypnosis was induced between the first and second phases; thus, the suggestion to imagine muscular work was “hypnotic suggestion” for experimental subjects and “waking suggestion” for control subjects. The accompanying table shows the measurements of total ventilation (liters of air per minute per square meter of body area) for all 10 subjects.
|Experimental Group |Control Group |
|Subject |Rest |Work |Subject |Rest |Work |
|1 |6 |6 |6 |6 |5 |
|2 |7 |9 |7 |5 |5 |
|3 |5 |8 |8 |5 |5 |
|4 |7 |12 |9 |6 |6 |
|5 |6 |7 |10 |5 |4 |
(a) Use suitable tests to investigate the differences between the responses of the experimental and control groups. (Use α = 0.05. Please state the assumption(s) of each test.)
(b) Please write the entire SAS program necessary to answer the questions raised in (a). Please include the data step as well as tests of the various assumptions.
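/*Problem #1*/
/* group: 1 = experimental (hypnotic suggestion), 2 = control (waking suggestion) */
data one;
   input ID group rest work;
   diff = work - rest;   /* change in ventilation: work minus rest */
   datalines;
1 1 6 6
2 1 7 9
3 1 5 8
4 1 7 12
5 1 6 7
6 2 6 5
7 2 5 5
8 2 5 5
9 2 6 6
10 2 5 4
;
run;

/* Normality tests for diff, separately within each group */
proc univariate data=one normal;
   class group;
   var diff;
   title 'Check for normality and test for one population mean, Q1';
run;

/* Pooled and Satterthwaite t-tests, plus the folded F test for equal variances */
proc ttest data=one;
   class group;
   var diff;
   title 'Independent samples t-test, Q1';
run;

/* Wilcoxon rank-sum test; the EXACT statement requests exact p-values */
proc npar1way wilcoxon data=one;
   class group;
   var diff;
   exact;
   title 'Nonparametric test for two-mean comparisons, Q1';
run;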
Check for normality and test for one population mean, Q1
15:49 Friday, October 28, 2005
The UNIVARIATE Procedure
Variable: diff
group = 1
Moments
N 5 Sum Weights 5
Mean 2.2 Sum Observations 11
Std Deviation 1.92353841 Variance 3.7
Skewness 0.59012866 Kurtosis -0.0219138
Uncorrected SS 39 Corrected SS 14.8
Coeff Variation 87.4335639 Std Error Mean 0.86023253
Basic Statistical Measures
Location Variability
Mean 2.200000 Std Deviation 1.92354
Median 2.000000 Variance 3.70000
Mode . Range 5.00000
Interquartile Range 2.00000
Tests for Location: Mu0=0
Test -Statistic- -----p Value------
Student's t t 2.557448 Pr > |t| 0.0628
Sign M 2 Pr >= |M| 0.1250
Signed Rank S 5 Pr >= |S| 0.1250
Tests for Normality
Test --Statistic--- -----p Value------
Shapiro-Wilk W 0.978716 Pr < W 0.9276
Kolmogorov-Smirnov D 0.141405 Pr > D >0.1500
Cramer-von Mises W-Sq 0.022452 Pr > W-Sq >0.2500
Anderson-Darling A-Sq 0.166107 Pr > A-Sq >0.2500
Quantiles (Definition 5)
Quantile Estimate
100% Max 5
99% 5
95% 5
90% 5
75% Q3 3
50% Median 2
25% Q1 1
10% 0
5% 0
1% 0
0% Min 0
Extreme Observations
----Lowest---- ----Highest---
Value Obs Value Obs
0 1 0 1
1 5 1 5
2 2 2 2
3 3 3 3
5 4 5 4
Check for normality and test for one population mean, Q1
15:49 Friday, October 28, 2005
The UNIVARIATE Procedure
Variable: diff
group = 2
Moments
N 5 Sum Weights 5
Mean -0.4 Sum Observations -2
Std Deviation 0.54772256 Variance 0.3
Skewness -0.6085806 Kurtosis -3.3333333
Uncorrected SS 2 Corrected SS 1.2
Coeff Variation -136.93064 Std Error Mean 0.24494897
Basic Statistical Measures
Location Variability
Mean -0.40000 Std Deviation 0.54772
Median 0.00000 Variance 0.30000
Mode 0.00000 Range 1.00000
Interquartile Range 1.00000
Tests for Location: Mu0=0
Test -Statistic- -----p Value------
Student's t t -1.63299 Pr > |t| 0.1778
Sign M -1 Pr >= |M| 0.5000
Signed Rank S -1.5 Pr >= |S| 0.5000
Tests for Normality
Test --Statistic--- -----p Value------
Shapiro-Wilk W 0.684029 Pr < W 0.0065
Kolmogorov-Smirnov D 0.367396 Pr > D 0.0245
Cramer-von Mises W-Sq 0.138317 Pr > W-Sq 0.0229
Anderson-Darling A-Sq 0.799546 Pr > A-Sq 0.0140
(Note: Normality is rejected for the control group, so in this case the problem is better solved using the nonparametric method. In the exam, however, you cannot perform the normality test, so it is fine to do just the parametric test. Wei)
Quantiles (Definition 5)
Quantile Estimate
100% Max 0
99% 0
95% 0
90% 0
75% Q3 0
50% Median 0
25% Q1 -1
10% -1
5% -1
1% -1
0% Min -1
Extreme Observations
----Lowest---- ----Highest---
Value Obs Value Obs
-1 10 -1 6
-1 6 -1 10
0 9 0 7
0 8 0 8
0 7 0 9
Independent samples t-test, Q1
15:49 Friday, October 28, 2005
The TTEST Procedure
Statistics
Variable  group        N   Lower CL Mean   Mean   Upper CL Mean   Lower CL Std Dev   Std Dev   Upper CL Std Dev   Std Err
diff      1            5   -0.188          2.2    4.5884          1.1525             1.9235    5.5274             0.8602
diff      2            5   -1.08           -0.4   0.2801          0.3282             0.5477    1.5739             0.2449
diff      Diff (1-2)       0.5374          2.6    4.6626          0.9552             1.4142    2.7093             0.8944
T-Tests
Variable Method Variances DF t Value Pr > |t|
diff Pooled Equal 8 2.91 0.0197
diff Satterthwaite Unequal 4.64 2.91 0.0366
(Note: In the exam, for the separate-variance t-test, you can use df = min(n1 - 1, n2 - 1). For the given problem, df = min(5 - 1, 5 - 1) = 4. Wei)
Equality of Variances
Variable Method Num DF Den DF F Value Pr > F
diff Folded F 4 4 12.33 0.0321
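The t value and the Satterthwaite degrees of freedom in the T-Tests table can be reproduced by hand from the group summaries for diff (means 2.2 and -0.4, variances 3.7 and 0.3, five subjects per group). The lines below are a worked check; the symbols $s_p^2$ and $\nu$ are notation introduced here for the pooled variance and the approximate degrees of freedom.

```latex
s_p^2 = \frac{4(3.7) + 4(0.3)}{8} = 2.0, \qquad
t_{\text{pooled}} = \frac{2.2 - (-0.4)}{\sqrt{2.0\,(1/5 + 1/5)}} = \frac{2.6}{0.8944} \approx 2.91

t_{\text{unequal}} = \frac{2.6}{\sqrt{3.7/5 + 0.3/5}} = \frac{2.6}{\sqrt{0.80}} \approx 2.91, \qquad
\nu = \frac{(0.74 + 0.06)^2}{0.74^2/4 + 0.06^2/4} = \frac{0.64}{0.1378} \approx 4.64
```

Because the two groups have equal sizes, the pooled and unequal-variance statistics coincide and only the degrees of freedom differ. With the conservative exam rule df = 4, the two-sided critical value is $t_{4,0.025} = 2.776$, so 2.91 still leads to rejection at α = 0.05.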
Nonparametric test for two-mean comparisons, Q1
15:49 Friday, October 28, 2005
The NPAR1WAY Procedure
Wilcoxon Scores (Rank Sums) for Variable diff
Classified by Variable group
               Sum of    Expected    Std Dev     Mean
group    N     Scores    Under H0    Under H0    Score
-------------------------------------------------------
1        5     38.50     27.50       4.624812    7.70
2        5     16.50     27.50       4.624812    3.30
Average scores were used for ties.
Wilcoxon Two-Sample Test
Statistic (S) 38.5000
Normal Approximation
Z 2.2704
One-Sided Pr > Z 0.0116
Two-Sided Pr > |Z| 0.0232
t Approximation
One-Sided Pr > Z 0.0247
Two-Sided Pr > |Z| 0.0493
Exact Test
One-Sided Pr >= S 0.0159
Two-Sided Pr >= |S - Mean| 0.0317
Z includes a continuity correction of 0.5.
Kruskal-Wallis Test
Chi-Square 5.6571
DF 1
Pr > Chi-Square 0.0174
2. Thanksgiving was coming up and Harvey's Turkey Farm was doing a land-office business. Harvey sold 100 gobblers to Nedicks for their famous Turkey-dogs. Nedicks found that 90 of Harvey's turkeys were in reality peacocks.
(a) Estimate the proportion of peacocks at Harvey's Turkey Farm and find a 95% confidence interval for the true proportion of turkeys that Harvey owns.
(b) How large a random sample should we select from Harvey's Farm to guarantee the length of the 95% confidence interval to be no more than 0.06? (Note: please first derive the general formula for sample size calculation based on the length of the CI for inference on one population proportion, large sample situation. Please give the formula for the two cases: (i) we have an estimate of the proportion and (ii) we do not have an estimate of the proportion to be estimated. (iii) Finally, please plug in the numerical values and obtain the sample size for this particular problem.)
Solution:
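(a) This is a large-sample CI for one population proportion. The estimated proportion of peacocks is 90/100 = 0.90, so the estimated proportion of turkeys is $\hat{\pi} = 10/100 = 0.10$.
We have α = 0.05, $z_{\alpha/2} = z_{0.025} = 1.96$. The 95% CI for the true proportion of turkeys is
$\hat{\pi} \pm z_{0.025}\sqrt{\hat{\pi}(1-\hat{\pi})/n}$ = $0.10 \pm 1.96\sqrt{(0.10)(0.90)/100}$ = $0.10 \pm 0.0588$, or (0.04, 0.16)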
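The interval in part (a) can be checked with a short DATA step. This is only a sketch of the usual large-sample (Wald) interval; the data set name and variable names are illustrative and do not come from the exam.

```sas
/* Sketch: large-sample 95% CI for the proportion of turkeys.        */
/* Data set and variable names are illustrative assumptions.         */
data ci_prop;
   n     = 100;                  /* birds sold to Nedicks            */
   x     = 10;                   /* turkeys = 100 - 90 peacocks      */
   alpha = 0.05;
   phat  = x / n;                /* estimated proportion of turkeys  */
   z     = probit(1 - alpha/2);  /* upper alpha/2 normal quantile    */
   se    = sqrt(phat*(1 - phat)/n);
   lower = phat - z*se;
   upper = phat + z*se;
   put phat= z= lower= upper=;   /* results are written to the log   */
run;
```

After rounding to two decimals, the limits written to the log should agree with the interval (0.04, 0.16) given in the solution.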
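(b) This is a sample size calculation for the estimation of one population proportion.
The general formula is derived as follows (also refer to your lecture notes for the derivations of the pivotal quantity, the CI, etc.):
(i) The pivotal quantity for the inference on π is
$Z = \frac{\hat{\pi} - \pi}{\sqrt{\hat{\pi}(1-\hat{\pi})/n}}$, which is approximately $N(0,1)$ for large n. The 100(1-α)% symmetrical CI for π is derived from
$P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$, which yields $\hat{\pi} \pm z_{\alpha/2}\sqrt{\hat{\pi}(1-\hat{\pi})/n}$. Therefore the length of the CI is: $L = 2 z_{\alpha/2}\sqrt{\hat{\pi}(1-\hat{\pi})/n}$. Solving for the sample size, we have $n = \frac{4 z_{\alpha/2}^2 \hat{\pi}(1-\hat{\pi})}{L^2} = \frac{z_{\alpha/2}^2 \hat{\pi}(1-\hat{\pi})}{E^2}$, where E = L/2 is referred to as the maximum error.
(Note: E is also directly defined as $E = z_{\alpha/2}\sqrt{\hat{\pi}(1-\hat{\pi})/n}$)
(ii) When we have no estimate of the proportion, since simple calculus shows that $\pi(1-\pi) \le 1/4$ (with the maximum at π = 1/2), a conservative estimate is $n = \frac{z_{\alpha/2}^2}{4E^2}$
In the given problem, we have E = 0.03 and α = 0.05.
(i). $\hat{\pi} = 0.10$. $n = \frac{(1.96)^2 (0.10)(0.90)}{(0.03)^2} = 384.16$, so we take n = 385; (ii). $n = \frac{(1.96)^2}{4(0.03)^2} = 1067.11$, so we take n = 1068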
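The two sample sizes for cases (i) and (ii) can be computed in a DATA step. The sketch below assumes the standard large-sample formulas, n = z² p(1 - p)/E² with an estimate p and n = z²/(4E²) without one; the variable names are illustrative, not part of the exam.

```sas
/* Sketch: sample size so that the 95% CI has length at most L.      */
/* Variable names are illustrative assumptions.                      */
data n_prop;
   alpha  = 0.05;
   L      = 0.06;                /* required maximum CI length       */
   E      = L / 2;               /* maximum error                    */
   phat   = 0.10;                /* estimate of the proportion, (a)  */
   z      = probit(1 - alpha/2);
   n_est  = ceil(z**2 * phat*(1 - phat) / E**2);  /* case (i)        */
   n_cons = ceil(z**2 * 0.25 / E**2);             /* case (ii)       */
   put n_est= n_cons=;
run;
```

The CEIL function rounds up, since a fractional sample size must be increased to the next whole number to keep the interval within the required length. The log should show 385 for case (i) and 1068 for case (ii).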
3. An expert witness in a paternity suit testifies that the length (in days) of pregnancy (that is, the time from impregnation to the delivery of the child) is approximately normally distributed with parameter [pic] and [pic]. The defendant in the suit is able to prove that he was out of the country during a period that began 290 days before the birth of the child and ended 240 days before the birth. If the defendant was, in fact, the father of the child, what is the probability that the mother could have had the very long or very short pregnancy indicated by the testimony?
Solution: let [pic]~[pic] and [pic]~[pic]
[pic](the woman had a very long or very short pregnancy)
[pic]
[pic]
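The required probability is P(X > 290) + P(X < 240) for a normally distributed pregnancy length X. The parameter values are not legible in this copy of the exam, so the sketch below uses placeholder values (mean 270 days, standard deviation 10 days) purely as an assumption; they should be replaced with the values stated in the original problem.

```sas
/* Sketch: P(X > 290 or X < 240) for X ~ N(mu, sigma**2).            */
/* mu and sigma are ASSUMED placeholders: the exam's values are      */
/* missing from this copy. Replace them with the given values.       */
data preg;
   mu      = 270;                /* assumed mean, in days            */
   sigma   = 10;                 /* assumed standard deviation, days */
   p_long  = 1 - probnorm((290 - mu)/sigma);  /* P(X > 290)          */
   p_short = probnorm((240 - mu)/sigma);      /* P(X < 240)          */
   p_total = p_long + p_short;
   put p_long= p_short= p_total=;
run;
```

PROBNORM returns the standard normal cumulative probability, so each term is a standardized tail area.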
4A (for AMS majors). Suppose we have two independent random samples from two normal populations: $X_1, \ldots, X_{n_1} \sim N(\mu_1, \sigma^2)$, and $Y_1, \ldots, Y_{n_2} \sim N(\mu_2, \sigma^2)$. Please derive the pooled-variance t-test using the pivotal quantity method. Please make sure that you include the following key steps.
a) Please derive the distribution of $\bar{X} - \bar{Y}$
b) Please derive the distribution of $W = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{\sigma^2}$
c) Please derive the distribution of the pooled-variance t statistic (the pivotal quantity).
d) Please derive the rejection region for a 2-sided test at the significance level of α.
e) Please illustrate using the pdf plot how to calculate the p-value for a 2-sided test.
Solution: (Please refer to your lecture notes for the entire derivation.) Here is a simple outline of the derivation.
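a) We start with the point estimator for the parameter of interest $\mu_1 - \mu_2$: $\bar{X} - \bar{Y}$. Its distribution is $\bar{X} - \bar{Y} \sim N\left(\mu_1 - \mu_2,\ \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right)$, using the mgf for $N(\mu, \sigma^2)$, which is $M(t) = \exp(\mu t + \sigma^2 t^2/2)$, and the independence properties of the random samples. From this we have $Z = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{\sigma\sqrt{1/n_1 + 1/n_2}} \sim N(0, 1)$. Unfortunately, Z cannot serve as the pivotal quantity because σ is unknown.
b) We next look for a way to get rid of the unknown σ, following a similar approach to the construction of the Student's t-statistic. We found that $W = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{\sigma^2} \sim \chi^2_{n_1+n_2-2}$, using the mgf for $\chi^2_k$, which is $M(t) = (1-2t)^{-k/2}$ for $t < 1/2$, and the independence properties of the random samples.
c) Then we found, from the theorem of sampling from the normal population and the independence properties of the random samples, that Z and W are independent. Therefore, by the definition of the t-distribution, we have obtained our pivotal quantity: $T = \frac{Z}{\sqrt{W/(n_1+n_2-2)}} = \frac{(\bar{X} - \bar{Y}) - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}} \sim t_{n_1+n_2-2}$, where $S_p^2 = \frac{(n_1-1)S_1^2 + (n_2-1)S_2^2}{n_1+n_2-2}$ is the pooled sample variance.
d) The rejection region is derived from $P(|T_0| \ge c \mid H_0) = \alpha$, where $T_0$ is T evaluated under $H_0$; thus $c = t_{n_1+n_2-2,\,\alpha/2}$, and we reject $H_0$ when $|T_0| \ge t_{n_1+n_2-2,\,\alpha/2}$.
e) The p-value is twice the tail area bounded by the observed test statistic $|t_0|$, that is, $p = 2P(T \ge |t_0|)$. I will not show the pdf plot here, although you should.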
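Steps d) and e) can be illustrated numerically. The sketch below takes an observed t value and its degrees of freedom, then computes the two-sided critical value and the two-sided p-value as twice the upper tail area. The values t0 = 2.91 and df = 8 are borrowed from the pooled test in the Problem 1 output purely as an example, and the variable names are illustrative.

```sas
/* Sketch: two-sided rejection rule and p-value for a t statistic.   */
/* t0 and df are example inputs (pooled test in Problem 1 output).   */
data pval;
   t0     = 2.91;                        /* observed test statistic  */
   df     = 8;                           /* n1 + n2 - 2              */
   alpha  = 0.05;
   crit   = tinv(1 - alpha/2, df);       /* upper alpha/2 t quantile */
   p      = 2*(1 - probt(abs(t0), df));  /* twice the tail area      */
   reject = (abs(t0) >= crit);           /* 1 = reject H0, 0 = not   */
   put crit= p= reject=;
run;
```

The p-value written to the log can be compared with the Pr > |t| entry for the pooled method in the Problem 1 T-Tests table.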
4B (for non-AMS majors). A new method of making concrete blocks has been proposed. To test whether or not the new method increases the compressive strength, five sample blocks are made by each method. The compressive strengths in 10 pounds per square inch are listed here:
|New Method |15 14 13 15 16 |
|Old Method |13 15 13 12 14 |
(a). Please construct a 95% confidence interval for the difference between the mean compressive strengths of the two methods.
(b). At the significance level 0.05, can you conclude that the new method increases the compressive strength?
(c). What assumptions do you need for the inference in parts (a) and (b)?
(d). Please write the entire SAS program necessary to answer the questions raised in (a), (b) and (c). Please include the data step as well as tests of the various assumptions.
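Solution: Inference on two population means. Two small and independent samples.
New method: $\bar{x}_1 = 14.6$, $s_1^2 = 1.3$, $n_1 = 5$
Old method: $\bar{x}_2 = 13.4$, $s_2^2 = 1.3$, $n_2 = 5$
Under the normality assumption, we test if the two population variances are equal: $H_0: \sigma_1^2 = \sigma_2^2$ vs $H_a: \sigma_1^2 \ne \sigma_2^2$.
Test statistic is
$F_0 = s_1^2/s_2^2 = 1.3/1.3 = 1.0$, $F_{4,4,0.025} = 9.60$ and $F_{4,4,0.975} = 1/9.60 = 0.1042$.
Since F0 is between 0.1042 and 9.60, we cannot reject H0. Therefore it is reasonable to assume that $\sigma_1^2 = \sigma_2^2$.
a) 95% C.I. for the difference $\mu_1 - \mu_2$ is
$(\bar{x}_1 - \bar{x}_2) \pm t_{8,0.025}\, s_p\sqrt{1/n_1 + 1/n_2} = 1.2 \pm 2.306(1.140)\sqrt{2/5} = 1.2 \pm 1.66$
where $s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}} = \sqrt{1.3} = 1.140$
Therefore 95% C.I. is [-0.46, 2.86].
b) Using the t-test with hypotheses $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 > \mu_2$,
$t_0 = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}} = \frac{1.2}{0.7211} = 1.66$
$t_0 = 1.66 < t_{8,0.05} = 1.860$. Thus, we cannot reject H0. There isn't enough evidence to conclude that the new method increases the strength.
c) (1) Both populations are normally distributed
(2) $\sigma_1^2 = \sigma_2^2$ (equal population variances)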
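(d)
/*Problem #4B*/
/* method: 1 = new method, 2 = old method */
data four;
   input method strength;
   datalines;
1 15
1 14
1 13
1 15
1 16
2 13
2 15
2 13
2 12
2 14
;
run;

/* Normality tests for strength within each method */
proc univariate data=four normal;
   class method;
   var strength;
   title 'Check for normality';
run;

/* Pooled and Satterthwaite t-tests, CI for the mean difference, and the     */
/* folded F test for equal variances. The reported p-value is two-sided;     */
/* for the one-sided test in (b), halve it when the sample difference is in  */
/* the hypothesized direction.                                               */
proc ttest data=four;
   class method;
   var strength;
   title 'Independent samples t-test';
run;

/* Wilcoxon rank-sum test, for use if normality is doubtful */
proc npar1way data=four wilcoxon;
   class method;
   var strength;
   title 'Nonparametric test for two-mean comparisons';
run;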
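The hand calculations in parts (a) and (b) can also be cross-checked in a DATA step from the summary statistics of the two samples (means 14.6 and 13.4, both sample variances 1.3, computed from the data listed in the problem). This is a sketch only; the data set and variable names are illustrative and not part of the exam.

```sas
/* Sketch: F test, pooled 95% CI and one-sided t-test for Problem 4B */
/* from summary statistics. Names are illustrative assumptions.      */
data check4b;
   n1 = 5; xbar1 = 14.6; var1 = 1.3;     /* new method               */
   n2 = 5; xbar2 = 13.4; var2 = 1.3;     /* old method               */
   alpha = 0.05;
   df    = n1 + n2 - 2;
   f0    = var1 / var2;                  /* F statistic              */
   f_lo  = finv(alpha/2, n1-1, n2-1);    /* lower F critical value   */
   f_hi  = finv(1 - alpha/2, n1-1, n2-1);/* upper F critical value   */
   sp2   = ((n1-1)*var1 + (n2-1)*var2) / df;  /* pooled variance     */
   se    = sqrt(sp2*(1/n1 + 1/n2));
   tcrit = tinv(1 - alpha/2, df);        /* two-sided critical value */
   lower = (xbar1 - xbar2) - tcrit*se;
   upper = (xbar1 - xbar2) + tcrit*se;
   t0    = (xbar1 - xbar2) / se;
   p_one = 1 - probt(t0, df);            /* upper-tail p-value, (b)  */
   put f0= f_lo= f_hi= lower= upper= t0= p_one=;
run;
```

Up to rounding, the log should reproduce the F bounds 0.1042 and 9.60 and the interval [-0.46, 2.86] quoted in the solution.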