Statistics AP/GT
Math 3339
Homework 7 (Chapters 9, 11 & 12) Name:__________________________________ PeopleSoft ID:_______________
Instructions:
- Homework will NOT be accepted through email or in person. Homework must be submitted through CourseWare BEFORE the deadline.
- Print out this file and complete the problems, or complete it on your computer.
- Use blue or black ink or a dark pencil if completing this by hand.
- Write your solutions in the space provided. You must show all work for full credit.
- Submit this assignment under "Assignments" and choose HW7.
- Total possible points: 15.
1. *The following data look at how long it takes to get to work. Let x = commuting distance (miles) and y = commuting time (minutes).

x  15  16  17  18  19  20
y  42  35  45  42  49  46

a. Give a scatterplot of this data and comment on the direction, form and strength of this relationship.
b. Determine the least-squares estimate equation for this data set.
c. Give the $r^2$ and comment on what it means.
d. Give the residual plot based on the least-squares estimate equation.
e. Test whether this least-squares estimate equation specifies a useful relationship between commuting distance and commuting time.
a. This is a positive relationship, somewhat strong, somewhat linear.
b. Output from R Studio
Call:
lm(formula = y ~ x)

Residuals:
     1      2      3      4      5      6
 3.048 -5.638  2.676 -2.010  3.305 -1.381

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  13.6667    16.9572   0.806    0.465
x             1.6857     0.9644   1.748    0.155

Residual standard error: 4.034 on 4 degrees of freedom
Multiple R-squared: 0.433, Adjusted R-squared: 0.2913
F-statistic: 3.055 on 1 and 4 DF, p-value: 0.1554
Equation: $\hat{y} = 13.6667 + 1.6857x$
c. $R^2 = 0.433$; about 43.3% of the variation in the time can be explained by this equation.
d. Residual plot
e. $H_0: \beta_1 = 0$ and $H_a: \beta_1 \neq 0$, test statistic t = 1.748, p-value = 0.155. Fail to reject the null hypothesis. Using this data, there is not sufficient evidence of a relationship between commuting time and distance.
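The numbers in parts b, c and e can be checked by hand. The following is a minimal sketch, assuming only the six (x, y) pairs from the problem statement; the variable names (Sxx, Sxy, se_b1, and so on) are illustrative and not part of the original solution.

```r
# Problem 1 data, copied from the problem statement
x <- c(15, 16, 17, 18, 19, 20)   # commuting distance (miles)
y <- c(42, 35, 45, 42, 49, 46)   # commuting time (minutes)
n <- length(x)

# Sums of squares and cross-products about the means
Sxx <- sum((x - mean(x))^2)
Sxy <- sum((x - mean(x)) * (y - mean(y)))
Syy <- sum((y - mean(y))^2)

# Least-squares estimates and coefficient of determination
b1 <- Sxy / Sxx                  # slope
b0 <- mean(y) - b1 * mean(x)     # intercept
r2 <- Sxy^2 / (Sxx * Syy)        # R-squared

# Residual standard error and standard error of the slope
sse   <- Syy - b1 * Sxy
s     <- sqrt(sse / (n - 2))
se_b1 <- s / sqrt(Sxx)

# Two-sided t test of H0: beta1 = 0
t_stat  <- b1 / se_b1
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

print(round(c(b0 = b0, b1 = b1, r2 = r2, se_b1 = se_b1,
              t = t_stat, p = p_value), 4))

# The same quantities from lm(), for comparison with the R output
fit <- lm(y ~ x)
print(coef(summary(fit)))
```

The test statistic is the slope estimate divided by its standard error, so it should match the "t value" column of the lm() summary rather than the "Std. Error" column.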
Problems came from Devore, Jay and Berk, Kenneth, Modern Mathematical Statistics with Applications, Thomson Brooks/Cole, 2007.
2. *The following is another set of data looking at how long it takes to get to work. Let x = commuting distance (miles) and y = commuting time (minutes).

x   5  10  15  20  25   50
y  16  32  44  45  63  115

a. Give a scatterplot of this data and comment on the direction, form and strength of this relationship.
b. Determine the least-squares estimate equation for this data set.
c. Give the $r^2$ and comment on what it means.
d. Give the residual plot based on the least-squares estimate equation.
e. Test whether this least-squares estimate equation specifies a useful relationship between commuting distance and commuting time.
f. Compare this least-squares estimate equation to the previous least-squares estimate equation in problem 1. In which situation would the least-squares equation be least effective? Justify your answer.
a. This is a positive, strong, linear relationship.
b. Output from R studio
Call:
lm(formula = y ~ x)

Residuals:
       1        2        3        4        5        6
-2.58033  2.70820  3.99672 -5.71475  1.57377  0.01639

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   7.8689     2.8760   2.736   0.0521 .
x             2.1423     0.1132  18.930 4.59e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.034 on 4 degrees of freedom
Multiple R-squared: 0.989, Adjusted R-squared: 0.9862
F-statistic: 358.4 on 1 and 4 DF, p-value: 4.587e-05
Equation: $\hat{y} = 7.8689 + 2.1423x$
c. $R^2 = 0.989$; about 98.9% of the variation in time can be explained by this equation.
d. Residual plot
e. $H_0: \beta_1 = 0$ and $H_a: \beta_1 \neq 0$, test statistic t = 18.930, p-value = 0.0000459. Reject the null hypothesis. There is very strong evidence of a relationship between distance and time using this data.
f. The least-squares equation in problem 1 would be the least effective because its $R^2$ is smaller and the slope test gives no significant evidence of a relationship between distance and time. In problem 2, by contrast, distance is a significant predictor of time.
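The comparison in part f can be laid out side by side. This is a rough sketch, assuming the two data sets exactly as given in problems 1 and 2; the helper name summarise_fit is hypothetical and only serves to collect the figures that the two R outputs already report.

```r
# Data from problems 1 and 2, copied from the problem statements
d1 <- data.frame(x = c(15, 16, 17, 18, 19, 20),
                 y = c(42, 35, 45, 42, 49, 46))
d2 <- data.frame(x = c(5, 10, 15, 20, 25, 50),
                 y = c(16, 32, 44, 45, 63, 115))

# Hypothetical helper: fit y ~ x and pull out the quantities compared in part f
summarise_fit <- function(d) {
  s <- summary(lm(y ~ x, data = d))
  c(slope     = unname(coef(s)["x", "Estimate"]),
    r.squared = s$r.squared,
    sigma     = s$sigma,                          # residual standard error
    p.value   = unname(coef(s)["x", "Pr(>|t|)"]),
    x.range   = diff(range(d$x)))                 # spread of the distances
}

comparison <- rbind(problem1 = summarise_fit(d1),
                    problem2 = summarise_fit(d2))
print(round(comparison, 4))
```

Both outputs report a residual standard error of about 4.034, so the scatter about the line is similar in the two problems. One plausible reading is that the larger $R^2$ and smaller p-value in problem 2 come mainly from the much wider range of distances (5 to 50 miles instead of 15 to 20).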
3. The cost of a home depends on the number of bedrooms in the house. Suppose the following data is recorded for homes in a given town.

price (in thousands)  300  250  400  550  317  389  425  289  389  559
No. bedrooms            3    3    4    5    4    3    6    3    4    5

a) Make a scatterplot.
b) Fit the data with a least squares regression line.
c) Give a 95% confidence interval for the slope.
d) If one house has one more bedroom than another house, how much additional cost would we expect for the price?
e) Test the hypothesis that an extra bedroom costs $60,000 against the alternative that it costs more.
a. Scatterplot
b. R studio output
Call:
lm(formula = price ~ beds)

Residuals:
    Min      1Q  Median      3Q     Max
-108.00  -53.95   -5.75   59.77   99.10

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    94.40      97.98   0.963   0.3635
beds           73.10      23.76   3.076   0.0152 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 75.15 on 8 degrees of freedom
Multiple R-squared: 0.5419, Adjusted R-squared: 0.4846
F-statistic: 9.462 on 1 and 8 DF, p-value: 0.01521
Equation: $\widehat{\text{price}} = 94.40 + 73.10 \cdot \text{beds}$
c. Confidence interval:
> 73.1+c(-1,1)*qt(1.95/2,8)*23.76
[1]  18.30934 127.89066
d. This is the interpretation of the slope: the expected price increases by $73,100 for each additional bedroom.
e. $H_0: \beta_1 = 60$ and $H_a: \beta_1 > 60$
$t = \frac{73.1 - 60}{23.76} = 0.5513$
p-value = 1 - pt(0.5513,8) = 0.2982. Fail to reject the null hypothesis.
There is not sufficient evidence that the slope is greater than 60.
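Parts c and e can also be computed from the fitted model instead of from the rounded values in the printout. This is a minimal sketch, assuming the ten (beds, price) pairs from the problem statement; confint() is the standard stats function, while names such as ci_manual and t_stat are illustrative.

```r
# Problem 3 data, copied from the problem statement
price <- c(300, 250, 400, 550, 317, 389, 425, 289, 389, 559)  # thousands of dollars
beds  <- c(3, 3, 4, 5, 4, 3, 6, 3, 4, 5)

fit <- lm(price ~ beds)
b1  <- coef(summary(fit))["beds", "Estimate"]
se  <- coef(summary(fit))["beds", "Std. Error"]
df  <- df.residual(fit)

# Part c: 95% confidence interval for the slope, built in and by hand
ci_builtin <- confint(fit, "beds", level = 0.95)
ci_manual  <- b1 + c(-1, 1) * qt(0.975, df) * se

# Part e: one-sided test of H0: beta1 = 60 against Ha: beta1 > 60
t_stat  <- (b1 - 60) / se
p_value <- pt(t_stat, df, lower.tail = FALSE)

print(ci_builtin)
print(round(c(t = t_stat, p = p_value), 4))
```

Because this uses the unrounded standard error, the interval endpoints may differ from the hand calculation in the second or third decimal place.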
4. Section 11.1.4, problem 2. The table below shows summary statistics for normally distributed measurements on 5 groups. The population variances are all equal. Construct an anova table and determine if there is a difference in the population means by calculating a p-value.

       N   Mean     var
grp1  10  52.40  243.38
grp2  21  55.00  142.00
grp3  16  36.25  246.73
grp4  20  53.65  173.82
grp5  18  47.50  267.91
$\bar{x}_{..} = \frac{10(52.4) + 21(55) + 16(36.25) + 20(53.65) + 18(47.5)}{10 + 21 + 16 + 20 + 18} = 49.25882$

$SSTr = 10(52.4 - 49.25882)^2 + 21(55 - 49.25882)^2 + 16(36.25 - 49.25882)^2 + 20(53.65 - 49.25882)^2 + 18(47.5 - 49.25882)^2 = 3939.856$

$SSE = 9(243.38) + 20(142) + 15(246.73) + 19(173.82) + 17(267.91) = 16588.42$
ANOVA TABLE
            DF   SS            MS         F
Treatment    4   3939.856      984.964    4.750128
Error       80   16588.42      207.3553
Total       84   20528.27588
p-value = 1 - pf(4.750128,4,80) = 0.0017. Reject the null hypothesis. At least one of the means is different.
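The ANOVA table can be rebuilt directly from the summary statistics. The sketch below assumes only the group sizes, means and variances from the table in the problem; the object names (grand, SSTr, F_stat, anova_table) are illustrative.

```r
# Summary statistics from the problem statement
n <- c(10, 21, 16, 20, 18)                       # group sizes
m <- c(52.40, 55.00, 36.25, 53.65, 47.50)        # group means
v <- c(243.38, 142.00, 246.73, 173.82, 267.91)   # group variances

k <- length(n)   # number of groups
N <- sum(n)      # total number of observations

grand <- sum(n * m) / N          # weighted grand mean
SSTr  <- sum(n * (m - grand)^2)  # treatment (between-group) sum of squares
SSE   <- sum((n - 1) * v)        # error (within-group) sum of squares

df_tr  <- k - 1
df_err <- N - k
df_tot <- N - 1                  # equals df_tr + df_err

MSTr   <- SSTr / df_tr
MSE    <- SSE / df_err
F_stat <- MSTr / MSE
p_value <- pf(F_stat, df_tr, df_err, lower.tail = FALSE)

anova_table <- data.frame(
  DF = c(df_tr, df_err, df_tot),
  SS = c(SSTr, SSE, SSTr + SSE),
  MS = c(MSTr, MSE, NA),
  F  = c(F_stat, NA, NA),
  row.names = c("Treatment", "Error", "Total")
)
print(anova_table)
print(p_value)
```

The total degrees of freedom are N - 1, which is also the sum of the treatment and error degrees of freedom; this gives a quick consistency check on the table.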
5. A study was conducted to examine the effect of pets in stressful situations. Fifteen subjects were randomly assigned to each of three groups to do a stressful task alone (the control group), with a good friend present, or with their dog present. The subject's mean heart rate (in beats per minute) during the task is one measure of the effect of stress. The data are the mean heart rates during stress with a pet (P), with a friend (F) and for the control group (C).
Control   Friend      Pet
 80.369   99.692   69.169
 87.446   83.400   70.169
 90.015  102.154   75.985
 99.046   80.277   86.446
 75.477   88.015   68.862
 87.231   92.492   64.169
 91.754   91.354   97.538
 87.785  100.877   85.000
 77.800  101.062   72.262
 62.646   81.600   58.692
 84.738   89.815   79.662
 84.877   98.200   69.231
 73.277   76.908   69.538
 84.523   86.985   70.077
 70.877   97.046   65.446
This data set, called "Stress", is on the homework and calendar website.
a. Make a side-by-side box plot of the heart rates by the three groups. To do this in R use: boxplot(Rate~Group,data=Stress)
   Does there seem to be a difference in the heart rates of the three groups? Do any of the groups show outliers or extreme skewness?
b. We want to test if there is a difference in the mean heart rates for the three groups. Give the null hypothesis of this test.
c. Does the data suggest that there is a difference among the three groups? Use $\alpha = 0.05$.
d. If there seems to be a difference, complete a Bonferroni pairwise test to determine which of the means, or whether all of them, differ from each other.
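One way to work through parts a, c and d in R is sketched below. It assumes the data frame is rebuilt by hand from the table above, with the column names Rate and Group taken from the boxplot() call in part a; the actual "Stress" file may be organized differently. The functions aov() and pairwise.t.test() come from the stats package.

```r
# Heart rates typed in from the table (15 subjects per group)
control <- c(80.369, 87.446, 90.015, 99.046, 75.477, 87.231, 91.754, 87.785,
             77.800, 62.646, 84.738, 84.877, 73.277, 84.523, 70.877)
friend  <- c(99.692, 83.400, 102.154, 80.277, 88.015, 92.492, 91.354, 100.877,
             101.062, 81.600, 89.815, 98.200, 76.908, 86.985, 97.046)
pet     <- c(69.169, 70.169, 75.985, 86.446, 68.862, 64.169, 97.538, 85.000,
             72.262, 58.692, 79.662, 69.231, 69.538, 70.077, 65.446)

# Assumed layout: one row per subject, columns Rate and Group
Stress <- data.frame(
  Rate  = c(control, friend, pet),
  Group = factor(rep(c("Control", "Friend", "Pet"), each = 15))
)

# Part a: side-by-side box plots
boxplot(Rate ~ Group, data = Stress)

# Group means, as a numerical companion to the plot
group_means <- tapply(Stress$Rate, Stress$Group, mean)
print(round(group_means, 3))

# Part c: one-way ANOVA of H0: all three population means are equal
fit <- aov(Rate ~ Group, data = Stress)
print(summary(fit))
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Part d: pairwise t tests with Bonferroni-adjusted p-values
pw <- pairwise.t.test(Stress$Rate, Stress$Group,
                      p.adjust.method = "bonferroni")
print(pw)
```

Comparing anova_p with 0.05 answers part c, and any adjusted p-value in pw below 0.05 flags a pair of groups whose means differ.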