AP Statistics - Chapter 3 Summary Sheet




CSSO: Use this acronym when asked to comment on the appearance of, or to describe, any distribution of single-variable data. It stands for Center – Shape – Spread – Outliers.

FSDD: Use this acronym when asked to comment on the appearance of, or to describe, any scatterplot of bivariate data. It stands for Form – Strength – Direction – Deviations.

When asked to explain the meaning of – or to interpret:

Slope (b) of the LSRL:

There is a predicted (increase/decrease) of (numerical slope value) in the (response variable, named with the correct units) for each one-unit increase in the (explanatory variable, named with the correct units).

Correlation Coefficient r:

There is a (very strong, strong, moderate, weak, very weak) (positive/negative) linear relationship between (state the name of the response variable in the context of the problem) and (state the name of the explanatory variable in the context of the problem).

Coefficient of determination $r^2$:

(Percentage arrived at on calculator) % of the VARIATION in the (state the name of the response variable in the context of the problem) can be explained by a linear relationship with the (state the name of the explanatory variable in the context of the problem).

(or)

(Percentage arrived at on calculator) % of the VARIATION in the (state the name of the response variable in the context of the problem) can be explained by the LSRL of (state the name of the response variable in the context of the problem) on (state the name of the explanatory variable in the context of the problem).
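The templates above can be filled in mechanically once the slope and r are known. The following is a minimal sketch, not a prescribed method: the data set (hours studied vs. exam score) and the helper name `correlation` are made up for illustration, and the slope uses the standard relationship b = r(s_y / s_x).

```python
from statistics import mean, stdev

# Hypothetical data: hours studied (explanatory) and exam score (response).
hours = [1, 2, 3, 4, 5, 6]
score = [52, 58, 61, 70, 74, 81]

def correlation(x, y):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    products = [((xi - mx) / sx) * ((yi - my) / sy) for xi, yi in zip(x, y)]
    return sum(products) / (len(x) - 1)

r = correlation(hours, score)
b = r * stdev(score) / stdev(hours)   # slope of the LSRL
a = mean(score) - b * mean(hours)     # y-intercept of the LSRL

# Fill in the slope template.
direction = "increase" if b > 0 else "decrease"
print(f"Slope: there is a predicted {direction} of {abs(b):.2f} points in exam "
      f"score for each one-hour increase in study time.")

# Fill in the r^2 template.
print(f"r^2: {100 * r**2:.1f}% of the variation in exam score can be explained "
      f"by a linear relationship with study time.")
```

Note that both sentences name the variables and units in context, which is what the templates ask for.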

Facts on Correlation:

1. Correlation makes no distinction between which variable you call x and which you call y. Switch the axes, and the r-value remains the same.

2. Correlation requires quantitative variables only. (Ex: There is no such thing as a correlation between race and crime, or between gender and IQ, because race and gender are categorical.)

3. Because r (based on standardized z-values) is itself unit-less, changing the units of measure will not alter the r-value. (Ex: height in inches vs. age in months, r = .963. Repeat the same data with height in cm vs. age in days, and r will still be .963.)

4. Positive r indicates a positive association (and a positively sloped LSRL); negative r indicates a negative association (and a negatively sloped LSRL).

5. $-1 \le r \le 1$, where r = 1 is a perfectly linear positive relationship and r = -1 is a perfectly linear negative relationship. An r of 0 indicates no linear association between the variables. As r moves away from 0 toward either extreme, the linear relationship becomes stronger.

6. Correlation is ONLY applicable to measuring the strength of a LINEAR relationship.

7. Similar to the mean and standard deviation, r is strongly influenced by individual extreme values (regression outliers and influential points). That is to say, r is a “non-resistant” measure.
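Several of the facts above can be checked numerically. This is a small sketch under stated assumptions: the age/height values are invented for illustration (they are not the data behind the r = .963 example), and `correlation` is a hypothetical helper written out from the z-score definition.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    products = [((xi - mx) / sx) * ((yi - my) / sy) for xi, yi in zip(x, y)]
    return sum(products) / (len(x) - 1)

# Hypothetical data: age in months and height in inches.
age_months = [18, 24, 30, 36, 42, 48]
height_in = [30.0, 33.5, 35.0, 37.5, 39.0, 40.5]
r = correlation(age_months, height_in)

# Fact 1: switching x and y leaves r unchanged.
r_swapped = correlation(height_in, age_months)

# Fact 3: changing units (inches to cm, months to days) leaves r unchanged.
height_cm = [h * 2.54 for h in height_in]
age_days = [m * 30.4 for m in age_months]   # assumed average days per month
r_units = correlation(age_days, height_cm)

# Fact 6: a perfect but non-linear (parabolic) pattern can still give r near 0.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [xi ** 2 for xi in x_curve]
r_curve = correlation(x_curve, y_curve)

# Fact 7: a single extreme point can change r drastically (non-resistant).
r_outlier = correlation(age_months + [60], height_in + [25.0])

print(f"r = {r:.3f}, swapped = {r_swapped:.3f}, new units = {r_units:.3f}")
print(f"parabola r = {r_curve:.3f}, with one outlier r = {r_outlier:.3f}")
```

The parabola case is a reminder to look at the scatterplot before trusting r.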

Facts on the LSRL:

1. The distinction between the explanatory and response variables is essential. Unlike with r, switching the roles of x and y produces a different LSRL.

2. There is a close relationship between the SLOPE of the LSRL and the correlation r. Specifically, $b = r \frac{s_y}{s_x}$. For fixed $s_x$ and $s_y$, the closer r is to 1 or -1, the steeper the line, so changes in x have a more noticeable impact on $\hat{y}$. When r is close to 0 (weak), the line is less steep (approaching horizontal), and even big changes in x do NOT produce big changes in $\hat{y}$.

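3. The LSRL always contains the point $(\bar{x}, \bar{y})$. So the LSRL $\hat{y} = a + bx$ can be described solely in terms of the descriptive statistics $\bar{x}$, $\bar{y}$, $s_x$, $s_y$, and $r$. Again from Fact 2: $b = r \frac{s_y}{s_x}$; and also $a = \bar{y} - b\bar{x}$.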

4. The square of the correlation r, i.e. $r^2$ (coefficient of determination), is the fraction (or percentage) of the variation in the values of y that is explained by the LSRL of y on x.

5. The final determination as to the validity, overall “correctness,” and strength of an LSRL model rests upon a visual examination of the RESIDUAL PLOT. Even when r is exceptionally close to 1 or -1, a pattern in the residual plot overrides the r-value. (See pp. 170-171; p. 800 for good plots for a good LSRL; and p. 801 for 2 ways the residuals can show a bad LSRL.)
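The LSRL facts above can be verified on a small data set. The sketch below is illustrative only: the data and the helper names `correlation` and `lsrl` are assumptions, and the line is built from the summary statistics using b = r(s_y / s_x) and a = ȳ - b·x̄.

```python
from statistics import mean, stdev

def correlation(x, y):
    """Sample correlation r: sum of z-score products divided by n - 1."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    products = [((xi - mx) / sx) * ((yi - my) / sy) for xi, yi in zip(x, y)]
    return sum(products) / (len(x) - 1)

def lsrl(x, y):
    """Return (a, b) for y-hat = a + b*x, built from summary statistics."""
    b = correlation(x, y) * stdev(y) / stdev(x)
    a = mean(y) - b * mean(x)
    return a, b

# Hypothetical data: x = hours studied, y = exam score.
x = [1, 2, 3, 4, 5, 6]
y = [52, 58, 61, 70, 74, 81]

r = correlation(x, y)
a, b = lsrl(x, y)

# Fact 1: regressing x on y gives a different line (its slope is not 1/b
# unless r is exactly 1 or -1).
a_swapped, b_swapped = lsrl(y, x)

# Fact 3: the line passes through (x-bar, y-bar).
y_hat_at_mean = a + b * mean(x)

# Fact 4: r^2 equals the fraction of variation in y explained by the line.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)          # unexplained variation
sst = sum((yi - mean(y)) ** 2 for yi in y)    # total variation in y
r_squared = 1 - sse / sst

# Fact 5: list the residuals; in practice, plot them against x and look
# for curves or fanning patterns.
for xi, e in zip(x, residuals):
    print(f"x = {xi}: residual = {e:+.2f}")
print(f"y-hat = {a:.2f} + {b:.2f}x, r^2 = {r_squared:.3f}")
```

A residual list with no obvious pattern (signs scattered, sizes roughly even) supports the linear model; a curved or fanning pattern does not, regardless of how high r is.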

What is left for you???

• Do random problems in 3.1, 3.2 and 3.3

• Can you read, interpret, and sift through extraneous information on computer-generated reports? (Ex: see p. 156, Figure 3.12; p. 189, problem #3.76; p. 787, problems 14.2 and 14.3; p. 795, problems #14.10 and 14.11.)

• 3.1 summary: pp. 134-135

• 3.2 summary: pp. 146-147

• 3.3 summary: p. 176

• Chapter review summary: pp. 181-183

• Review problems: pp. 183-190

• Practice Test A and B

• Examine solutions to the quizzes that you took. Solutions are posted in class.
