


ST361: Ch3 Bivariate Data Analysis

Overview/Review of Bivariate Data Analysis

• Bivariate data: two variables, X and Y, are involved. We usually denote

  - X as the ___________________ variable, aka the _______________ variable

  - Y as the ___________________ variable, aka the _______________ variable

• Bivariate data analysis: interested in the relationship between X and Y

• Overview of bivariate data analysis:

|Example 1 | | |Example 2 | | |Example 3 | | |
|Obs |Battery Brand |Lifetime (hr) |Obs |Study Time (hr) |Exam Score |Obs |Drug Type |Side Effect |
|1 |Duracell |4.2 |1 |6.5 |76 |1 |New |Y |
|2 |Eveready |5.1 |2 |7.3 |83 |2 |New |Y |
|3 |Eveready |3.9 |3 |9.5 |92 |3 |New |N |
|... | | |4 |7.1 |87 |... | | |
|N |Duracell |3.8 |5 |8.4 |93 |N |Old |N |
| | | |6 |8.0 |88 | | | |

| |Example 1 |Example 2 |Example 3 |
|Question of interest |Do the two brands have the same lifetime? |Is exam score related to study time? |Can new drug reduce side effect? |
|Independent variable X | | | |
|Dependent variable Y | | | |
|Type of Variables |X: |X: |X: categorical |
| |Y: |Y: |Y: categorical |
|Graphical Presentation |Side-by-side Boxplot |Scatter plot |Bar plot |
| |[pic] |[pic] |[pic] |
|Numerical Summary | | | |
|Statistical Inference |[pic] |Population regression line: [pic] |[pic] |
| | |Population correlation coefficient: [pic] | |

------------------------------------------------------------------------------------------------------------------

ST361: Ch3.2 Correlation Coefficient

Topics:

a) Definition

b) Interpretation

c) Calculation

------------------------------------------------------------------------------------------------------------------

a) Definition: The sample correlation coefficient r is a statistic that quantifies the ______________ and ______________ of the _____________________________ between 2 continuous variables X and Y

• The ____________ of r indicates the strength of the relationship:

The correlation coefficient r takes values in the range of _____________________

• The ____________ of r indicates the direction of the relationship between X and Y:

Ex.

[pic]

b) Interpretation:

(1) 0.8 < | r | < 1 : _______________ relationship between X and Y

(2) 0.5 < | r | ≤ 0.8 : _______________ relationship

(3) 0.0 < | r | ≤ 0.5 : _______________ relationship
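The three ranges above can be turned into a small lookup. The sketch below is only an illustration: the labels "strong", "moderate" and "weak" are assumed fillers for the blanks, the function name `describe_strength` is hypothetical, and treating | r | = 1 as "strong" and placing the boundary values 0.5 and 0.8 in the lower category are assumptions about how the handout handles its cut-offs.

```python
def describe_strength(r: float) -> str:
    """Return a verbal label for the strength of a linear relationship.

    The labels and the handling of the boundary values are assumptions,
    not taken from the handout.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # only the magnitude matters for strength
    if size > 0.8:
        return "strong"
    if size > 0.5:
        return "moderate"
    if size > 0.0:
        return "weak"
    return "no linear relationship"


for r in (0.93, -0.62, 0.5, 0.0):
    print(r, describe_strength(r))
```

Note that the sign of r is ignored here on purpose: -0.62 and 0.62 describe relationships of the same strength but opposite direction.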

Comments:

• The value of r ____________________________________________________________

• Meaning of r = 0:

[pic]
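One common way to illustrate the meaning of r = 0 is a data set where Y is completely determined by X but the pattern is curved rather than linear. The sketch below assumes the usual deviation-form formula for r and uses made-up data (Y = X squared); the helper name `sample_correlation` is hypothetical, and this may or may not match the figure in the original handout.

```python
from math import sqrt


def sample_correlation(x, y):
    """Sample correlation coefficient r in deviation form."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


# Made-up data: y depends perfectly on x, but not linearly.
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]

r = sample_correlation(x, y)
print(r)
```

Here r comes out as zero even though X and Y are clearly related, which suggests that r = 0 should be read as "no linear relationship" rather than "no relationship at all".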

c) Calculation

Q: if the definitions of X and Y are swapped, will the value of r change?
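For reference, one standard way to write the sample correlation coefficient is given below. The handout's own formula did not survive extraction, so the notation S_xy, S_xx, S_yy is an assumption and may differ from what was used in class.

```latex
r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}},
\qquad
S_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n},
\quad
S_{xx} = \sum x^{2} - \frac{\left(\sum x\right)^{2}}{n},
\quad
S_{yy} = \sum y^{2} - \frac{\left(\sum y\right)^{2}}{n}
```

In this form, swapping the roles of X and Y exchanges S_xx with S_yy and leaves S_xy unchanged, so the value of r stays the same.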

Ex. Study time vs. Exam score

[pic]

| |X |Y |XY |
|Obs |Study Time (hr) |Exam Score | |
|1 |6.5 |76 |494 |
|2 |7.3 |83 |605.9 |
|3 |9.5 |92 |874 |
|4 |7.1 |87 |617.7 |
|5 |8.4 |93 |781.2 |
|6 |8.0 |88 |704.0 |
| |[pic] |[pic] |ΣXY = 4076.8 |

Calculate the sample correlation coefficient r.
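The calculation for this example can be checked with a few lines of Python. This is a minimal sketch that assumes the usual computational (sums-based) formula r = S_xy / sqrt(S_xx S_yy); the variable names are illustrative and not taken from the handout.

```python
from math import sqrt

# Data from the study time vs. exam score table.
study_time = [6.5, 7.3, 9.5, 7.1, 8.4, 8.0]   # X, hours
exam_score = [76, 83, 92, 87, 93, 88]         # Y

n = len(study_time)
sum_x = sum(study_time)
sum_y = sum(exam_score)
sum_xy = sum(x * y for x, y in zip(study_time, exam_score))
sum_x2 = sum(x ** 2 for x in study_time)
sum_y2 = sum(y ** 2 for y in exam_score)

# Corrected sums of squares and cross-products.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n

r = s_xy / sqrt(s_xx * s_yy)

print(round(sum_xy, 1))  # 4076.8, matching the table total
print(round(r, 3))       # 0.851
```

Doing the same steps by hand with the column totals is a useful check on the arithmetic; small differences in the last decimal place usually come from rounding intermediate results.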

Ex. Body Mass Index vs. Blood Pressure

| |X |Y |XY |
|Obs |Body Mass Index |Systolic Blood Pressure | |
|1 |18 |120 |2160 |
|2 |20 |110 |2200 |
|3 |22 |120 |2640 |
|4 |25 |135 |3375 |
|5 |26 |140 |3640 |
|6 |29 |115 |3335 |
|7 |30 |150 |4500 |
|8 |33 |165 |5445 |
|9 |33 |160 |5280 |
|10 |35 |180 |6300 |

[pic], [pic], ΣXY = 38875

Calculate the sample correlation coefficient r.

[pic]
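As a cross-check for this example, r can also be computed as the average product of standardized values, r = Σ(z_x z_y) / (n - 1), which is algebraically equivalent to S_xy / sqrt(S_xx S_yy). The sketch below assumes that equivalent form and uses Python's standard `statistics` module; it is not necessarily the method intended in the handout, and the variable names are illustrative.

```python
from statistics import mean, stdev

# Data from the body mass index vs. blood pressure table.
bmi = [18, 20, 22, 25, 26, 29, 30, 33, 33, 35]                 # X
systolic = [120, 110, 120, 135, 140, 115, 150, 165, 160, 180]  # Y

n = len(bmi)
x_bar, y_bar = mean(bmi), mean(systolic)
s_x, s_y = stdev(bmi), stdev(systolic)   # sample standard deviations (n - 1)

# Standardize each observation, then average the products with n - 1.
z_x = [(x - x_bar) / s_x for x in bmi]
z_y = [(y - y_bar) / s_y for y in systolic]
r = sum(zx * zy for zx, zy in zip(z_x, z_y)) / (n - 1)

print(round(r, 3))  # 0.855
```

The standardized form makes it easy to see why r has no units: each z-value is a number of standard deviations, so rescaling X or Y (for example, changing measurement units) does not affect the result.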
