Statistical Analysis - 3

PEARSON R CORRELATION COEFFICIENT

Introduction:

Sometimes in scientific data, it appears that two variables are connected in such a way that when one variable changes, the other variable changes also. This connection is called a correlation. Examples of this type of correlation include: (1) in deer populations, large males seem to have more successful matings; and (2) larger numbers of birds seem to nest in areas with dense vegetation.

Scientists measure the strength of a relationship between two variables by calculating a correlation coefficient. The value of the correlation coefficient indicates to what extent the change found in one variable relates to change in another. There are several types of correlation coefficients, but the one that is most widely used is called the Pearson Product-Moment Correlation Coefficient, or simply, the Pearson r.

Student Procedure

Example: Your students have done some classroom research on amphibian species found in your area and have discovered that the red-backed salamander uses fallen logs and debris on the forest floor for its home. During their earlier census of their Biodiversity Plot, they noticed that some quadrats have many fallen logs whereas other quadrats have few or none. They expect to find more red-backed salamanders in the quadrats with many fallen logs and design an experiment to test this hypothesis. This experiment measures: (1) the number of fallen logs in each quadrat; and (2) the number of red-backed salamanders in each quadrat. This is a table of the data your class has collected:

Quadrat Number    # Fallen Logs    # Salamanders
1                 4                3
2                 6                2
3                 2                0
4                 3                1
5                 7                3
6                 1                0
7                 0                0
8                 2                0
9                 0                0
10                3                1
11                2                1
12                4                2
13                0                0
14                2                1
15                2                0
16                1                0
17                3                1
18                5                3
19                3                1
20                2                2
21                2                1
22                0                0
23                0                0
24                2                0
25                1                0

Step 1:

Graphing the data

Graph your data by computer, or by hand, by assigning the number of fallen logs (the second column) as the X-axis and the number of salamanders (the last column) as the Y-axis. For example, in Quadrat 1, the X-value would be 4 and the Y-value would be 3. The results, when we plot all 25 points on the graph, look like this:

[Figure: scatter plot titled "Relationship of Fallen Logs and Salamanders". X-axis: Number of Fallen Logs (0 to 8). Y-axis: Number of Salamanders (0 to 3.5).]
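For classes graphing by computer, the pairing of X- and Y-values described above can be sketched in Python. This is only a minimal illustration using the standard library: the function name `text_scatter` is hypothetical, and the "graph" is a plain text grid in which each cell shows how many quadrats share that point.

```python
from collections import Counter

# Data copied from the class table (25 quadrats).
logs = [4, 6, 2, 3, 7, 1, 0, 2, 0, 3, 2, 4, 0, 2, 2, 1, 3, 5, 3, 2, 2, 0, 0, 2, 1]
salamanders = [3, 2, 0, 1, 3, 0, 0, 0, 0, 1, 1, 2, 0, 1, 0, 0, 1, 3, 1, 2, 1, 0, 0, 0, 0]

# Pair each quadrat's X (logs) with its Y (salamanders) and count repeats,
# because several quadrats fall on the same (X, Y) point.
points = Counter(zip(logs, salamanders))


def text_scatter(points):
    """Return a text grid: rows are Y (largest at the top), columns are X."""
    max_x = max(x for x, _ in points)
    max_y = max(y for _, y in points)
    rows = []
    for y in range(max_y, -1, -1):
        cells = [str(points.get((x, y), ".")) for x in range(max_x + 1)]
        rows.append(f"{y} | " + " ".join(cells))
    # Bottom line labels the X-axis values.
    rows.append("    " + " ".join(str(x) for x in range(max_x + 1)))
    return "\n".join(rows)


print(text_scatter(points))
```

Quadrat 1, for example, appears in the column for X = 4 and the row for Y = 3. A spreadsheet or plotting program would show the same points as a conventional scatter plot.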

Looking at this graph, there seems to be a positive relationship between the number of fallen logs and the number of salamanders. In other words, it appears that when the number of fallen logs increases, the number of red-backed salamanders also increases.

Some things to remember about the Pearson r correlation:

• The weakest correlation that the Pearson r can show is r = 0.00. This means there is ZERO correlation, and would indicate that X and Y have no linear relationship to one another.

• The strongest correlation that the Pearson r can show is r = 1.00 (or r = -1.00 for a negative relationship). This indicates a PERFECT correlation and would indicate that X and Y are completely related to one another in the sample.

• Pearson r values can be either positive or negative, so r ranges from -1.00 to +1.00. A positive value indicates that increases in X correspond to increases in Y. A negative value indicates that increases in one variable are associated with decreases in the other variable.

The following graphs illustrate some of the various types of correlations possible:

a. This is an example of a perfect, positive correlation, in that the data shows no deviation from a straight line.

b. This is an example of a perfect, negative correlation, in that the data shows no deviation from a straight line.

c. This is an example of a high, positive correlation. Since the data shows some variability, a perfect prediction cannot be made.

d. This is an example of a high, positive correlation. Since the data shows some variability, a perfect prediction cannot be made.

e. This shows a low correlation. Although predictions could be made, and those predictions would be slightly better than chance, estimates would still be imprecise.

f. This figure shows a zero correlation. Prediction would be no better than chance.
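These cases can be checked numerically. The sketch below uses small made-up data sets (they are not from the class data) and a helper, hypothetically named `pearson_r`, that applies the raw-score formula given later in this handout.

```python
import math


def pearson_r(xs, ys):
    """Pearson r from paired values, using the raw-score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


xs = [1, 2, 3, 4, 5]  # made-up X values for illustration

perfect_positive = pearson_r(xs, [2, 4, 6, 8, 10])  # points on a rising line
perfect_negative = pearson_r(xs, [10, 8, 6, 4, 2])  # points on a falling line
high_positive = pearson_r(xs, [2, 1, 4, 3, 5])      # rising trend with scatter
zero = pearson_r(xs, [2, 1, 3, 1, 2])               # no overall trend

print(perfect_positive, perfect_negative, high_positive, zero)
```

The four results are 1.0, -1.0, 0.8, and 0.0, matching the perfect positive, perfect negative, high positive, and zero cases described above.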

Step 3:

Calculating the Pearson r Correlation Coefficient

The graph below was produced with the charting function of Microsoft Excel, which calculated a correlation coefficient from the data in our example. The graph shows a trend indicating an increase in salamanders where more fallen logs are present. Note, however, that the value calculated by this program is the Pearson r value squared (R²). You must take the square root of this figure to obtain the Pearson r value; because a square root is never negative, give r the sign of the trend line's slope (positive here). From the graph: R² = 0.72; Pearson r = 0.85. Because 0.85 is close to 1.0 (the maximum value for the Pearson r), this demonstrates a strong, positive correlation.

[Figure: scatter plot titled "Relationship of Fallen Logs and Salamanders" with trend line, annotated R² = 0.7175 and Pearson r = 0.85. X-axis: Number of Fallen Logs (0 to 8). Y-axis: Number of Salamanders (-0.5 to 3.5).]
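The conversion from the charted R² value to the Pearson r can be sketched in Python. The function name `r_from_r_squared` and its `slope` argument are assumptions made for illustration; the sign is taken from the trend line because the square root alone cannot tell a positive correlation from a negative one.

```python
import math


def r_from_r_squared(r_squared, slope):
    """Recover Pearson r from R squared and the slope of the trend line."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R squared must lie between 0 and 1")
    magnitude = math.sqrt(r_squared)
    # A falling trend line (negative slope) means a negative correlation.
    return magnitude if slope >= 0 else -magnitude


# R squared as printed on the chart; the trend line rises, so the slope is positive.
r = r_from_r_squared(0.7175, slope=1)
print(round(r, 2))
```

For the salamander data this gives 0.85, the value shown on the graph.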

If not using the Excel Software, or other graphing program, you can calculate the Pearson r by using the following formula:

FORMULA FOR CALCULATING THE PEARSON R CORRELATION COEFFICIENT

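$$
\text{Pearson } r = \frac{N\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N\left(\sum X^2\right) - \left(\sum X\right)^2\right]\left[N\left(\sum Y^2\right) - \left(\sum Y\right)^2\right]}}
$$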
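For readers who prefer to compute by program, the formula can be written as a short Python function. This is a sketch rather than a definitive implementation: the name `pearson_r` and the error checks are assumptions added here. The checks cover two cases the formula cannot handle: lists of unequal length, and a variable that never changes (which makes the denominator zero).

```python
import math


def pearson_r(xs, ys):
    """Pearson r from paired observations, using the raw-score formula."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must contain the same number of observations")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    x_term = n * sum_x2 - sum_x ** 2
    y_term = n * sum_y2 - sum_y ** 2
    if x_term == 0 or y_term == 0:
        # One variable is constant, so the denominator is zero and r is undefined.
        raise ValueError("r is undefined when X or Y is constant")
    return numerator / math.sqrt(x_term * y_term)


# Class data: fallen logs (X) and salamanders (Y) in the 25 quadrats.
logs = [4, 6, 2, 3, 7, 1, 0, 2, 0, 3, 2, 4, 0, 2, 2, 1, 3, 5, 3, 2, 2, 0, 0, 2, 1]
salamanders = [3, 2, 0, 1, 3, 0, 0, 0, 0, 1, 1, 2, 0, 1, 0, 0, 1, 3, 1, 2, 1, 0, 0, 0, 0]

r = pearson_r(logs, salamanders)
print(f"Pearson r = {r:.3f}")
```

With the class data the function returns a value of about 0.847, in line with the 0.85 reported by the charting program.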

This formula looks complicated, but can be simplified by breaking it into its separate components. Using your original data, create the following table:

Quadrat Number (N)    # Fallen Logs (X)    # Salamanders (Y)    X²     Y²     XY
1                     4                    3                    16     9      12
2                     6                    2                    36     4      12
3                     2                    0                    4      0      0
4                     3                    1                    9      1      3
5                     7                    3                    49     9      21
6                     1                    0                    1      0      0
7                     0                    0                    0      0      0
8                     2                    0                    4      0      0
9                     0                    0                    0      0      0
10                    3                    1                    9      1      3
11                    2                    1                    4      1      2
12                    4                    2                    16     4      8
13                    0                    0                    0      0      0
14                    2                    1                    4      1      2
15                    2                    0                    4      0      0
16                    1                    0                    1      0      0
17                    3                    1                    9      1      3
18                    5                    3                    25     9      15
19                    3                    1                    9      1      3
20                    2                    2                    4      4      4
21                    2                    1                    4      1      2
22                    0                    0                    0      0      0
23                    0                    0                    0      0      0
24                    2                    0                    4      0      0
25                    1                    0                    1      0      0

Totals: N = 25, ΣX = 57, ΣY = 22, ΣX² = 213, ΣY² = 46, ΣXY = 90
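The table columns and their totals can also be generated by program, which is a useful check on hand arithmetic. The sketch below is one possible way to do it in Python; the variable names (`rows`, `totals`) are illustrative only.

```python
# Class data: fallen logs (X) and salamanders (Y) in the 25 quadrats.
logs = [4, 6, 2, 3, 7, 1, 0, 2, 0, 3, 2, 4, 0, 2, 2, 1, 3, 5, 3, 2, 2, 0, 0, 2, 1]
salamanders = [3, 2, 0, 1, 3, 0, 0, 0, 0, 1, 1, 2, 0, 1, 0, 0, 1, 3, 1, 2, 1, 0, 0, 0, 0]

# One row per quadrat: (quadrat number, X, Y, X squared, Y squared, X times Y).
rows = [
    (quadrat, x, y, x * x, y * y, x * y)
    for quadrat, (x, y) in enumerate(zip(logs, salamanders), start=1)
]

# Column totals needed by the Pearson r formula.
totals = {
    "N": len(rows),
    "sum_x": sum(row[1] for row in rows),
    "sum_y": sum(row[2] for row in rows),
    "sum_x2": sum(row[3] for row in rows),
    "sum_y2": sum(row[4] for row in rows),
    "sum_xy": sum(row[5] for row in rows),
}

for row in rows:
    print(*row, sep="\t")
print(totals)
```

The totals come out as N = 25, ΣX = 57, ΣY = 22, ΣX² = 213, ΣY² = 46, and ΣXY = 90, the values used in the calculation that follows.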

Using the values from the new table, complete the Pearson r formula:

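$$
\text{Pearson } r = \frac{N\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{\sqrt{\left[N\left(\sum X^2\right) - \left(\sum X\right)^2\right]\left[N\left(\sum Y^2\right) - \left(\sum Y\right)^2\right]}}
$$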

The formula looks like this once we plug in all the numbers (N = 25, ΣXY = 90, ΣX = 57, ΣY = 22, ΣX² = 213, ΣY² = 46):

$$
\begin{aligned}
\text{Pearson } r &= \frac{(25)(90) - (57)(22)}{\sqrt{\left[25(213) - (57)^2\right]\left[25(46) - (22)^2\right]}} \\
&= \frac{2250 - 1254}{\sqrt{[5325 - 3249][1150 - 484]}} \\
&= \frac{996}{\sqrt{[2076][666]}} \\
&= \frac{996}{\sqrt{1382616}} \\
&= \frac{996}{1175.85} \\
&\approx 0.847
\end{aligned}
$$

Again, this Pearson r correlation coefficient (being close to the maximum value of 1.0) demonstrates a strong positive correlation between the number of fallen logs and the number of salamanders.
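The arithmetic can be retraced step by step in Python, starting from the column totals (N = 25, ΣX = 57, ΣY = 22, ΣX² = 213, ΣY² = 46, ΣXY = 90). The variable names are illustrative; the intermediate values are the ones that appear in the hand calculation.

```python
import math

# Column totals from the class data.
n, sum_x, sum_y = 25, 57, 22
sum_x2, sum_y2, sum_xy = 213, 46, 90

numerator = n * sum_xy - sum_x * sum_y   # 2250 - 1254 = 996
x_term = n * sum_x2 - sum_x ** 2         # 5325 - 3249 = 2076
y_term = n * sum_y2 - sum_y ** 2         # 1150 - 484 = 666
denominator = math.sqrt(x_term * y_term) # square root of 1382616, about 1175.85

r = numerator / denominator
print(round(r, 3))
```

Rounded to three decimal places, the result is 0.847.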

Step 4:

Determine if your calculations have statistical significance

You must determine whether or not your calculations have statistical significance. To do this you must determine the "critical value" for your Pearson r correlation coefficient by using the table of critical values below.

So for our example:

1. Calculate the degrees of freedom (DF) by subtracting 2 from the number of paired observations you are comparing (DF = N - 2). In our case, we sampled fallen logs and salamanders in 25 quadrats (N = 25), so DF = 25 - 2 = 23.

2. Find your DF in the table below and read off the critical value. DF = 23 is not listed; the nearest DF listed is 25, with a critical value of 0.3233. (The next lower listed DF, 20, has a critical value of 0.3598 and is the more cautious choice; it leads to the same conclusion here.) Our calculated Pearson r correlation coefficient is approximately 0.847.

3. The calculated figure is greater than the critical value from the table, so our findings have statistical significance. Therefore, the data support our hypothesis: there is a strong positive correlation between the number of fallen logs and the number of salamanders, and this correlation is unlikely to be due to chance alone.

CRITICAL VALUES FOR THE PEARSON R CORRELATION COEFFICIENT

DF (N - 2)    Critical Value (5% significance level)
1             .98769
2             .90000
3             .8054
4             .7293
5             .6694
6             .6215
7             .5822
8             .5494
9             .5214
10            .4973
11            .4762
12            .4575
13            .4409
14            .4259
15            .4124
16            .4000
17            .3887
18            .3783
19            .3687
20            .3598
25            .3233
30            .2960
35            .2746
40            .2573
45            .2428
50            .2306
60            .2108
70            .1954
80            .1829
90            .1726
100           .1638
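The significance check in Step 4 can be sketched in Python using the critical values from the table. Two choices here are assumptions of this sketch rather than part of the handout: when a DF is not listed, the code falls back to the next lower listed DF (a more cautious rule than picking the nearest listed DF), and it compares the absolute value of r so that negative correlations can be checked too. The function names are hypothetical.

```python
# Critical values copied from the table (5% level), keyed by degrees of freedom.
CRITICAL_VALUES = {
    1: 0.98769, 2: 0.90000, 3: 0.8054, 4: 0.7293, 5: 0.6694,
    6: 0.6215, 7: 0.5822, 8: 0.5494, 9: 0.5214, 10: 0.4973,
    11: 0.4762, 12: 0.4575, 13: 0.4409, 14: 0.4259, 15: 0.4124,
    16: 0.4000, 17: 0.3887, 18: 0.3783, 19: 0.3687, 20: 0.3598,
    25: 0.3233, 30: 0.2960, 35: 0.2746, 40: 0.2573, 45: 0.2428,
    50: 0.2306, 60: 0.2108, 70: 0.1954, 80: 0.1829, 90: 0.1726,
    100: 0.1638,
}


def critical_value(df):
    """Return (listed DF, critical value) for the largest listed DF not above df."""
    candidates = [listed for listed in CRITICAL_VALUES if listed <= df]
    if not candidates:
        raise ValueError("DF must be at least 1")
    chosen = max(candidates)
    return chosen, CRITICAL_VALUES[chosen]


def is_significant(r, n):
    """Compare |r| with the critical value for DF = N - 2."""
    _, critical = critical_value(n - 2)
    return abs(r) > critical


listed_df, critical = critical_value(25 - 2)
print(listed_df, critical, is_significant(0.847, 25))
```

For the salamander example (N = 25, DF = 23) the sketch uses the row for DF = 20, whose critical value is 0.3598. The calculated r of about 0.847 exceeds it, so the correlation is judged significant under either rule.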
