R3: Graphics and Visualization



Exploratory Factor Analysis

0. Introduction

In social sciences (e.g., psychology), it is often not possible to measure the variables of interest directly. Examples:

• Intelligence

• Social class

Such variables are called latent variables or common factors. Researchers examine such variables indirectly, by measuring variables that can be measured and that are believed to be indicators of the latent variables of interest. Examples:

• Examination scores on various tests

• Occupation, education, home ownership

Such variables are called observed variables.

Goal: study the relationship between the latent variables and the observed variables

1. Psychological Testing Data

data(Harman74.cor)

test.cor = Harman74.cor$cov[c(6, 7, 9, 10, 12),c(6, 7, 9, 10, 12)]

colnames(test.cor) = c("PARA","SENT","WORD","ADD","COUNT")

rownames(test.cor) = colnames(test.cor)

test.cor

PARA SENT WORD ADD COUNT

PARA 1.000 0.722 0.714 0.203 0.095

SENT 0.722 1.000 0.685 0.246 0.181

WORD 0.714 0.685 1.000 0.170 0.113

ADD 0.203 0.246 0.170 1.000 0.585

COUNT 0.095 0.181 0.113 0.585 1.000

image(1:5, 1:5, test.cor, zlim=c(-1,1), col=cm.colors(21))

# Plot of correlations - magenta = positive, cyan = negative

image(1:5, 1:5, test.cor, zlim=c(-1,1), col=grey(0:20/20) )

# Similar, with white = +1, black = -1

[pic][pic]

1b. Principal components analysis

test.pc = eigen(test.cor)# Principal components analysis

test.pc$values

[1] 2.5875 1.4217 0.4152 0.3111 0.2645

plot(test.pc$values,type="o", pch=16)

abline(h=1,col="grey")

[pic]
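The grey line at height 1 appears to mark the common "eigenvalue greater than 1" rule of thumb for choosing the number of components (the notes do not name the rule, so this reading is an assumption). A minimal sketch of the same information in numbers, using only the built-in data set; the names idx, vals, n_keep and prop_var are illustrative:

```r
# Eigenvalues of the 5 x 5 correlation matrix used above
data(Harman74.cor)
idx  <- c(6, 7, 9, 10, 12)
vals <- eigen(Harman74.cor$cov[idx, idx])$values

# Number of components whose eigenvalue lies above the grey line at 1
n_keep <- sum(vals > 1)
n_keep

# Share of the total variance per component and cumulatively;
# for a correlation matrix the eigenvalues sum to the number of variables
prop_var <- vals / sum(vals)
round(cbind(eigenvalue = vals,
            proportion = prop_var,
            cumulative = cumsum(prop_var)), 3)
```

With the eigenvalues printed above, two components lie above the line.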

test.pc$vectors[,1:2]

[,1] [,2]

[1,] -0.5345 -0.2449

[2,] -0.5424 -0.1641

[3,] -0.5234 -0.2470

[4,] -0.2971 0.6268

[5,] -0.2406 0.6776

test.loadings = test.pc$vectors %*% diag(sqrt(test.pc$values))

test.loadings

[,1] [,2] [,3] [,4] [,5]

[1,] -0.8597723 -0.2920337 -0.07368865 -0.055092837 0.40870847

[2,] -0.8725092 -0.1957119 0.03846385 -0.368390488 -0.25146294

[3,] -0.8419192 -0.2945670 0.09276139 0.411615628 -0.16238911

[4,] -0.4779188 0.7473420 -0.45579984 0.053423069 -0.04965974

[5,] -0.3869818 0.8079858 0.43809069 -0.008495525 0.07354173

test.loadings %*% t(test.loadings) # same as the correlation test.cor

[,1] [,2] [,3] [,4] [,5]

[1,] 1.000 0.722 0.714 0.203 0.095

[2,] 0.722 1.000 0.685 0.246 0.181

[3,] 0.714 0.685 1.000 0.170 0.113

[4,] 0.203 0.246 0.170 1.000 0.585

[5,] 0.095 0.181 0.113 0.585 1.000
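All five columns of loadings reproduce the correlation matrix exactly. For dimension reduction one keeps fewer columns, and the reproduction is then only approximate. The sketch below keeps two columns (the choice of two is an assumption taken from the scree plot) and shows what is left over; the names L2, communality and resid are illustrative:

```r
data(Harman74.cor)
idx <- c(6, 7, 9, 10, 12)
R   <- Harman74.cor$cov[idx, idx]

e <- eigen(R)
L <- e$vectors %*% diag(sqrt(e$values))   # all five columns of loadings

# Keep only the first two columns
L2          <- L[, 1:2]
communality <- rowSums(L2^2)     # variance of each test reproduced by 2 components
resid       <- R - L2 %*% t(L2)  # part of the correlations left unexplained

round(communality, 3)
round(resid, 3)
```

The diagonal of resid is one minus the communality, which plays a role similar to the uniquenesses in the factor model fitted next.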

1c. Exploratory factor analysis – two factors

test.fa2 = factanal(covmat = test.cor, factors=2, n.obs=145)

The R function factanal() assumes that X has a multivariate normal distribution and maximizes the log-likelihood function over the factor loading matrix and the uniquenesses to estimate the parameters (i.e., the ML estimates are obtained iteratively).

test.fa2

Call:

factanal(factors = 2, covmat = test.cor, n.obs = 145)

Uniquenesses:

PARA SENT WORD ADD COUNT

0.242 0.300 0.327 0.574 0.155

Loadings:

Factor1 Factor2

PARA 0.867

SENT 0.820 0.166

WORD 0.816

ADD 0.167 0.631

COUNT 0.918

Factor1 Factor2

SS loadings 2.119 1.282

Proportion Var 0.424 0.256

Cumulative Var 0.424 0.680

Test of the hypothesis that 2 factors are sufficient.

The chi square statistic is 0.58 on 1 degree of freedom.

The p-value is 0.446

# df(chi sq) = df(test.cor)-df(factor 1)–df(factor 2) = 10 – 5 – 4 = 1
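The chi-square statistic itself can be rebuilt from the fitted object. The sketch below assumes that factanal() multiplies the minimised ML objective by Bartlett's correction factor n - 1 - (2p + 5)/6 - 2c/3; this matches my reading of the function's source but should be treated as an assumption. The names mult, stat and pval are illustrative:

```r
data(Harman74.cor)
idx <- c(6, 7, 9, 10, 12)
fit <- factanal(covmat = Harman74.cor$cov[idx, idx], factors = 2, n.obs = 145)

p <- 5      # observed variables
k <- 2      # factors
n <- 145    # observations

# Bartlett-corrected multiplier (assumed form, see the text above)
mult <- n - 1 - (2 * p + 5) / 6 - 2 * k / 3
stat <- unname(mult * fit$criteria["objective"])
pval <- pchisq(stat, df = fit$dof, lower.tail = FALSE)

c(statistic = stat, df = fit$dof, p.value = pval)
```

If the assumption holds, stat and pval agree with fit$STATISTIC and fit$PVAL.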

apply(test.fa2$loadings^2,2,sum)

Factor1 Factor2

2.119212 1.281613

apply(test.fa2$loadings^2,1,sum)

PARA SENT WORD ADD COUNT

0.7575545 0.7002648 0.6727687 0.4256436 0.8445924

apply(test.fa2$loadings^2,1,sum)+test.fa2$uniqueness

PARA SENT WORD ADD COUNT

1.0000002 0.9999997 0.9999999 1.0000005 1.0000000

test.fa2$loadings%*%t(test.fa2$loadings)+diag(test.fa2$uniqueness)

# same as cor(X) after computational roundings

PARA SENT WORD ADD COUNT

PARA 1.00000018 0.7233842 0.7137061 0.1908447 0.09714164

SENT 0.72338424 0.9999997 0.6833973 0.2421572 0.18171652

WORD 0.71370606 0.6833973 0.9999999 0.1916676 0.10913971

ADD 0.19084473 0.2421572 0.1916676 1.0000005 0.58499501

COUNT 0.09714164 0.1817165 0.1091397 0.5849950 1.00000002

1d. One-factor model

test.fa1 = update(test.fa2, factors=1)

# shorthand to make minor changes to a model

test.fa1

Call:

factanal(factors = 1, covmat = test.cor, n.obs = 145)

Uniquenesses:

PARA SENT WORD ADD COUNT

0.258 0.294 0.328 0.933 0.970

Loadings:

Factor1

PARA 0.861

SENT 0.840

WORD 0.820

ADD 0.260

COUNT 0.173

Factor1

SS loadings 2.217

Proportion Var 0.443

Test of the hypothesis that 1 factor is sufficient.

The chi square statistic is 58.17 on 5 degrees of freedom.

The p-value is 2.91e-11

# df(chi sq) = df(test.cor)-df(factor 1) = 10 – 5 = 5

# small p-value; reject the one-factor model

Can you show that the chi-square df actually equals ½[(p-c)^2 - p - c], where p is the number of X variables and c is the number of factors chosen?
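The formula ½[(p-c)^2 - p - c] can at least be checked numerically against the degrees of freedom that factanal() reports in these notes. This is a check, not a proof; the function name fa_df and the argument name nf (standing for c) are illustrative:

```r
# Degrees of freedom of the chi-square test:
# p observed variables, nf common factors
fa_df <- function(p, nf) ((p - nf)^2 - p - nf) / 2

fa_df(5, 2)    # psychological tests, two factors:  1
fa_df(5, 1)    # psychological tests, one factor:   5
fa_df(6, 3)    # artificial data, three factors:    0
fa_df(8, 2)    # girls data, two factors:           13
fa_df(24, 5)   # full test battery, five factors:   166
```

A result of 0 means the model is just identified and no test is available; a negative result means there are too many factors for the number of variables.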

Often a sequential testing procedure is used: start with one factor and increase the number of factors one at a time until the test no longer rejects the null hypothesis. It can happen that the test always rejects the null hypothesis. This indicates that the model does not fit well, or that the sample size is so large that even minor departures from the model are detected. At other times the sequential chi-square test tends to overestimate the number of factors needed for a successful interpretation.

An alternative, more computationally intensive approach is to perform factor analyses with various values of c, complete with rotation, and choose the smallest c that gives the most appealing structure.

2. Artificial Data (From R factanal help page)

A little demonstration: v2 is just v1 with noise, and likewise v4 is v3 with noise and v6 is v5 with noise.

v1 = c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)

v2 = c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)

v3 = c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)

v4 = c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)

v5 = c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)

v6 = c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)

m1 = cbind(v1,v2,v3,v4,v5,v6)

pairs(m1)

pairs(m1+runif(6*18, -.3, .3)) # "jittering" to break ties

[pic] [pic]

2a. Principal components

m1.pc = prcomp(m1)

plot(m1.pc$sdev^2, type="o", pch=16)

abline(h=1,col="grey")

pairs(m1.pc$x[,1:3])

[pic][pic]

m1.pc$rotation[,1:3]

PC1 PC2 PC3

v1 0.4168 -0.52292 0.2354

v2 0.3886 -0.50888 0.2986

v3 0.4183 0.01522 -0.5555

v4 0.3944 0.02184 -0.5986

v5 0.4254 0.47017 0.2923

v6 0.4048 0.49581 0.3210
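Section 1 used eigen() on a correlation matrix, while this section uses prcomp() on the raw data. The sketch below shows how the two connect: with its default settings prcomp() centres the columns but does not rescale them, so it corresponds to an eigen-decomposition of the covariance matrix. The name m1.eig is illustrative:

```r
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1, v2, v3, v4, v5, v6)

m1.pc  <- prcomp(m1)        # centred, not rescaled
m1.eig <- eigen(cov(m1))    # eigen-decomposition of the covariance matrix

# Component variances equal the eigenvalues of the covariance matrix
all.equal(m1.pc$sdev^2, m1.eig$values)

# The first three loading vectors agree up to the sign of each column
all.equal(abs(unname(m1.pc$rotation[, 1:3])), abs(m1.eig$vectors[, 1:3]))
```

Because the analysis is on the covariance scale here, a reference line at 1 in the scree plot does not have the same meaning as it does for a correlation matrix; prcomp(m1, scale. = TRUE) would give the correlation-based version.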

2b. Factor analysis

m1.fa1 = factanal(m1, factors=1)

m1.fa1

Call:

factanal(x = m1, factors = 1)

Uniquenesses:

v1 v2 v3 v4 v5 v6

0.773 0.792 0.733 0.795 0.022 0.085

Loadings:

Factor1

v1 0.476

v2 0.456

v3 0.517

v4 0.453

v5 0.989

v6 0.956

Factor1

SS loadings 2.800

Proportion Var 0.467

Test of the hypothesis that 1 factor is sufficient.

The chi square statistic is 53.43 on 9 degrees of freedom.

The p-value is 2.43e-08

m1.fa2 = factanal(m1, factors=2, rotation="none")

m1.fa2

Call:

factanal(x = m1, factors = 2, rotation = "none")

Uniquenesses:

v1 v2 v3 v4 v5 v6

0.005 0.114 0.642 0.742 0.005 0.097

Loadings:

Factor1 Factor2

v1 0.853 -0.518

v2 0.804 -0.490

v3 0.598

v4 0.508

v5 0.857 0.510

v6 0.796 0.519

Factor1 Factor2

SS loadings 3.358 1.038

Proportion Var 0.560 0.173

Cumulative Var 0.560 0.733

Test of the hypothesis that 2 factors are sufficient.

The chi square statistic is 23.14 on 4 degrees of freedom.

The p-value is 0.000119

m1.fa3 = factanal(m1, factors=3, rotation="none")

m1.fa3

Call:

factanal(x = m1, factors = 3, rotation = "none")

Uniquenesses:

v1 v2 v3 v4 v5 v6

0.005 0.101 0.005 0.224 0.084 0.005

Loadings:

Factor1 Factor2 Factor3

v1 0.808 -0.385 0.440

v2 0.752 -0.290 0.500

v3 0.813 -0.229 -0.530

v4 0.729 -0.139 -0.474

v5 0.802 0.521

v6 0.764 0.636

Factor1 Factor2 Factor3

SS loadings 3.638 0.980 0.957

Proportion Var 0.606 0.163 0.159

Cumulative Var 0.606 0.770 0.929

The degrees of freedom for the model is 0 and the fit was 0.4755

m1.fa3a = factanal(m1, factors=3, rotation="varimax",

scores="regression") # default rotation

m1.fa3a # Note improved interpretation of loadings

Call:

factanal(x = m1, factors = 3, scores = "regression", rotation = "varimax")

Uniquenesses:

v1 v2 v3 v4 v5 v6

0.005 0.101 0.005 0.224 0.084 0.005

Loadings:

Factor1 Factor2 Factor3

v1 0.944 0.182 0.267

v2 0.905 0.235 0.159

v3 0.236 0.210 0.946

v4 0.180 0.242 0.828

v5 0.242 0.881 0.286

v6 0.193 0.959 0.196

Factor1 Factor2 Factor3

SS loadings 1.893 1.886 1.797

Proportion Var 0.316 0.314 0.300

Cumulative Var 0.316 0.630 0.929

The degrees of freedom for the model is 0 and the fit was 0.4755

pairs(m1.fa3a$scores)

[pic]
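The scores plotted above come from scores="regression". A common definition of regression (Thomson) scores is Z R^{-1} L, where Z holds the standardised observations, R is the sample correlation matrix and L is the rotated loading matrix. The sketch below applies that definition by hand; that factanal() computes its scores in exactly this way is an assumption, and the names Z, Lambda and manual are illustrative:

```r
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1, v2, v3, v4, v5, v6)

fit <- factanal(m1, factors = 3, scores = "regression")  # varimax by default

Z      <- scale(m1)                      # standardised observations (18 x 6)
Lambda <- unclass(loadings(fit))         # rotated loadings (6 x 3)
manual <- Z %*% solve(cor(m1), Lambda)   # Z R^{-1} Lambda, one row per observation

# Compare with the scores returned by factanal()
all.equal(unname(manual), unname(fit$scores))
```

Unlike principal component scores, these scores depend on the estimated loadings and therefore on the number of factors and the rotation chosen.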

3. Girls' physical measurements data

Correlation matrix of 8 physical measurements on 305 girls between ages 7 and 17

data(Harman23.cor)

girls.cor = Harman23.cor$cov

girls.cor

height arm.span forearm lower.leg weight bitro.diameter

height 1.000 0.846 0.805 0.859 0.473 0.398

arm.span 0.846 1.000 0.881 0.826 0.376 0.326

forearm 0.805 0.881 1.000 0.801 0.380 0.319

lower.leg 0.859 0.826 0.801 1.000 0.436 0.329

weight 0.473 0.376 0.380 0.436 1.000 0.762

bitro.diameter 0.398 0.326 0.319 0.329 0.762 1.000

chest.girth 0.301 0.277 0.237 0.327 0.730 0.583

chest.width 0.382 0.415 0.345 0.365 0.629 0.577

chest.girth chest.width

height 0.301 0.382

arm.span 0.277 0.415

forearm 0.237 0.345

lower.leg 0.327 0.365

weight 0.730 0.629

bitro.diameter 0.583 0.577

chest.girth 1.000 0.539

chest.width 0.539 1.000

image(1:8, 1:8, girls.cor, zlim=c(-1,1), col=cm.colors(21) )

girls.pc = eigen(girls.cor) # Principal components analysis

girls.pc$values

[1] 4.67288 1.77098 0.48104 0.42144 0.23322 0.18667 0.13730 0.09646

plot(girls.pc$values,type="o", pch=16)

abline(h=1,col="grey")

[pic][pic]

rownames(girls.pc$vectors) = colnames(girls.cor)

girls.pc$vectors[,1:2]

[,1] [,2]

height -0.3976 -0.2797

arm.span -0.3893 -0.3314

forearm -0.3762 -0.3446

lower.leg -0.3884 -0.2971

weight -0.3507 0.3942

bitro.diameter -0.3119 0.4007

chest.girth -0.2855 0.4359

chest.width -0.3102 0.3144

#Rotate PC’s by varimax()

girls.pc.loadings=girls.pc$vectors %*% diag(sqrt(girls.pc$values))

summary(as.vector(girls.pc.loadings %*% t(girls.pc.loadings) - girls.cor) )

Min. 1st Qu. Median Mean 3rd Qu. Max.

-1.443e-15 -7.772e-16 -5.551e-16 -6.098e-16 -4.441e-16 0.000e+00

varimax(girls.pc.loadings[,c(1:2)])

$loadings

Loadings:

[,1] [,2]

height -0.902 0.252

arm.span -0.932 0.187

forearm -0.920 0.156

lower.leg -0.901 0.222

weight -0.258 0.885

bitro.diameter -0.188 0.839

chest.girth -0.114 0.839

chest.width -0.257 0.747

[,1] [,2]

SS loadings 3.522 2.922

Proportion Var 0.440 0.365

Cumulative Var 0.440 0.805

$rotmat

[,1] [,2]

[1,] 0.7768362 -0.6297027

[2,] 0.6297027 0.7768362

Note that the rotation leaves the total variance explained by the first two components unchanged:

3.522+2.922

[1] 6.444

cumsum(girls.pc$values)

[1] 4.672880 6.443862 6.924898 7.346339 7.579560 7.766233 7.903537 8.000000
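The same comparison can be made without rounding. The sketch below repeats the varimax rotation of the first two principal component loadings and compares the total of the rotated SS loadings with the sum of the first two eigenvalues; the names L2, rot and ss_rotated are illustrative:

```r
data(Harman23.cor)
R <- Harman23.cor$cov
e <- eigen(R)

# Loadings of the first two principal components
L2 <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))

rot        <- varimax(L2)
ss_rotated <- colSums(unclass(rot$loadings)^2)   # SS loadings after rotation

ss_rotated             # variance is redistributed between the two columns ...
sum(ss_rotated)        # ... but the total is unchanged
sum(e$values[1:2])
```

An orthogonal rotation only redistributes the explained variance between the retained columns, which is why the individual SS loadings differ from the eigenvalues while their sum does not.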

girls.fa1 = factanal(covmat=girls.cor, factors=1, n.obs=305)

girls.fa1

Call:

factanal(factors = 1, covmat = girls.cor, n.obs = 305)

Uniquenesses:

height arm.span forearm lower.leg weight

0.158 0.135 0.190 0.187 0.760

bitro.diameter chest.girth chest.width

0.829 0.877 0.801

Loadings:

Factor1

height 0.918

arm.span 0.930

forearm 0.900

lower.leg 0.902

weight 0.490

bitro.diameter 0.413

chest.girth 0.351

chest.width 0.446

Factor1

SS loadings 4.064

Proportion Var 0.508

Test of the hypothesis that 1 factor is sufficient.

The chi square statistic is 611.4 on 20 degrees of freedom.

The p-value is 1.12e-116

girls.fa2 = factanal(covmat=girls.cor, factors=2, n.obs=305, rotation="none")

girls.fa2

Call:

factanal(factors = 2, covmat = girls.cor, n.obs = 305, rotation = "none")

Uniquenesses:

height arm.span forearm lower.leg weight

0.170 0.107 0.166 0.199 0.089

bitro.diameter chest.girth chest.width

0.364 0.416 0.537

Loadings:

Factor1 Factor2

height 0.880 -0.237

arm.span 0.874 -0.360

forearm 0.846 -0.344

lower.leg 0.855 -0.263

weight 0.705 0.644

bitro.diameter 0.589 0.538

chest.girth 0.526 0.554

chest.width 0.574 0.365

Factor1 Factor2

SS loadings 4.434 1.518

Proportion Var 0.554 0.190

Cumulative Var 0.554 0.744

Test of the hypothesis that 2 factors are sufficient.

The chi square statistic is 75.74 on 13 degrees of freedom.

The p-value is 6.94e-11

[pic]

girls.fa2a = factanal(covmat=girls.cor, factors=2, n.obs=305)

# Varimax rotation

girls.fa2a

Call:

factanal(factors = 2, covmat = girls.cor, n.obs = 305)

Uniquenesses:

height arm.span forearm lower.leg weight

0.170 0.107 0.166 0.199 0.089

bitro.diameter chest.girth chest.width

0.364 0.416 0.537

Loadings:

Factor1 Factor2

height 0.865 0.287

arm.span 0.927 0.181

forearm 0.895 0.179

lower.leg 0.859 0.252

weight 0.233 0.925

bitro.diameter 0.194 0.774

chest.girth 0.134 0.752

chest.width 0.278 0.621

Factor1 Factor2

SS loadings 3.335 2.617

Proportion Var 0.417 0.327

Cumulative Var 0.417 0.744

Test of the hypothesis that 2 factors are sufficient.

The chi square statistic is 75.74 on 13 degrees of freedom.

The p-value is 6.94e-11

arrows(0, 0, girls.fa2a$loadings[,1], girls.fa2a$loadings[,2],

col="red")

identify(girls.fa2a$loadings[,1], girls.fa2a$loadings[,2],

rownames(girls.fa2$loadings), col="red")

[pic]
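The arrows() and identify() calls above add to a plot that must already be open, and identify() needs an interactive session. A self-contained, non-interactive sketch of a similar loading plot follows; the axis limits, label placement and the name L are assumptions made for illustration, not the settings used for the figure:

```r
data(Harman23.cor)
fit <- factanal(covmat = Harman23.cor$cov, factors = 2, n.obs = 305)  # varimax
L   <- unclass(loadings(fit))

# Open an empty plot first; arrows() only draws on an existing plot
plot(L[, 1], L[, 2], type = "n", xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Factor 1", ylab = "Factor 2")
arrows(0, 0, L[, 1], L[, 2], col = "red", length = 0.1)

# Label every arrow instead of clicking on points with identify()
text(L[, 1], L[, 2], labels = rownames(L), pos = 3, cex = 0.8)
```

With the varimax loadings shown above, the four length measurements point along the first axis and the four weight and girth measurements along the second.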

girls.fa2b = factanal(covmat=girls.cor, factors=2, n.obs=305,

rotation="promax") # Promax rotation

girls.fa2b

Call:

factanal(factors = 2, covmat = girls.cor, n.obs = 305, rotation = "promax")

Uniquenesses:

height arm.span forearm lower.leg weight

0.170 0.107 0.166 0.199 0.089

bitro.diameter chest.girth chest.width

0.364 0.416 0.537

Loadings:

Factor1 Factor2

height 0.872

arm.span 0.973

forearm 0.938

lower.leg 0.876

weight 0.961

bitro.diameter 0.803

chest.girth 0.796

chest.width 0.125 0.611

Factor1 Factor2

SS loadings 3.375 2.589

Proportion Var 0.422 0.324

Cumulative Var 0.422 0.745

Test of the hypothesis that 2 factors are sufficient.

The chi square statistic is 75.74 on 13 degrees of freedom.

The p-value is 6.94e-11

arrows(0, 0, girls.fa2b$loadings[,1], girls.fa2b$loadings[,2], col="blue")

identify(girls.fa2b$loadings[,1], girls.fa2b$loadings[,2], rownames(girls.fa2$loadings), col="blue")

[pic]

girls.fa3 = factanal(covmat=girls.cor, factors=3, n.obs=305)

girls.fa3

Call:

factanal(factors = 3, covmat = girls.cor, n.obs = 305)

Uniquenesses:

height arm.span forearm lower.leg weight

0.127 0.005 0.193 0.157 0.090

bitro.diameter chest.girth chest.width

0.359 0.411 0.490

Loadings:

Factor1 Factor2 Factor3

height 0.886 0.267 -0.130

arm.span 0.937 0.195 0.280

forearm 0.874 0.188

lower.leg 0.877 0.230 -0.145

weight 0.242 0.916 -0.106

bitro.diameter 0.193 0.777

chest.girth 0.137 0.755

chest.width 0.261 0.646 0.159

Factor1 Factor2 Factor3

SS loadings 3.379 2.628 0.162

Proportion Var 0.422 0.329 0.020

Cumulative Var 0.422 0.751 0.771

Test of the hypothesis that 3 factors are sufficient.

The chi square statistic is 22.81 on 7 degrees of freedom.

The p-value is 0.00184

Note that although the three-factor model is rejected much less decisively than the two-factor model (p = 0.00184 versus 6.94e-11), the third factor contributes far less to the total variance than the first two do (a proportion of 0.020 versus 0.422 and 0.329).

3. Pain Reliever Perceptions Data (from book)

pain=read.table("")

colnames(pain) = c("No Upset Stomach", "No Side Effects",

"Stops Pain", "Works Quickly", "Keeps Me Awake",

"Limited Relief")

pain.pc = prcomp(pain, scale.=TRUE)

plot(pain.pc$sdev^2, type="o", pch=16)

abline(h=1,col="grey")

[pic]

pain.pc$rotation[,1:2]

PC1 PC2

No Upset Stomach 0.4316 -0.3595

No Side Effects 0.3808 -0.4442

Stops Pain 0.4536 0.3546

Works Quickly 0.3828 0.4407

Keeps Me Awake -0.3516 0.4699

Limited Relief -0.4392 -0.3642

pain.fa=factanal(pain, factors=2, scores="reg")

pain.fa

Call:

factanal(x = pain, factors = 2, scores = "reg")

Uniquenesses:

No Upset Stomach No Side Effects Stops Pain Works Quickly

0.434 0.344 0.346 0.365

Keeps Me Awake Limited Relief

0.365 0.392

Loadings:

Factor1 Factor2

No Upset Stomach 0.136 0.740

No Side Effects 0.810

Stops Pain 0.802 0.105

Works Quickly 0.795

Keeps Me Awake -0.796

Limited Relief -0.776

Factor1 Factor2

SS loadings 1.898 1.857

Proportion Var 0.316 0.309

Cumulative Var 0.316 0.626

Test of the hypothesis that 2 factors are sufficient.

The chi square statistic is 3.29 on 4 degrees of freedom.

The p-value is 0.511

plot(pain.fa$scores)

[pic]

4. Luxury Car Perceptions (from book)

source("readTri.txt")

colnames(car.cor) = c("Luxury", "Style", "Reliability", "Fuel Econ",

"Safety", "Maintenance", "Quality", "Durable", "Performance")

rownames(car.cor) = colnames(car.cor)

car.pc = eigen(car.cor)

car.pc$values

[1] 4.1640 1.5400 0.6857 0.5848 0.5152 0.4781 0.3736 0.3508 0.3077

plot(car.pc$values,type="o", pch=16)

abline(h=1,col="grey")

[pic]

factanal(covmat=car.cor, factors=2, n.obs=162)

Call:

factanal(factors = 2, covmat = car.cor, n.obs = 162)

Uniquenesses:

Luxury Style Reliability Fuel Econ Safety

0.164 0.546 0.440 0.625 0.573

Maintenance Quality Durable Performance

0.560 0.359 0.451 0.469

Loadings:

Factor1 Factor2

Luxury 0.914

Style 0.644 0.198

Reliability 0.387 0.640

Fuel Econ -0.101 0.604

Safety 0.620 0.204

Maintenance 0.175 0.640

Quality 0.454 0.659

Durable 0.335 0.661

Performance 0.588 0.430

Factor1 Factor2

SS loadings 2.491 2.322

Proportion Var 0.277 0.258

Cumulative Var 0.277 0.535

Test of the hypothesis that 2 factors are sufficient.

The chi square statistic is 19.6 on 19 degrees of freedom.

The p-value is 0.419

car.pc$vectors[,1:2]

[,1] [,2]

Luxury -0.3125 0.4937

Style -0.3198 0.3197

Reliability -0.3726 -0.1909

Fuel Econ -0.1894 -0.5748

Safety -0.3158 0.2930

Maintenance -0.3058 -0.3648

Quality -0.3968 -0.1092

Durable -0.3644 -0.1883

Performance -0.3768 0.1446

5. Full Psychological Test Battery

fulltest.cor = Harman74.cor$cov

image(1:24, 1:24, fulltest.cor, zlim=c(-1,1), col=cm.colors(21))

[pic] [pic]

fulltest.pc = eigen(fulltest.cor)# Principal components analysis

fulltest.pc$values

[1] 8.1354 2.0960 1.6926 1.5018 1.0252 0.9429 0.9012 0.8159 0.7902 0.7069

[11] 0.6394 0.5433 0.5330 0.5094 0.4775 0.3897 0.3820 0.3404 0.3338 0.3158

[21] 0.2972 0.2681 0.1897 0.1725

plot(fulltest.pc$values,type="o", pch=16)

abline(h=1,col="grey")

rownames(fulltest.pc$vectors) = rownames(fulltest.cor)

fulltest.pc$vectors[,1:4]

[,1] [,2] [,3] [,4]

VisualPerception -0.2159 -0.003764 -0.32875 0.16685

Cubes -0.1401 -0.054849 -0.30751 0.16446

PaperFormBoard -0.1559 -0.131808 -0.36614 0.08630

Flags -0.1790 -0.122936 -0.25729 0.17621

GeneralInformation -0.2436 -0.221950 0.25774 0.04309

PargraphComprehension -0.2421 -0.288461 0.20378 -0.06579

SentenceCompletion -0.2373 -0.293489 0.27319 0.05917

WordClassification -0.2434 -0.167723 0.11036 0.09493

WordMeaning -0.2434 -0.311224 0.22351 -0.06499

Addition -0.1662 0.374211 0.34305 0.16446

Code -0.2020 0.299629 0.16148 -0.02757

CountingDots -0.1691 0.379149 0.09735 0.27775

StraightCurvedCapitals -0.2167 0.192742 -0.02724 0.29855

WordRecognition -0.1570 0.063949 0.04238 -0.45320

NumberRecognition -0.1457 0.098312 -0.06024 -0.42914

FigureRecognition -0.1871 0.062819 -0.30098 -0.26697

ObjectNumber -0.1710 0.190302 0.04029 -0.38276

NumberFigure -0.1906 0.266841 -0.15243 -0.12392

FigureWord -0.1667 0.095217 -0.09373 -0.15777

Deduction -0.2254 -0.128716 -0.10154 -0.05720

NumericalPuzzles -0.2179 0.160439 -0.07670 0.16458

ProblemReasoning -0.2242 -0.100721 -0.08481 -0.04537

SeriesCompletion -0.2495 -0.072448 -0.11493 0.08402

ArithmeticProblems -0.2358 0.135235 0.17931 0.05043

fulltest.fa1 = factanal(covmat = fulltest.cor, factors=1, n.obs=145)

fulltest.fa1

Call:

factanal(factors = 1, covmat = fulltest.cor, n.obs = 145)

Uniquenesses:

VisualPerception Cubes PaperFormBoard

0.677 0.866 0.830

Flags GeneralInformation PargraphComprehension

0.768 0.487 0.491

SentenceCompletion WordClassification WordMeaning

0.500 0.514 0.474

Addition Code CountingDots

0.818 0.731 0.824

StraightCurvedCapitals WordRecognition NumberRecognition

0.681 0.833 0.863

FigureRecognition ObjectNumber NumberFigure

0.775 0.812 0.778

FigureWord Deduction NumericalPuzzles

0.816 0.612 0.676

ProblemReasoning SeriesCompletion ArithmeticProblems

0.619 0.524 0.593

Loadings:

Factor1

VisualPerception 0.569

Cubes 0.366

PaperFormBoard 0.412

Flags 0.482

GeneralInformation 0.716

PargraphComprehension 0.713

SentenceCompletion 0.707

WordClassification 0.697

WordMeaning 0.725

Addition 0.426

Code 0.519

CountingDots 0.419

StraightCurvedCapitals 0.565

WordRecognition 0.408

NumberRecognition 0.370

FigureRecognition 0.474

ObjectNumber 0.434

NumberFigure 0.471

FigureWord 0.429

Deduction 0.623

NumericalPuzzles 0.569

ProblemReasoning 0.617

SeriesCompletion 0.690

ArithmeticProblems 0.638

Factor1

SS loadings 7.438

Proportion Var 0.310

Test of the hypothesis that 1 factor is sufficient.

The chi square statistic is 622.9 on 252 degrees of freedom.

The p-value is 2.28e-33

update(fulltest.fa1, factors=2)

Test of the hypothesis that 2 factors are sufficient.

The chi square statistic is 420.2 on 229 degrees of freedom.

The p-value is 2.01e-13

update(fulltest.fa1, factors=3)

Test of the hypothesis that 3 factors are sufficient.

The chi square statistic is 295.6 on 207 degrees of freedom.

The p-value is 0.0000512

update(fulltest.fa1, factors=4)

Call:

factanal(factors = 4, covmat = fulltest.cor, n.obs = 145)

Uniquenesses:

VisualPerception Cubes PaperFormBoard

0.438 0.780 0.644

Flags GeneralInformation PargraphComprehension

0.651 0.352 0.312

SentenceCompletion WordClassification WordMeaning

0.283 0.485 0.257

Addition Code CountingDots

0.240 0.551 0.435

StraightCurvedCapitals WordRecognition NumberRecognition

0.491 0.646 0.696

FigureRecognition ObjectNumber NumberFigure

0.549 0.598 0.593

FigureWord Deduction NumericalPuzzles

0.762 0.592 0.583

ProblemReasoning SeriesCompletion ArithmeticProblems

0.601 0.497 0.500

Loadings:

Factor1 Factor2 Factor3 Factor4

VisualPerception 0.160 0.689 0.187 0.160

Cubes 0.117 0.436

PaperFormBoard 0.137 0.570 0.110

Flags 0.233 0.527

GeneralInformation 0.739 0.185 0.213 0.150

PargraphComprehension 0.767 0.205 0.233

SentenceCompletion 0.806 0.197 0.153

WordClassification 0.569 0.339 0.242 0.132

WordMeaning 0.806 0.201 0.227

Addition 0.167 -0.118 0.831 0.166

Code 0.180 0.120 0.512 0.374

CountingDots 0.210 0.716

StraightCurvedCapitals 0.188 0.438 0.525

WordRecognition 0.197 0.553

NumberRecognition 0.122 0.116 0.520

FigureRecognition 0.408 0.525

ObjectNumber 0.142 0.219 0.574

NumberFigure 0.293 0.336 0.456

FigureWord 0.148 0.239 0.161 0.365

Deduction 0.378 0.402 0.118 0.301

NumericalPuzzles 0.175 0.381 0.438 0.223

ProblemReasoning 0.366 0.399 0.123 0.301

SeriesCompletion 0.369 0.500 0.244 0.239

ArithmeticProblems 0.370 0.158 0.496 0.304

Factor1 Factor2 Factor3 Factor4

SS loadings 3.647 2.872 2.657 2.290

Proportion Var 0.152 0.120 0.111 0.095

Cumulative Var 0.152 0.272 0.382 0.478

Test of the hypothesis that 4 factors are sufficient.

The chi square statistic is 226.7 on 186 degrees of freedom.

The p-value is 0.0224

update(fulltest.fa1, factors=5)

Call:

factanal(factors = 5, covmat = fulltest.cor, n.obs = 145)

Uniquenesses:

VisualPerception Cubes PaperFormBoard

0.450 0.781 0.639

Flags GeneralInformation PargraphComprehension

0.649 0.357 0.288

SentenceCompletion WordClassification WordMeaning

0.277 0.485 0.262

Addition Code CountingDots

0.215 0.386 0.444

StraightCurvedCapitals WordRecognition NumberRecognition

0.256 0.639 0.706

FigureRecognition ObjectNumber NumberFigure

0.550 0.614 0.596

FigureWord Deduction NumericalPuzzles

0.764 0.521 0.564

ProblemReasoning SeriesCompletion ArithmeticProblems

0.580 0.442 0.478

Loadings:

Factor1 Factor2 Factor3 Factor4 Factor5

VisualPerception 0.161 0.658 0.136 0.182 0.199

Cubes 0.113 0.435 0.107

PaperFormBoard 0.135 0.562 0.107 0.116

Flags 0.231 0.533

GeneralInformation 0.736 0.188 0.192 0.162

PargraphComprehension 0.775 0.187 0.251 0.113

SentenceCompletion 0.809 0.208 0.136

WordClassification 0.568 0.348 0.223 0.131

WordMeaning 0.800 0.215 0.224

Addition 0.175 -0.100 0.844 0.176

Code 0.185 0.438 0.451 0.426

CountingDots 0.222 0.690 0.101 0.140

StraightCurvedCapitals 0.186 0.425 0.458 0.559

WordRecognition 0.197 0.557

NumberRecognition 0.121 0.130 0.508

FigureRecognition 0.400 0.529

ObjectNumber 0.145 0.208 0.562

NumberFigure 0.306 0.325 0.452

FigureWord 0.147 0.242 0.145 0.364

Deduction 0.370 0.452 0.139 0.287 -0.190

NumericalPuzzles 0.170 0.402 0.439 0.230

ProblemReasoning 0.358 0.423 0.126 0.302

SeriesCompletion 0.360 0.549 0.256 0.223 -0.107

ArithmeticProblems 0.371 0.185 0.502 0.307

Factor1 Factor2 Factor3 Factor4 Factor5

SS loadings 3.632 2.964 2.456 2.345 0.663

Proportion Var 0.151 0.124 0.102 0.098 0.028

Cumulative Var 0.151 0.275 0.377 0.475 0.503

Test of the hypothesis that 5 factors are sufficient.

The chi square statistic is 186.8 on 166 degrees of freedom.

The p-value is 0.128
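The sequence of update() calls above is the sequential testing procedure done by hand. It can be sketched as a loop that stops at the first model that is not rejected. The 5% level, the upper limit of six factors and the names alpha and n_factors are assumptions made for illustration:

```r
data(Harman74.cor)

alpha     <- 0.05   # significance level (assumed)
n_factors <- NA     # stays NA if every model up to the limit is rejected

for (k in 1:6) {
  fit <- factanal(covmat = Harman74.cor$cov, factors = k, n.obs = 145)
  cat(sprintf("%d factor(s): chi-square = %.1f, df = %.0f, p = %.3g\n",
              k, fit$STATISTIC, fit$dof, fit$PVAL))
  if (fit$PVAL > alpha) {   # first model that is not rejected
    n_factors <- k
    break
  }
}

n_factors
```

With the p-values reported above, the loop stops at five factors. As noted in section 1d, this criterion can overestimate the number of factors that are useful for interpretation; here the fifth factor accounts for under 3% of the variance.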

6. Factor analysis vs. PCA

Similarities

• Both methods are mostly used in EDA (exploratory data analysis).

• Both methods try to obtain dimension reduction: explain a data set in a smaller number of new variables.

• Both methods don’t work if the observed variables are almost uncorrelated:

o Then PCA returns components that are similar to the original variables.

o Then factor analysis has nothing to explain, i.e. the uniquenesses will all be close to 1.

• Both methods give similar results if the specific variances are small.

• If the specific variances are assumed to be zero in principal factor analysis, then PCA and factor analysis are the same.

• Neither PCA nor FA needs a normality assumption as such (but see the note on ML estimation under Differences).

Differences

• PCA requires virtually no assumptions.

Factor analysis assumes that the data come from a specific model structure. A normality assumption is needed in FA only for the ML estimates and the chi-square test; the principal factor estimation procedure does not require normality.

• In PCA the emphasis is on transforming the observed variables to principal components.

In factor analysis, emphasis is on the transformation from factors to observed variables.

• PCA is not scale invariant.

Factor analysis (with MLE) is scale invariant.

• In PCA, considering c + 1 instead of c components does not change the first c components.

In factor analysis, considering c + 1 instead of c factors may change the first c factors (when using MLE method).

• Calculation of PCA scores is straightforward.

Calculation of factor scores is more involved.
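The scale-invariance difference listed above can be illustrated with the artificial data from section 2. The sketch below changes the unit of one variable and refits both methods; the factor of 10, the choice of v1 and the names m1s, fa and fas are assumptions made for illustration:

```r
v1 <- c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 <- c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 <- c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 <- c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 <- c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 <- c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 <- cbind(v1, v2, v3, v4, v5, v6)

m1s <- m1
m1s[, "v1"] <- 10 * m1s[, "v1"]   # change the unit of v1 only

# PCA on the covariance scale: the first component changes
round(cbind(original = prcomp(m1)$rotation[, 1],
            rescaled = prcomp(m1s)$rotation[, 1]), 3)

# ML factor analysis: the uniquenesses are unaffected by the rescaling
fa  <- factanal(m1,  factors = 2)
fas <- factanal(m1s, factors = 2)
all.equal(fa$uniquenesses, fas$uniquenesses, tolerance = 1e-4)
```

After the rescaling, the first principal component is dominated by v1, whereas the factor analysis fit is unchanged up to numerical tolerance.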
