R3: Graphics and Visualization
Exploratory Factor Analysis
0. Introduction
In the social sciences (e.g., psychology), it is often not possible to measure the variables of interest directly. Examples:
• Intelligence
• Social class
Such variables are called latent variables or common factors. Researchers examine such variables indirectly, by measuring variables that can be measured and that are believed to be indicators of the latent variables of interest. Examples:
• Examination scores on various tests
• Occupation, education, home ownership
Such variables are called observed variables.
Goal: study the relationship between the latent variables and the observed variables
1. Psychological Testing Data
data(Harman74.cor)
test.cor = Harman74.cor$cov[c(6, 7, 9, 10, 12),c(6, 7, 9, 10, 12)]
colnames(test.cor) = c("PARA","SENT","WORD","ADD","COUNT")
rownames(test.cor) = colnames(test.cor)
test.cor
PARA SENT WORD ADD COUNT
PARA 1.000 0.722 0.714 0.203 0.095
SENT 0.722 1.000 0.685 0.246 0.181
WORD 0.714 0.685 1.000 0.170 0.113
ADD 0.203 0.246 0.170 1.000 0.585
COUNT 0.095 0.181 0.113 0.585 1.000
image(1:5, 1:5, test.cor, zlim=c(-1,1), col=cm.colors(21))
# Plot of correlations - magenta = positive, cyan = negative
image(1:5, 1:5, test.cor, zlim=c(-1,1), col=grey(0:20/20) )
# Similar, with white = +1, black = -1
[pic][pic]
1b. Principal components analysis
test.pc = eigen(test.cor)# Principal components analysis
test.pc$values
[1] 2.5875 1.4217 0.4152 0.3111 0.2645
plot(test.pc$values,type="o", pch=16)
abline(h=1,col="grey")
[pic]
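The grey reference line at 1 in the scree plot reflects a common rule of thumb: keep the components whose eigenvalue exceeds 1 (often called the Kaiser criterion). The following is a minimal sketch of that rule for the same five tests; the names idx, ev, prop.var, cum.var and n.keep are illustrative and not part of the original notes.

```r
# Rebuild the 5 x 5 correlation matrix used in this section
data(Harman74.cor)
idx <- c(6, 7, 9, 10, 12)
test.cor <- Harman74.cor$cov[idx, idx]

# The eigenvalues of a correlation matrix sum to the number of variables,
# so each eigenvalue divided by that sum is a proportion of total variance
ev <- eigen(test.cor)$values
prop.var <- ev / sum(ev)
cum.var <- cumsum(prop.var)

# Eigenvalue-greater-than-one rule: count the points above the grey line
n.keep <- sum(ev > 1)

round(rbind(eigenvalue = ev, proportion = prop.var, cumulative = cum.var), 3)
n.keep
```

With the eigenvalues printed above, two components lie above the line, which is consistent with the two-factor model fitted in the next subsection.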
test.pc$vectors[,1:2]
[,1] [,2]
[1,] -0.5345 -0.2449
[2,] -0.5424 -0.1641
[3,] -0.5234 -0.2470
[4,] -0.2971 0.6268
[5,] -0.2406 0.6776
test.loadings = test.pc$vectors %*% diag(sqrt(test.pc$values))
test.loadings
[,1] [,2] [,3] [,4] [,5]
[1,] -0.8597723 -0.2920337 -0.07368865 -0.055092837 0.40870847
[2,] -0.8725092 -0.1957119 0.03846385 -0.368390488 -0.25146294
[3,] -0.8419192 -0.2945670 0.09276139 0.411615628 -0.16238911
[4,] -0.4779188 0.7473420 -0.45579984 0.053423069 -0.04965974
[5,] -0.3869818 0.8079858 0.43809069 -0.008495525 0.07354173
test.loadings %*% t(test.loadings) # same as the correlation test.cor
[,1] [,2] [,3] [,4] [,5]
[1,] 1.000 0.722 0.714 0.203 0.095
[2,] 0.722 1.000 0.685 0.246 0.181
[3,] 0.714 0.685 1.000 0.170 0.113
[4,] 0.203 0.246 0.170 1.000 0.585
[5,] 0.095 0.181 0.113 0.585 1.000
1c. Exploratory factor analysis – two factors
test.fa2 = factanal(covmat = test.cor, factors=2, n.obs=145)
The R function factanal() assumes that X has a multivariate normal distribution and maximizes the log-likelihood over the factor loading matrix and the uniquenesses to estimate the parameters (i.e., the ML estimates are obtained iteratively).
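For reference, the model behind this fit is usually written as below. This is the standard orthogonal factor model as commonly stated; the symbols (Lambda for the p x c loading matrix, Psi for the diagonal matrix of uniquenesses, f for the common factors, epsilon for the specific errors) are notational assumptions that do not appear elsewhere in these notes.

```latex
x = \mu + \Lambda f + \varepsilon, \qquad
\operatorname{Cov}(f) = I_c, \quad
\operatorname{Cov}(\varepsilon) = \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p), \quad
\operatorname{Cov}(f, \varepsilon) = 0
\quad\Longrightarrow\quad
\Sigma = \operatorname{Cov}(x) = \Lambda \Lambda^{\top} + \Psi
```

Because factanal() works on the correlation scale, the diagonal of this identity says that, for each variable, the communality (row sum of squared loadings) plus the uniqueness equals 1.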
test.fa2
Call:
factanal(factors = 2, covmat = test.cor, n.obs = 145)
Uniquenesses:
PARA SENT WORD ADD COUNT
0.242 0.300 0.327 0.574 0.155
Loadings:
Factor1 Factor2
PARA 0.867
SENT 0.820 0.166
WORD 0.816
ADD 0.167 0.631
COUNT 0.918
Factor1 Factor2
SS loadings 2.119 1.282
Proportion Var 0.424 0.256
Cumulative Var 0.424 0.680
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 0.58 on 1 degree of freedom.
The p-value is 0.446
# df(chi sq) = df(test.cor)-df(factor 1)–df(factor 2) = 10 – 5 – 4 = 1
apply(test.fa2$loadings^2,2,sum)
Factor1 Factor2
2.119212 1.281613
apply(test.fa2$loadings^2,1,sum)
PARA SENT WORD ADD COUNT
0.7575545 0.7002648 0.6727687 0.4256436 0.8445924
apply(test.fa2$loadings^2,1,sum)+test.fa2$uniqueness
PARA SENT WORD ADD COUNT
1.0000002 0.9999997 0.9999999 1.0000005 1.0000000
test.fa2$loadings%*%t(test.fa2$loadings)+diag(test.fa2$uniqueness)
# same as cor(X) after computational roundings
PARA SENT WORD ADD COUNT
PARA 1.00000018 0.7233842 0.7137061 0.1908447 0.09714164
SENT 0.72338424 0.9999997 0.6833973 0.2421572 0.18171652
WORD 0.71370606 0.6833973 0.9999999 0.1916676 0.10913971
ADD 0.19084473 0.2421572 0.1916676 1.0000005 0.58499501
COUNT 0.09714164 0.1817165 0.1091397 0.5849950 1.00000002
1d. One-factor model
test.fa1 = update(test.fa2, factors=1)
# shorthand to make minor changes to a model
test.fa1
Call:
factanal(factors = 1, covmat = test.cor, n.obs = 145)
Uniquenesses:
PARA SENT WORD ADD COUNT
0.258 0.294 0.328 0.933 0.970
Loadings:
Factor1
PARA 0.861
SENT 0.840
WORD 0.820
ADD 0.260
COUNT 0.173
Factor1
SS loadings 2.217
Proportion Var 0.443
Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 58.17 on 5 degrees of freedom.
The p-value is 2.91e-11
# df(chi sq) = df(test.cor)-df(factor 1) = 10 – 5 = 5
# small p-value; reject the one-factor model
Can you show that the chi-square df actually equals ½[(p-c)² - p - c], where p is the number of X variables and c is the number of factors chosen?
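The degrees-of-freedom formula in the question above, read as half of (p - c)^2 - p - c, can be checked numerically against the values that factanal() reports in these notes. The sketch below is only an illustration: the function names fa.df and fa.df.count are hypothetical, and the second function uses the same counting as the comments above (number of distinct correlations minus number of free loadings).

```r
# Closed form: df for testing c factors on p observed variables
fa.df <- function(p, c) ((p - c)^2 - (p + c)) / 2

# Counting argument: p(p-1)/2 distinct correlations, minus the free loadings,
# which number p*c less c(c-1)/2 constraints that remove rotational freedom
fa.df.count <- function(p, c) p * (p - 1) / 2 - (p * c - c * (c - 1) / 2)

fa.df(5, 1)    # 5   one-factor model, five psychological tests
fa.df(5, 2)    # 1   two-factor model, same data
fa.df(8, 2)    # 13  girls' physical measurements
fa.df(24, 5)   # 166 full test battery

# The two expressions agree for every feasible combination tried here
grid <- expand.grid(p = 3:24, c = 1:5)
all(fa.df(grid$p, grid$c) == fa.df.count(grid$p, grid$c))
```

Expanding the counting expression algebraically gives the closed form, which is one way to answer the question.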
Often a sequential testing procedure is used: start with one factor and increase the number of factors one at a time until the test no longer rejects the null hypothesis. It can happen that the test always rejects the null hypothesis. This indicates either that the model does not fit well or that the sample size is so large that even minor departures from the model are detected. At other times the sequential chi-square test tends to overestimate the number of factors needed for a successful interpretation.
An alternative, utility-based approach, which is more computationally intensive, is to perform factor analyses with various values of c, complete with rotation, and choose the smallest c that gives the most appealing structure.
2. Artificial Data (From R factanal help page)
A little demonstration: v2 is just v1 with noise, and likewise v4 is v3 with noise and v6 is v5 with noise.
v1 = c(1,1,1,1,1,1,1,1,1,1,3,3,3,3,3,4,5,6)
v2 = c(1,2,1,1,1,1,2,1,2,1,3,4,3,3,3,4,6,5)
v3 = c(3,3,3,3,3,1,1,1,1,1,1,1,1,1,1,5,4,6)
v4 = c(3,3,4,3,3,1,1,2,1,1,1,1,2,1,1,5,6,4)
v5 = c(1,1,1,1,1,3,3,3,3,3,1,1,1,1,1,6,4,5)
v6 = c(1,1,1,2,1,3,3,3,4,3,1,1,1,2,1,6,5,4)
m1 = cbind(v1,v2,v3,v4,v5,v6)
pairs(m1)
pairs(m1+runif(6*18, -.3, .3)) # "jittering" to break ties
[pic] [pic]
2a. Principal components
m1.pc = prcomp(m1)
plot(m1.pc$sdev^2, type="o", pch=16)
abline(h=1,col="grey")
pairs(m1.pc$x[,1:3])
[pic][pic]
m1.pc$rotation[,1:3]
PC1 PC2 PC3
v1 0.4168 -0.52292 0.2354
v2 0.3886 -0.50888 0.2986
v3 0.4183 0.01522 -0.5555
v4 0.3944 0.02184 -0.5986
v5 0.4254 0.47017 0.2923
v6 0.4048 0.49581 0.3210
2b. Factor analysis
m1.fa1 = factanal(m1, factors=1)
m1.fa1
Call:
factanal(x = m1, factors = 1)
Uniquenesses:
v1 v2 v3 v4 v5 v6
0.773 0.792 0.733 0.795 0.022 0.085
Loadings:
Factor1
v1 0.476
v2 0.456
v3 0.517
v4 0.453
v5 0.989
v6 0.956
Factor1
SS loadings 2.800
Proportion Var 0.467
Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 53.43 on 9 degrees of freedom.
The p-value is 2.43e-08
m1.fa2 = factanal(m1, factors=2, rotation="none")
m1.fa2
Call:
factanal(x = m1, factors = 2, rotation = "none")
Uniquenesses:
v1 v2 v3 v4 v5 v6
0.005 0.114 0.642 0.742 0.005 0.097
Loadings:
Factor1 Factor2
v1 0.853 -0.518
v2 0.804 -0.490
v3 0.598
v4 0.508
v5 0.857 0.510
v6 0.796 0.519
Factor1 Factor2
SS loadings 3.358 1.038
Proportion Var 0.560 0.173
Cumulative Var 0.560 0.733
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 23.14 on 4 degrees of freedom.
The p-value is 0.000119
m1.fa3 = factanal(m1, factors=3, rotation="none")
m1.fa3
Call:
factanal(x = m1, factors = 3, rotation = "none")
Uniquenesses:
v1 v2 v3 v4 v5 v6
0.005 0.101 0.005 0.224 0.084 0.005
Loadings:
Factor1 Factor2 Factor3
v1 0.808 -0.385 0.440
v2 0.752 -0.290 0.500
v3 0.813 -0.229 -0.530
v4 0.729 -0.139 -0.474
v5 0.802 0.521
v6 0.764 0.636
Factor1 Factor2 Factor3
SS loadings 3.638 0.980 0.957
Proportion Var 0.606 0.163 0.159
Cumulative Var 0.606 0.770 0.929
The degrees of freedom for the model is 0 and the fit was 0.4755
m1.fa3a = factanal(m1, factors=3, rotation="varimax",
scores="regression") # default rotation
m1.fa3a # Note improved interpretation of loadings
Call:
factanal(x = m1, factors = 3, scores = "regression", rotation = "varimax")
Uniquenesses:
v1 v2 v3 v4 v5 v6
0.005 0.101 0.005 0.224 0.084 0.005
Loadings:
Factor1 Factor2 Factor3
v1 0.944 0.182 0.267
v2 0.905 0.235 0.159
v3 0.236 0.210 0.946
v4 0.180 0.242 0.828
v5 0.242 0.881 0.286
v6 0.193 0.959 0.196
Factor1 Factor2 Factor3
SS loadings 1.893 1.886 1.797
Proportion Var 0.316 0.314 0.300
Cumulative Var 0.316 0.630 0.929
The degrees of freedom for the model is 0 and the fit was 0.4755
pairs(m1.fa3a$scores)
[pic]
3. Girls' physical measurements data
Correlation matrix of 8 physical measurements on 305 girls between ages 7 and 17
data(Harman23.cor)
girls.cor = Harman23.cor$cov
girls.cor
height arm.span forearm lower.leg weight bitro.diameter
height 1.000 0.846 0.805 0.859 0.473 0.398
arm.span 0.846 1.000 0.881 0.826 0.376 0.326
forearm 0.805 0.881 1.000 0.801 0.380 0.319
lower.leg 0.859 0.826 0.801 1.000 0.436 0.329
weight 0.473 0.376 0.380 0.436 1.000 0.762
bitro.diameter 0.398 0.326 0.319 0.329 0.762 1.000
chest.girth 0.301 0.277 0.237 0.327 0.730 0.583
chest.width 0.382 0.415 0.345 0.365 0.629 0.577
chest.girth chest.width
height 0.301 0.382
arm.span 0.277 0.415
forearm 0.237 0.345
lower.leg 0.327 0.365
weight 0.730 0.629
bitro.diameter 0.583 0.577
chest.girth 1.000 0.539
chest.width 0.539 1.000
image(1:8, 1:8, girls.cor, zlim=c(-1,1), col=cm.colors(21) )
girls.pc = eigen(girls.cor) # Principal components analysis
girls.pc$values
[1] 4.67288 1.77098 0.48104 0.42144 0.23322 0.18667 0.13730 0.09646
plot(girls.pc$values,type="o", pch=16)
abline(h=1,col="grey")
[pic][pic]
rownames(girls.pc$vectors) = colnames(girls.cor)
girls.pc$vectors[,1:2]
[,1] [,2]
height -0.3976 -0.2797
arm.span -0.3893 -0.3314
forearm -0.3762 -0.3446
lower.leg -0.3884 -0.2971
weight -0.3507 0.3942
bitro.diameter -0.3119 0.4007
chest.girth -0.2855 0.4359
chest.width -0.3102 0.3144
# Rotate PCs by varimax()
girls.pc.loadings=girls.pc$vectors %*% diag(sqrt(girls.pc$values))
summary(as.vector(girls.pc.loadings %*% t(girls.pc.loadings) - girls.cor) )
Min. 1st Qu. Median Mean 3rd Qu. Max.
-1.443e-15 -7.772e-16 -5.551e-16 -6.098e-16 -4.441e-16 0.000e+00
varimax(girls.pc.loadings[,c(1:2)])
$loadings
Loadings:
[,1] [,2]
height -0.902 0.252
arm.span -0.932 0.187
forearm -0.920 0.156
lower.leg -0.901 0.222
weight -0.258 0.885
bitro.diameter -0.188 0.839
chest.girth -0.114 0.839
chest.width -0.257 0.747
[,1] [,2]
SS loadings 3.522 2.922
Proportion Var 0.440 0.365
Cumulative Var 0.440 0.805
$rotmat
[,1] [,2]
[1,] 0.7768362 -0.6297027
[2,] 0.6297027 0.7768362
Note that the SS loadings of the two rotated components sum to the same total as the first two eigenvalues:
3.522+2.922
[1] 6.444
cumsum(girls.pc$values)
[1] 4.672880 6.443862 6.924898 7.346339 7.579560 7.766233 7.903537 8.000000
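The totals compared above (3.522 + 2.922 against the second cumulative eigenvalue) agree because varimax() applies an orthogonal rotation, which redistributes variance between the columns without changing the overall amount. The following sketch checks this directly; the names L2, vm and rotated are illustrative.

```r
data(Harman23.cor)
girls.cor <- Harman23.cor$cov

# Principal component loadings for the first two components
ev <- eigen(girls.cor)
L2 <- ev$vectors[, 1:2] %*% diag(sqrt(ev$values[1:2]))
rownames(L2) <- colnames(girls.cor)

vm <- varimax(L2)
rotated <- unclass(vm$loadings)   # plain matrix of rotated loadings

# The rotation matrix is orthogonal: its cross-product is the identity
round(crossprod(vm$rotmat), 10)

# Row sums of squares (communalities) are unchanged by the rotation
cbind(before = rowSums(L2^2), after = rowSums(rotated^2))

# Column sums of squares are redistributed, but their total is preserved
colSums(L2^2)        # the first two eigenvalues
colSums(rotated^2)   # the rotated "SS loadings"
c(sum(rotated^2), sum(ev$values[1:2]))
```

The same invariance explains why the uniquenesses reported by factanal() are identical for the unrotated, varimax and promax fits later in this section.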
girls.fa1 = factanal(covmat=girls.cor, factors=1, n.obs=305)
girls.fa1
Call:
factanal(factors = 1, covmat = girls.cor, n.obs = 305)
Uniquenesses:
height arm.span forearm lower.leg weight
0.158 0.135 0.190 0.187 0.760
bitro.diameter chest.girth chest.width
0.829 0.877 0.801
Loadings:
Factor1
height 0.918
arm.span 0.930
forearm 0.900
lower.leg 0.902
weight 0.490
bitro.diameter 0.413
chest.girth 0.351
chest.width 0.446
Factor1
SS loadings 4.064
Proportion Var 0.508
Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 611.4 on 20 degrees of freedom.
The p-value is 1.12e-116
girls.fa2 = factanal(covmat=girls.cor, factors=2, n.obs=305, rotation="none")
girls.fa2
Call:
factanal(factors = 2, covmat = girls.cor, n.obs = 305, rotation = "none")
Uniquenesses:
height arm.span forearm lower.leg weight
0.170 0.107 0.166 0.199 0.089
bitro.diameter chest.girth chest.width
0.364 0.416 0.537
Loadings:
Factor1 Factor2
height 0.880 -0.237
arm.span 0.874 -0.360
forearm 0.846 -0.344
lower.leg 0.855 -0.263
weight 0.705 0.644
bitro.diameter 0.589 0.538
chest.girth 0.526 0.554
chest.width 0.574 0.365
Factor1 Factor2
SS loadings 4.434 1.518
Proportion Var 0.554 0.190
Cumulative Var 0.554 0.744
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 75.74 on 13 degrees of freedom.
The p-value is 6.94e-11
[pic]
girls.fa2a = factanal(covmat=girls.cor, factors=2, n.obs=305)
# Varimax rotation
girls.fa2a
Call:
factanal(factors = 2, covmat = girls.cor, n.obs = 305)
Uniquenesses:
height arm.span forearm lower.leg weight
0.170 0.107 0.166 0.199 0.089
bitro.diameter chest.girth chest.width
0.364 0.416 0.537
Loadings:
Factor1 Factor2
height 0.865 0.287
arm.span 0.927 0.181
forearm 0.895 0.179
lower.leg 0.859 0.252
weight 0.233 0.925
bitro.diameter 0.194 0.774
chest.girth 0.134 0.752
chest.width 0.278 0.621
Factor1 Factor2
SS loadings 3.335 2.617
Proportion Var 0.417 0.327
Cumulative Var 0.417 0.744
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 75.74 on 13 degrees of freedom.
The p-value is 6.94e-11
# arrows() and identify() add to an existing plot, so a loadings plot must already be open
arrows(0, 0, girls.fa2a$loadings[,1], girls.fa2a$loadings[,2],
col="red")
identify(girls.fa2a$loadings[,1], girls.fa2a$loadings[,2],
rownames(girls.fa2$loadings), col="red")
[pic]
girls.fa2b = factanal(covmat=girls.cor, factors=2, n.obs=305,
rotation="promax") # Promax rotation
girls.fa2b
Call:
factanal(factors = 2, covmat = girls.cor, n.obs = 305, rotation = "promax")
Uniquenesses:
height arm.span forearm lower.leg weight
0.170 0.107 0.166 0.199 0.089
bitro.diameter chest.girth chest.width
0.364 0.416 0.537
Loadings:
Factor1 Factor2
height 0.872
arm.span 0.973
forearm 0.938
lower.leg 0.876
weight 0.961
bitro.diameter 0.803
chest.girth 0.796
chest.width 0.125 0.611
Factor1 Factor2
SS loadings 3.375 2.589
Proportion Var 0.422 0.324
Cumulative Var 0.422 0.745
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 75.74 on 13 degrees of freedom.
The p-value is 6.94e-11
arrows(0, 0, girls.fa2b$loadings[,1], girls.fa2b$loadings[,2], col="blue")
identify(girls.fa2b$loadings[,1], girls.fa2b$loadings[,2], rownames(girls.fa2$loadings), col="blue")
[pic]
girls.fa3 = factanal(covmat=girls.cor, factors=3, n.obs=305)
girls.fa3
Call:
factanal(factors = 3, covmat = girls.cor, n.obs = 305)
Uniquenesses:
height arm.span forearm lower.leg weight
0.127 0.005 0.193 0.157 0.090
bitro.diameter chest.girth chest.width
0.359 0.411 0.490
Loadings:
Factor1 Factor2 Factor3
height 0.886 0.267 -0.130
arm.span 0.937 0.195 0.280
forearm 0.874 0.188
lower.leg 0.877 0.230 -0.145
weight 0.242 0.916 -0.106
bitro.diameter 0.193 0.777
chest.girth 0.137 0.755
chest.width 0.261 0.646 0.159
Factor1 Factor2 Factor3
SS loadings 3.379 2.628 0.162
Proportion Var 0.422 0.329 0.020
Cumulative Var 0.422 0.751 0.771
Test of the hypothesis that 3 factors are sufficient.
The chi square statistic is 22.81 on 7 degrees of freedom.
The p-value is 0.00184
Note that although the three-factor model fits much better than the two-factor model (p-value 0.00184 versus 6.94e-11), it is still rejected at the 5% level, and the third factor contributes far less to the total variance (2%) than the first two do (42% and 33%).
3. Pain Reliever Perceptions Data (from book)
pain=read.table("")
colnames(pain) = c("No Upset Stomach", "No Side Effects",
"Stops Pain", "Works Quickly", "Keeps Me Awake",
"Limited Relief")
pain.pc = prcomp(pain, scale. = TRUE)
plot(pain.pc$sdev^2, type="o", pch=16)
abline(h=1,col="grey")
[pic]
pain.pc$rotation[,1:2]
PC1 PC2
No Upset Stomach 0.4316 -0.3595
No Side Effects 0.3808 -0.4442
Stops Pain 0.4536 0.3546
Works Quickly 0.3828 0.4407
Keeps Me Awake -0.3516 0.4699
Limited Relief -0.4392 -0.3642
pain.fa=factanal(pain, factors=2, scores="reg")
pain.fa
Call:
factanal(x = pain, factors = 2, scores = "reg")
Uniquenesses:
No Upset Stomach No Side Effects Stops Pain Works Quickly
0.434 0.344 0.346 0.365
Keeps Me Awake Limited Relief
0.365 0.392
Loadings:
Factor1 Factor2
No Upset Stomach 0.136 0.740
No Side Effects 0.810
Stops Pain 0.802 0.105
Works Quickly 0.795
Keeps Me Awake -0.796
Limited Relief -0.776
Factor1 Factor2
SS loadings 1.898 1.857
Proportion Var 0.316 0.309
Cumulative Var 0.316 0.626
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 3.29 on 4 degrees of freedom.
The p-value is 0.511
plot(pain.fa$scores)
[pic]
4. Luxury Car Perceptions (from book)
source("readTri.txt")
colnames(car.cor) = c("Luxury", "Style", "Reliability", "Fuel Econ",
"Safety", "Maintenance", "Quality", "Durable", "Performance")
rownames(car.cor) = colnames(car.cor)
car.pc = eigen(car.cor)
car.pc$values
[1] 4.1640 1.5400 0.6857 0.5848 0.5152 0.4781 0.3736 0.3508 0.3077
plot(car.pc$values,type="o", pch=16)
abline(h=1,col="grey")
[pic]
factanal(covmat=car.cor, factors=2, n.obs=162)
Call:
factanal(factors = 2, covmat = car.cor, n.obs = 162)
Uniquenesses:
Luxury Style Reliability Fuel Econ Safety
0.164 0.546 0.440 0.625 0.573
Maintenance Quality Durable Performance
0.560 0.359 0.451 0.469
Loadings:
Factor1 Factor2
Luxury 0.914
Style 0.644 0.198
Reliability 0.387 0.640
Fuel Econ -0.101 0.604
Safety 0.620 0.204
Maintenance 0.175 0.640
Quality 0.454 0.659
Durable 0.335 0.661
Performance 0.588 0.430
Factor1 Factor2
SS loadings 2.491 2.322
Proportion Var 0.277 0.258
Cumulative Var 0.277 0.535
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 19.6 on 19 degrees of freedom.
The p-value is 0.419
car.pc$vectors[,1:2]
[,1] [,2]
Luxury -0.3125 0.4937
Style -0.3198 0.3197
Reliability -0.3726 -0.1909
Fuel Econ -0.1894 -0.5748
Safety -0.3158 0.2930
Maintenance -0.3058 -0.3648
Quality -0.3968 -0.1092
Durable -0.3644 -0.1883
Performance -0.3768 0.1446
5. Full Psychological Test Battery
fulltest.cor = Harman74.cor$cov
image(1:24, 1:24, fulltest.cor, zlim=c(-1,1), col=cm.colors(21))
[pic] [pic]
fulltest.pc = eigen(fulltest.cor)# Principal components analysis
fulltest.pc$values
[1] 8.1354 2.0960 1.6926 1.5018 1.0252 0.9429 0.9012 0.8159 0.7902 0.7069
[11] 0.6394 0.5433 0.5330 0.5094 0.4775 0.3897 0.3820 0.3404 0.3338 0.3158
[21] 0.2972 0.2681 0.1897 0.1725
plot(fulltest.pc$values,type="o", pch=16)
abline(h=1,col="grey")
rownames(fulltest.pc$vectors) = rownames(fulltest.cor)
fulltest.pc$vectors[,1:4]
[,1] [,2] [,3] [,4]
VisualPerception -0.2159 -0.003764 -0.32875 0.16685
Cubes -0.1401 -0.054849 -0.30751 0.16446
PaperFormBoard -0.1559 -0.131808 -0.36614 0.08630
Flags -0.1790 -0.122936 -0.25729 0.17621
GeneralInformation -0.2436 -0.221950 0.25774 0.04309
PargraphComprehension -0.2421 -0.288461 0.20378 -0.06579
SentenceCompletion -0.2373 -0.293489 0.27319 0.05917
WordClassification -0.2434 -0.167723 0.11036 0.09493
WordMeaning -0.2434 -0.311224 0.22351 -0.06499
Addition -0.1662 0.374211 0.34305 0.16446
Code -0.2020 0.299629 0.16148 -0.02757
CountingDots -0.1691 0.379149 0.09735 0.27775
StraightCurvedCapitals -0.2167 0.192742 -0.02724 0.29855
WordRecognition -0.1570 0.063949 0.04238 -0.45320
NumberRecognition -0.1457 0.098312 -0.06024 -0.42914
FigureRecognition -0.1871 0.062819 -0.30098 -0.26697
ObjectNumber -0.1710 0.190302 0.04029 -0.38276
NumberFigure -0.1906 0.266841 -0.15243 -0.12392
FigureWord -0.1667 0.095217 -0.09373 -0.15777
Deduction -0.2254 -0.128716 -0.10154 -0.05720
NumericalPuzzles -0.2179 0.160439 -0.07670 0.16458
ProblemReasoning -0.2242 -0.100721 -0.08481 -0.04537
SeriesCompletion -0.2495 -0.072448 -0.11493 0.08402
ArithmeticProblems -0.2358 0.135235 0.17931 0.05043
fulltest.fa1 = factanal(covmat = fulltest.cor, factors=1, n.obs=145)
fulltest.fa1
Call:
factanal(factors = 1, covmat = fulltest.cor, n.obs = 145)
Uniquenesses:
VisualPerception Cubes PaperFormBoard
0.677 0.866 0.830
Flags GeneralInformation PargraphComprehension
0.768 0.487 0.491
SentenceCompletion WordClassification WordMeaning
0.500 0.514 0.474
Addition Code CountingDots
0.818 0.731 0.824
StraightCurvedCapitals WordRecognition NumberRecognition
0.681 0.833 0.863
FigureRecognition ObjectNumber NumberFigure
0.775 0.812 0.778
FigureWord Deduction NumericalPuzzles
0.816 0.612 0.676
ProblemReasoning SeriesCompletion ArithmeticProblems
0.619 0.524 0.593
Loadings:
Factor1
VisualPerception 0.569
Cubes 0.366
PaperFormBoard 0.412
Flags 0.482
GeneralInformation 0.716
PargraphComprehension 0.713
SentenceCompletion 0.707
WordClassification 0.697
WordMeaning 0.725
Addition 0.426
Code 0.519
CountingDots 0.419
StraightCurvedCapitals 0.565
WordRecognition 0.408
NumberRecognition 0.370
FigureRecognition 0.474
ObjectNumber 0.434
NumberFigure 0.471
FigureWord 0.429
Deduction 0.623
NumericalPuzzles 0.569
ProblemReasoning 0.617
SeriesCompletion 0.690
ArithmeticProblems 0.638
Factor1
SS loadings 7.438
Proportion Var 0.310
Test of the hypothesis that 1 factor is sufficient.
The chi square statistic is 622.9 on 252 degrees of freedom.
The p-value is 2.28e-33
update(fulltest.fa1, factors=2)
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 420.2 on 229 degrees of freedom.
The p-value is 2.01e-13
update(fulltest.fa1, factors=3)
Test of the hypothesis that 3 factors are sufficient.
The chi square statistic is 295.6 on 207 degrees of freedom.
The p-value is 0.0000512
update(fulltest.fa1, factors=4)
Call:
factanal(factors = 4, covmat = fulltest.cor, n.obs = 145)
Uniquenesses:
VisualPerception Cubes PaperFormBoard
0.438 0.780 0.644
Flags GeneralInformation PargraphComprehension
0.651 0.352 0.312
SentenceCompletion WordClassification WordMeaning
0.283 0.485 0.257
Addition Code CountingDots
0.240 0.551 0.435
StraightCurvedCapitals WordRecognition NumberRecognition
0.491 0.646 0.696
FigureRecognition ObjectNumber NumberFigure
0.549 0.598 0.593
FigureWord Deduction NumericalPuzzles
0.762 0.592 0.583
ProblemReasoning SeriesCompletion ArithmeticProblems
0.601 0.497 0.500
Loadings:
Factor1 Factor2 Factor3 Factor4
VisualPerception 0.160 0.689 0.187 0.160
Cubes 0.117 0.436
PaperFormBoard 0.137 0.570 0.110
Flags 0.233 0.527
GeneralInformation 0.739 0.185 0.213 0.150
PargraphComprehension 0.767 0.205 0.233
SentenceCompletion 0.806 0.197 0.153
WordClassification 0.569 0.339 0.242 0.132
WordMeaning 0.806 0.201 0.227
Addition 0.167 -0.118 0.831 0.166
Code 0.180 0.120 0.512 0.374
CountingDots 0.210 0.716
StraightCurvedCapitals 0.188 0.438 0.525
WordRecognition 0.197 0.553
NumberRecognition 0.122 0.116 0.520
FigureRecognition 0.408 0.525
ObjectNumber 0.142 0.219 0.574
NumberFigure 0.293 0.336 0.456
FigureWord 0.148 0.239 0.161 0.365
Deduction 0.378 0.402 0.118 0.301
NumericalPuzzles 0.175 0.381 0.438 0.223
ProblemReasoning 0.366 0.399 0.123 0.301
SeriesCompletion 0.369 0.500 0.244 0.239
ArithmeticProblems 0.370 0.158 0.496 0.304
Factor1 Factor2 Factor3 Factor4
SS loadings 3.647 2.872 2.657 2.290
Proportion Var 0.152 0.120 0.111 0.095
Cumulative Var 0.152 0.272 0.382 0.478
Test of the hypothesis that 4 factors are sufficient.
The chi square statistic is 226.7 on 186 degrees of freedom.
The p-value is 0.0224
update(fulltest.fa1, factors=5)
Call:
factanal(factors = 5, covmat = fulltest.cor, n.obs = 145)
Uniquenesses:
VisualPerception Cubes PaperFormBoard
0.450 0.781 0.639
Flags GeneralInformation PargraphComprehension
0.649 0.357 0.288
SentenceCompletion WordClassification WordMeaning
0.277 0.485 0.262
Addition Code CountingDots
0.215 0.386 0.444
StraightCurvedCapitals WordRecognition NumberRecognition
0.256 0.639 0.706
FigureRecognition ObjectNumber NumberFigure
0.550 0.614 0.596
FigureWord Deduction NumericalPuzzles
0.764 0.521 0.564
ProblemReasoning SeriesCompletion ArithmeticProblems
0.580 0.442 0.478
Loadings:
Factor1 Factor2 Factor3 Factor4 Factor5
VisualPerception 0.161 0.658 0.136 0.182 0.199
Cubes 0.113 0.435 0.107
PaperFormBoard 0.135 0.562 0.107 0.116
Flags 0.231 0.533
GeneralInformation 0.736 0.188 0.192 0.162
PargraphComprehension 0.775 0.187 0.251 0.113
SentenceCompletion 0.809 0.208 0.136
WordClassification 0.568 0.348 0.223 0.131
WordMeaning 0.800 0.215 0.224
Addition 0.175 -0.100 0.844 0.176
Code 0.185 0.438 0.451 0.426
CountingDots 0.222 0.690 0.101 0.140
StraightCurvedCapitals 0.186 0.425 0.458 0.559
WordRecognition 0.197 0.557
NumberRecognition 0.121 0.130 0.508
FigureRecognition 0.400 0.529
ObjectNumber 0.145 0.208 0.562
NumberFigure 0.306 0.325 0.452
FigureWord 0.147 0.242 0.145 0.364
Deduction 0.370 0.452 0.139 0.287 -0.190
NumericalPuzzles 0.170 0.402 0.439 0.230
ProblemReasoning 0.358 0.423 0.126 0.302
SeriesCompletion 0.360 0.549 0.256 0.223 -0.107
ArithmeticProblems 0.371 0.185 0.502 0.307
Factor1 Factor2 Factor3 Factor4 Factor5
SS loadings 3.632 2.964 2.456 2.345 0.663
Proportion Var 0.151 0.124 0.102 0.098 0.028
Cumulative Var 0.151 0.275 0.377 0.475 0.503
Test of the hypothesis that 5 factors are sufficient.
The chi square statistic is 186.8 on 166 degrees of freedom.
The p-value is 0.128
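The sequence of update() calls above is an instance of the sequential testing procedure: increase the number of factors until the chi-square test no longer rejects. A minimal sketch of the same search as a loop follows. The significance level alpha and the upper limit max.factors are assumptions chosen for illustration, and the components STATISTIC, dof and PVAL are those stored in the object returned by factanal() when n.obs is supplied.

```r
data(Harman74.cor)
fulltest.cor <- Harman74.cor$cov

alpha <- 0.05       # significance level (an assumption, not fixed by the notes)
max.factors <- 6    # upper limit for the search (also an assumption)

rows <- list()
chosen <- NA
for (k in 1:max.factors) {
  fit <- factanal(covmat = fulltest.cor, factors = k, n.obs = 145)
  rows[[k]] <- data.frame(factors = k,
                          statistic = unname(fit$STATISTIC),
                          df = fit$dof,
                          p.value = unname(fit$PVAL))
  if (fit$PVAL > alpha) {   # first model that is not rejected
    chosen <- k
    break
  }
}
results <- do.call(rbind, rows)
results
chosen
```

At the 5% level the loop stops at the first model whose p-value exceeds alpha; with a stricter level such as 1% it would stop one factor earlier, which illustrates how sensitive the procedure is to that choice.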
6. Factor analysis vs. PCA
Similarities
• Both methods are mostly used in EDA (exploratory data analysis).
• Both methods try to obtain dimension reduction: explain a data set in a smaller number of new variables.
• Neither method works well if the observed variables are almost uncorrelated:
o Then PCA returns components that are similar to the original variables.
o Then factor analysis has nothing to explain, i.e., the uniquenesses will all be close to 1.
• Both methods give similar results if the specific variances are small.
• If the specific variances are assumed to be zero in principal factor analysis, then PCA and factor analysis are the same.
• Neither PCA nor FA needs a normality assumption in general (in FA, normality is needed only for ML estimation and the chi-square test).
Differences
• PCA requires virtually no assumptions.
Factor analysis assumes that the data come from a specific model structure. A normality assumption is needed in FA only for the chi-square test and the ML estimates; the principal factor estimation procedure does not require normality.
• In PCA, the emphasis is on transforming the observed variables to principal components.
In factor analysis, the emphasis is on the transformation from factors to observed variables.
• PCA is not scale invariant.
Factor analysis (with MLE) is scale invariant.
• In PCA, considering c + 1 instead of c components does not change the first c components.
In factor analysis, considering c + 1 instead of c factors may change the first c factors (when using the MLE method).
• Calculation of PCA scores is straightforward.
Calculation of factor scores is more involved.
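The scale-invariance difference listed above can be illustrated with the girls' measurements data. The sketch below pretends that height is recorded in units 100 times smaller, so its values are 100 times larger, and compares the first principal component and the two-factor ML solution before and after the change. The names R, D, S, pc.R, pc.S, fa.R and fa.S are illustrative. Note that factanal() standardizes its covariance input to a correlation matrix internally, so in R the invariance is built in rather than a separate numerical finding.

```r
data(Harman23.cor)
R <- Harman23.cor$cov                 # correlation matrix (unit variances)

# Rescale the first variable (height) by 100; D %*% R %*% D is then the
# covariance matrix of the rescaled variables
D <- diag(c(100, rep(1, 7)))
S <- D %*% R %*% D
dimnames(S) <- dimnames(R)

# PCA: after rescaling, the first eigenvector is dominated by height
pc.R <- eigen(R)$vectors[, 1]
pc.S <- eigen(S)$vectors[, 1]
round(cbind(original = pc.R, rescaled = pc.S), 3)

# ML factor analysis: loadings and uniquenesses are essentially unchanged
fa.R <- factanal(covmat = R, factors = 2, n.obs = 305)
fa.S <- factanal(covmat = S, factors = 2, n.obs = 305)
max(abs(unclass(fa.R$loadings) - unclass(fa.S$loadings)))
max(abs(fa.R$uniquenesses - fa.S$uniquenesses))
```

This is why PCA is usually run on the correlation matrix (or with scaled data) when the variables are measured in different units.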