Chapter 1



Cluster Analysis

True/False Questions

1. Cluster analysis does not classify variables as dependent or independent.

(True)

2. Cluster analysis is the obverse of factor analysis in that it reduces the number of objects, not the number of variables, by grouping them into a much smaller number of clusters.

(True)

3. If cluster analysis is used as a general data reduction tool, subsequent multivariate analysis can be conducted on the clusters rather than on the individual observations.

(True)

4. The dendrogram is read from right to left.

(False)

5. Clustering should be done on samples of 300 or more.

(False)

6. In cluster analysis, objects with larger distances between them are more similar to each other than are those at smaller distances.

(False)

7. The average linkage method of hierarchical clustering is preferred to the single and complete linkage methods.

(True)

8. The centroid method is a variance method of hierarchical clustering in which the distance between two clusters is the distance between their centroids (means for all the variables).

(True)

9. Nonhierarchical clustering is faster than hierarchical methods.

(True)

10. It is helpful to profile the clusters in terms of variables that were not used for clustering.

(True)

11. One method of assessing reliability and validity of clustering is to use different methods of clustering and compare the results.

(True)

12. To reduce the number of variables, a large set of variables can often be replaced by the set of cluster components.

(True)

Multiple Choice Questions

25. Which method of analysis does not classify variables as dependent or independent?

a. regression analysis

b. discriminant analysis

c. analysis of variance

d. cluster analysis

(d)

26. Which statement is not true about cluster analysis?

a. Objects in each cluster tend to be similar to each other and dissimilar to objects in the other clusters.

b. Cluster analysis is also called classification analysis or numerical taxonomy.

c. Groups or clusters are suggested by the data, not defined a priori.

d. Cluster analysis is a technique for analyzing data when the criterion or dependent variable is categorical and the independent variables are interval in nature.

(d)

30. A _____ or tree graph is a graphical device for displaying clustering results. Vertical lines represent clusters that are joined together. The position of the line on the scale indicates the distances at which clusters were joined.

a. dendrogram

b. scattergram

c. scree plot

d. icicle diagram

(a)

31. The most important part of _____ is selecting the variables on which clustering is based.

a. interpreting and profiling clusters

b. selecting a clustering procedure

c. assessing the validity of clustering

d. formulating the clustering problem

(d)

32. The most commonly used measure of similarity is the _____ or its square.

a. Euclidean distance

b. city-block distance

c. Chebychev’s distance

d. Manhattan distance

(a)
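
Illustration for question 32. The distance measures named in the answer options can be compared on a single pair of objects. The following is a minimal sketch, not taken from the source text: the function names and the two sample objects `p` and `q` are hypothetical, and city-block distance is treated as the same measure as Manhattan distance.

```python
import math

def euclidean(a, b):
    """Square root of the sum of squared differences across variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def squared_euclidean(a, b):
    """Sum of squared differences (the Euclidean distance without the root)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def city_block(a, b):
    """Sum of absolute differences (also known as Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    """Largest absolute difference on any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))

# Hypothetical objects measured on two variables
p = (1.0, 2.0)
q = (4.0, 6.0)

print(euclidean(p, q))          # 5.0
print(squared_euclidean(p, q))  # 25.0
print(city_block(p, q))         # 7.0
print(chebychev(p, q))          # 4.0
```

For all four measures, a smaller value means the two objects are more similar.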

33. _____ is a clustering procedure characterized by the development of a tree-like structure.

a. Non-hierarchical clustering

b. Hierarchical clustering

c. Divisive clustering

d. Agglomerative clustering

(b)
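
Illustration for question 33. The tree-like structure arises because each step joins two existing clusters into one. The following is a minimal sketch of the agglomerative variant, assuming single linkage and Euclidean distance. The function name `agglomerate` and the sample points are hypothetical. The recorded merge distances are the positions at which a dendrogram would draw each join.

```python
import math

def agglomerate(points):
    """Single-linkage agglomerative clustering.

    Starts with every object in its own cluster and repeatedly merges the
    two closest clusters. Returns a list of (cluster_a, cluster_b, distance)
    tuples in merge order.
    """
    clusters = [[p] for p in points]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest minimum pairwise distance
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges

# Hypothetical objects placed along one axis for easy checking
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (7.0, 0.0)]
merges = agglomerate(points)
for a, b, d in merges:
    print(a, "+", b, "joined at distance", d)
```

With these points the joins occur at distances 1.0, 2.0, and 4.0, ending with all objects in one cluster.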

38. _____ is a clustering procedure where all objects start out in one giant cluster. Clusters are formed by dividing this cluster into smaller and smaller clusters.

a. Non-hierarchical clustering

b. Hierarchical clustering

c. Divisive clustering

d. Agglomerative clustering

(c)

43. The _____ method uses information on all pairs of distances, not merely the minimum or maximum distances.

a. single linkage

b. medium linkage

c. complete linkage

d. average linkage

(d)
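
Illustration for question 43. The three linkage rules differ only in how they summarize the distances between all pairs of objects drawn from two clusters. The following is a minimal sketch assuming Euclidean distance. The function names and the two sample clusters are hypothetical.

```python
import math
from itertools import product

def pairwise_distances(cluster_a, cluster_b):
    """Euclidean distance for every pair with one object from each cluster."""
    return [math.dist(a, b) for a, b in product(cluster_a, cluster_b)]

def single_linkage(cluster_a, cluster_b):
    """Minimum pairwise distance (nearest neighbors)."""
    return min(pairwise_distances(cluster_a, cluster_b))

def complete_linkage(cluster_a, cluster_b):
    """Maximum pairwise distance (farthest neighbors)."""
    return max(pairwise_distances(cluster_a, cluster_b))

def average_linkage(cluster_a, cluster_b):
    """Mean of all pairwise distances."""
    d = pairwise_distances(cluster_a, cluster_b)
    return sum(d) / len(d)

# Hypothetical clusters of two objects each
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(3.0, 0.0), (5.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))    # 2.0
print(complete_linkage(cluster_a, cluster_b))  # 5.0
print(average_linkage(cluster_a, cluster_b))   # 3.5
```

Only the average uses all four pairwise distances (3, 5, 2, 4). The other two rules each depend on a single pair.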

44. _____ is frequently referred to as k-means clustering.

a. Non-hierarchical clustering

b. Optimizing partitioning

c. Divisive clustering

d. Agglomerative clustering

(a)
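
Illustration for question 44. A k-means procedure alternates between assigning each object to its nearest cluster center and recomputing each center as the mean of its members. The following is a minimal sketch under stated assumptions: Euclidean distance, initial centers supplied by the caller, and a hypothetical function name `k_means` with hypothetical sample data. It is not the procedure of any particular statistics package.

```python
import math

def k_means(points, centers, max_iter=100):
    """Basic k-means: returns (final_centers, clusters)."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: put each object in the cluster of its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                new_centers.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # centers stopped moving, so assignments are stable
        centers = new_centers
    return centers, clusters

# Hypothetical objects forming two obvious groups
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = k_means(points, centers=[(1.0, 1.0), (8.0, 8.0)])
print(clusters[0])  # the three objects near (1, 1)
print(clusters[1])  # the three objects near (8, 8)
```

Unlike the hierarchical methods, the number of clusters is fixed in advance through the number of initial centers.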
