Project I: Cognitive Task Analysis
IE 486 Lab 3: Spring 2007
Cell Phone Survey Analysis & Recommendation
This handout contains the information you need to complete Lab 3 (due Thursday March 22, in class). It consists of the following parts: 1) analysis of the collected data, and 2) recommendations for cell phone usability. The survey data and questionnaire are available through the course webpage. Examples of SAS code for the analysis are given in the appendix of this document. Write-ups: the report should be no more than 5 pages, including any figures and tables (from abstract through results & discussion); conclusions should be no more than 3 pages; anything additional can be put in an appendix and referred to in the text; font size should be no less than 11 pt, and single line spacing is acceptable.
1. Cell phone survey data
All data for the cell phone survey were collected using the structured questionnaire (available on the course webpage). Each group has data from 100 subjects, including subject information (gender, age, duration of cell phone experience, etc.) and which cell phone (manufacturer, brand, and model name) each subject uses.
2. Analysis of the collected data
Analyze the data using SAS/STAT program as follows:
1) Tabulate the collected data in a summary table of features, models, and questionnaire responses, as illustrated in Table 1.
2) Calculate basic statistics, such as mean, standard deviation, and frequencies on each question according to the type of cell phone (manufacturer and model).
3) Perform Cronbach’s α tests for internal consistency (using paired questions).
4) Do a correlation study among items.
5) Perform ANOVA tests.
6) Perform multiple regression analysis.
Please draw statistical and practical inference from the survey results regarding the following:
1) Which generic features (across manufacturers and models) are liked most, disliked most, or present the greatest difficulty in use?
2) Which manufacturer, and which of its models, do the surveyed customers prefer with respect to each feature?
3) Which generic features (within each manufacturer) are liked most, disliked most, or present the greatest difficulty in use?
Table 1. Summary table
|Manufacturer |Model |Feature 1 |Feature 2 |Feature 3 |… |
|Manufacturer 1 |Model 1 |X | |X |X |
| |Model 2 | |X |X | |
| |Model 3 |X | |X | |
| |: |X |X |X |X |
|Manufacturer 2 |Model 1 |X |X |X |X |
|: |: |: |: |: |: |
Note: X is the number of responses in each cell; an empty cell means the subjects could not answer the question about that feature because their model of cell phone does not have it.
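SAS code for this tabulation is given in the appendix; if you want to sanity-check the cell counts outside SAS, the same tally can be sketched in a few lines of Python. The manufacturer, model, and feature names below are made up for illustration, not taken from the survey data:

```python
from collections import Counter

# Hypothetical response records: (manufacturer, model, feature answered).
responses = [
    ("Motorola", "M1", "feature1"),
    ("Motorola", "M1", "feature1"),
    ("Motorola", "M1", "feature3"),
    ("Nokia", "N1", "feature1"),
    ("Nokia", "N1", "feature2"),
]

# Count responses per (manufacturer, model, feature) cell of Table 1.
cell_counts = Counter(responses)
print(cell_counts[("Motorola", "M1", "feature1")])  # 2

# A cell with no responses stays empty in Table 1 (the model lacks
# that feature); Counter reports 0 for it.
print(cell_counts[("Nokia", "N1", "feature3")])  # 0
```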
The procedures of analysis are explained in more detail in the following. The detailed SAS code and analysis examples can be found in Appendix 1 and 2.
Cronbach's Alpha Coefficient
Internal consistency is the extent to which tests or procedures assess the same construct. We can estimate the measurement error of a test from just two comparable measurements on each of many subjects, which is usually done by including pairs of questions in the questionnaire. Cronbach’s alpha coefficient can then be calculated to measure the internal consistency of the questionnaire. When using SAS/STAT (proc corr) to calculate the Cronbach alpha coefficient, since there is more than one question pair, stack the scores from the corresponding question pairs for each subject. For example, if (q1,q7) and (q2,q8) are the question pairs used, the layout of the data sheet looks like this:
Subject1 q1 q7
Subject1 q2 q8
….
Subject2 q1 q7
Subject2 q2 q8
….
Subject3 q1 q7
Subject3 q2 q8
…..
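For reference, the raw alpha that proc corr reports can be reproduced by hand from the two stacked pair columns. The following is a minimal Python sketch using made-up scores (each row is one subject-pair record, as in the layout above):

```python
from statistics import variance

def cronbach_alpha(item_columns):
    """Raw Cronbach's alpha from k item-score columns of equal length."""
    k = len(item_columns)
    n = len(item_columns[0])
    sum_item_vars = sum(variance(col) for col in item_columns)
    # Total score per row (subject-pair record).
    row_totals = [sum(col[i] for col in item_columns) for i in range(n)]
    return k / (k - 1) * (1 - sum_item_vars / variance(row_totals))

# Hypothetical stacked scores: column 1 holds the first question of each
# pair (q1, q2, ...) and column 2 its twin (q7, q8, ...).
first = [5, 4, 6, 3, 5, 4]
second = [5, 5, 6, 2, 4, 4]
print(round(cronbach_alpha([first, second]), 3))  # 0.894
```

A high value (here 0.894, above the 0.7 criterion used later in this handout) indicates that the paired questions measure their constructs consistently.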
Pearson Correlation (SAS/STAT (proc corr))
Do a correlation study among the items (features). If a pair of questions represents one item, the average value of that pair of questions should be used to represent the item.
The most common measure of correlation is the Pearson product-moment correlation (Pearson's correlation for short). When computed in a sample, Pearson’s correlation coefficient is designated by the letter "r" and is sometimes called "Pearson's r." Pearson's correlation reflects the degree of linear relationship between two variables and ranges from -1 to +1. A correlation of +1 means a perfect positive linear relationship between the variables; a correlation of -1 means a perfect negative linear relationship; a correlation of 0 means no linear relationship. Pearson’s correlation coefficient can be calculated as

r = sum((x_i - x̄)(y_i - ȳ)) / sqrt( sum((x_i - x̄)^2) * sum((y_i - ȳ)^2) )

where x̄ and ȳ are the sample means of the two variables.
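The calculation above can be checked by hand outside of proc corr. The snippet below is a small Python sketch with hypothetical item scores (averaged question pairs) for five subjects:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from the means.
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Square root of the product of the sums of squared deviations.
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

# Hypothetical item scores (averaged question pairs) for five subjects.
item_a = [3.0, 4.5, 5.0, 2.5, 6.0]
item_b = [2.5, 4.0, 5.5, 3.0, 5.5]
print(round(pearson_r(item_a, item_b), 3))  # 0.926
```

Note that this gives only r; proc corr additionally reports the p-value for the test of H0: rho = 0, which you need for the interpretation described in Section 4.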
Multiple Regression Analysis (SAS/STAT (proc reg))
Use the same items (features) as in the Pearson correlation analysis for the multiple regression analysis. Treat overall satisfaction as the dependent variable and the other items as independent variables. Perform multiple regression with the ‘stepwise’ option, which selects the independent variables relevant to that dependent variable. We assume that a first-order (linear) regression model fits and that all the assumptions of the regression model are satisfied.
Multiple regression is performed when several independent variables are used to predict the value of one dependent variable. In stepwise regression, we enter all the variables at once and let the computer select the independent variables that lead to the greatest R-square. R-square represents the proportion of the variance of the dependent variable that is accounted for by the independent variables together. The relationship between the dependent variable (DV) and p independent variables (IV) can be expressed by the following formula.
DV = Intercept + Coefficient(1)*IV(1) + Coefficient(2)*IV(2) + … + Coefficient(p)*IV(p) + Error
ANOVA and LS means analysis (SAS/STAT (proc glm))
Table 2 below shows the layout of the data your group will collect. Each type (manufacturer and/or model) of cell phone and each feature (item) may have a different number of responses; this is called an unbalanced design. Run ANOVA (analysis of variance) under the unbalanced design to find 1) the preferred manufacturer, 2) the preferred feature (item) across models and manufacturers, and 3) the preferred feature (item) within a manufacturer. Is there any significant difference (α = 0.05) among the types of cell phone and the features? To answer this question, use ‘Type III SS’ in the ANOVA. If there is a main effect of the type of cell phone or the features, use ‘LS (least squares) means’ analysis to determine where those differences are. We assume that all the ANOVA assumptions are satisfied.
Table 2. Layout of data for ANOVA
|Manufacturer |Model |Item 1 |Item 2 |Item 3 |… |Item 20 |
|Manufacturer 1 |Model 1 |xxx |xxxx | | | |
|: |: |: |: |: |: |: |
- For the subject description, report the frequency and percentage of each level (and the mean (SD) for continuous variables such as experience), for example:

|Variable |Level |Frequency |% |Mean (SD) |
|Gender |1 (Male) |52 |52% |- |
| |2 (Female) |48 |48% |- |
|Experience |- |- |- |24.5 (12.34) |
|Type of cell phone |11 (M1, Motorola) |12 |12% |- |
| |12 (M2, Motorola) |15 |15% |- |
| |: |: |: | |
| |21 (N1, Nokia) |13 |13% |- |
| |22 (N2, Nokia) |20 |20% |- |
| |: |: |: | |
- For each question, report the frequency of each scale point (1 to 7) and the mean and standard deviation for each manufacturer and model. Using the following table, you can see the distribution of frequencies over the scale points (1 to 7) for each manufacturer and model.
Table for question 1
|Manufacturer |Model |Frequency of each scale point (1–7) |Sum of frequency |Mean (SD) |
| | | | | |
Table for question 2
Table for question 3
:
Table for question 22
3. Internal consistency (Cronbach’s alpha)
With the paired questions, you can calculate Cronbach’s alpha coefficient as a measure of Internal Consistency by using the SAS code in Appendix 1. After running the SAS code, you can see the output like the following:
Cronbach Coefficient Alpha
Variables Alpha
-------------------------------------
Raw 0.891129
Standardized 0.891865
There are two kinds of Cronbach’s alpha coefficient, raw and standardized. The raw coefficient is computed from the covariance matrix of the data, whereas the standardized coefficient is computed from the correlation matrix. Usually the two coefficients are not very different, but the standardized score is preferred when the variances of the variables differ greatly.
You can report both coefficients and interpret them, using 0.7 as the critical value to determine whether the questionnaire has acceptable internal consistency.
Note: up to this point, you have used all the questions in the questionnaire to summarize the data, and the paired questions to get Cronbach’s alpha coefficient. For the correlation study, the multiple regression analysis, and the ANOVA, however, you should use the features (= items) as the variables, obtained by averaging the values of the paired questions, instead of the raw responses to the structured questions. For example, if you have 22 questions, including 2 paired questions, then you are really measuring 20 features (= 20 items); by averaging the values of the paired questions, you get the values of the 20 items.
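The collapsing of question pairs into item scores can be sketched as follows. The question-to-item mapping and the scores below are hypothetical (only the two paired items are shown):

```python
# Hypothetical pairing: these items are measured by question pairs; the
# remaining items come from single questions and pass through unchanged.
pairs = {"item1": ("q1", "q7"), "item2": ("q2", "q8")}
subject = {"q1": 5, "q2": 3, "q7": 6, "q8": 4}  # one subject's answers

# Average each pair to get one item score per subject.
items = {name: (subject[a] + subject[b]) / 2
         for name, (a, b) in pairs.items()}
print(items)  # {'item1': 5.5, 'item2': 3.5}
```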
4. Correlation analysis
Using the SAS code under “3. Pearson correlation” in Appendix 1 with data of the variables of features (= items), you can get the output like the following:
Pearson Correlation Coefficients, N = 100
Prob > |r| under H0: Rho=0
t1 t2 t3 ...
t1 1.00000 -0.86258 -0.26656 ← correlation coefficient (r)
0.0028 0.4881 ← p-value for the t-test
t2 -0.86258 1.00000 -0.18394
0.0028 0.6357
t3 -0.26656 -0.18394 1.00000
0.4881 0.6357
:
Each cell of the correlation matrix has two numbers. The first is the correlation coefficient (r), and the second is the p-value. The correlation coefficient (r) lies between -1 and +1, and the p-value between 0 and 1. The closer r is to +1 or -1, the stronger the linear relationship between the two features (= items), in the positive or negative direction respectively. When the p-value is less than 0.05, we can say that a linear relationship between the two features (= items) exists. Thus, you should report the correlation matrix and interpret the r-values and p-values to find which features (= items) have a linear relationship.
If you use all of the feature data, you get the relationships among features across the types of cell phone. If you use only the feature data within one manufacturer, you get the relationships among features within that manufacturer.
5. ANOVA (with unbalanced design)
Using the SAS code under “5. ANOVA & LS means Analysis” in Appendix 1, you can get the following output. From the ANOVA table, we can conclude whether there are differences among manufacturers and among features (items), and whether there is an interaction effect between manufacturer and feature (item). From the LS means analysis, we can see where the differences exist.
In this example, we consider only ‘manufacturer’ and ‘feature (item)’ as the two main factors, but you can also consider ‘cell phone model’ and ‘feature (item)’ as the two main factors within a manufacturer if you use only the data within that manufacturer.
i) ANOVA table
Dependent Variable: score
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 5 47.97058824 9.59411765 12.42 0.0003
Error 11 8.50000000 0.77272727
Corrected Total 16 56.47058824
R-Square Coeff Var Root MSE score Mean
0.849479 22.99051 0.879049 3.823529
Source DF Type I SS Mean Square F Value Pr > F
manu 1 0.59558824 0.59558824 0.77 0.3988
item 2 11.16071429 5.58035714 7.22 0.0099
manu*item 2 36.21428571 18.10714286 23.43 0.0001
Source DF Type III SS Mean Square F Value Pr > F
manu 1 1.55128205 1.55128205 2.01 0.1842
item 2 12.59523810 6.29761905 8.15 0.0067
manu*item 2 36.21428571 18.10714286 23.43 0.0001
Because the number of responses may differ among variables (the unbalanced design), you should interpret the results using ‘Type III SS’ instead of ‘Type I SS’. When the p-value is less than 0.05, we can say that the factor is significant (i.e., there is a significant difference among the levels of the factor).
In this example, there is a significant difference among features (items) and a significant interaction between manufacturer and feature (item), so we need to perform LS means analysis to find where the differences exist.
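To make the F test concrete, the sketch below computes the F statistic for a one-way ANOVA with unequal group sizes in pure Python. This is a deliberate simplification: the lab itself requires the two-way model with the manufacturer*item interaction and Type III SS, which is what proc glm provides. All scores are hypothetical.

```python
def one_way_f(groups):
    """F statistic of a one-way ANOVA for groups of (possibly) unequal sizes."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-groups sum of squares, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-groups (error) sum of squares.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical scores for one item under three manufacturers; note the
# unequal group sizes (the unbalanced design).
scores = [[6, 5, 6, 7], [3, 2, 4], [5, 4, 4, 5, 6]]
print(round(one_way_f(scores), 2))  # 10.22
```

The F value would then be compared against the F distribution with (k − 1, n − k) degrees of freedom; proc glm reports the corresponding Pr > F directly.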
ii) Results of LS means analysis
The results of the LS means analysis show the least squares (LS) mean of each level of the factors and the p-values testing the differences between those LS means. LS means are adjusted means that account for the unbalanced design.
(1) LS means analysis for features (items)
In the following example, there is a significant difference between features (items) 1 and 2, because the p-value of their pairwise comparison is 0.0114, which is less than 0.05. Likewise, there is a significant difference between features (items) 2 and 3 (p = 0.0114 < 0.05). Therefore, features 1 and 3 are better than feature 2 (the LS means of features 1 and 3 are 4.3333, higher than that of feature 2).
Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer
item score LSMEAN Number
1 4.33333333 1 ← LS means for items (features)
2 2.41666667 2
3 4.33333333 3
Least Squares Means for effect item
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: score
i/j 1 2 3
1 0.0114 1.0000 ← p-value to test the difference
2 0.0114 0.0114 between items (features)
3 1.0000 0.0114
(2) LS means analysis for the interaction between manufacturers and features (items)
In the following example, within manufacturer 1 there is a significant difference between features (items) 1 and 2, because the p-value of their pairwise comparison is 0.0016, which is less than 0.05. Likewise, within manufacturer 1 there is a significant difference between features (items) 1 and 3 (p = 0.0070 < 0.05). Therefore, when we consider only manufacturer 1, feature 1 is the best (its LS mean of 6.0000 is higher than the others).
Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer
manu item score LSMEAN Number
1 1 6.00000000 1 ← LS means for each combination of
1 2 1.50000000 2 manufacturer and items (features)
1 3 2.66666667 3
2 1 2.66666667 4
2 2 3.33333333 5
2 3 6.00000000 6
Least Squares Means for effect manu*item
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: score
i/j 1 2 3 4 5 6
1 0.0016 0.0070 0.0070 0.0306 1.0000
2 0.0016 0.6972 0.6972 0.2766 0.0016
3 0.0070 0.6972 1.0000 0.9307 0.0070
4 0.0070 0.6972 1.0000 0.9307 0.0070
5 0.0306 0.2766 0.9307 0.9307 0.0306
6 1.0000 0.0016 0.0070 0.0070 0.0306
← p-value to test the difference between the
combinations of manufacturer and items (features)
6. Multiple regression
Using the SAS code under “4. Multiple regression” in Appendix 1 with data of the variables of features (= items), you can get the output like the following:
Dependent Variable: t2
Stepwise Selection: Step 1
Variable t8 Entered: R-Square = 0.5250 and C(p) = 5.7255
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 1 22.95858 22.95858 14.37 0.0022
Error 13 20.77475 1.59806
Corrected Total 14 43.73333
Parameter Standard
Variable Estimate Error Type II SS F Value Pr > F
Intercept 8.40347 1.19828 78.59439 49.18 <.0001
:
Stepwise Selection: Step 4
Analysis of Variance
Sum of Mean
Source DF Squares Square F Value Pr > F
Model 4 35.94324 8.98581 11.53 0.0009
Error 10 7.79010 0.77901
Corrected Total 14 43.73333
Parameter Standard
Variable Estimate Error Type II SS F Value Pr > F
Intercept 10.86043 2.74191 12.22167 15.69 0.0027
t4 0.37432 0.20881 2.50343 3.21 0.1033
t6 -0.34174 0.19548 2.38081 3.06 0.1110
t8 -0.67307 0.22559 6.93454 8.90 0.0137
t10 -0.82064 0.25959 7.78511 9.99 0.0101
All variables left in the model are significant at the 0.1500 level.
No other variable met the 0.1500 significance level for entry into the model.
Summary of Stepwise Selection
Variable Variable Number Partial Model
Step Entered Removed Vars In R-Square R-Square C(p) F Value Pr > F
1 t8 1 0.5250 0.5250 5.7255 14.37 0.0022
2 t10 2 0.1797 0.7046 1.3994 7.30 0.0192
3 t4 3 0.0628 0.7674 1.1885 2.97 0.1128
4 t6 4 0.0544 0.8219 1.2717 3.06 0.1110
Because of the stepwise option, we get multiple regression results for each step and a summary of the stepwise selection. In this example, item 2 (t2) is the dependent variable. In the first step, item 8 enters as the most important independent variable; in the second step, the second most important independent variable enters; and so on. Through these steps, SAS automatically selects the important independent variables and produces the summary of stepwise selection.
In the last step, you can get the estimates of intercept and slopes of the independent variables in the parameter estimate column.
In the summary of stepwise selection, interpret the results with the partial R-square and the p-value. The partial R-square shows how much each independent variable explains the variation of the dependent variable. The p-value shows whether each independent variable is significant; usually we use 0.05 as the threshold (i.e., the independent variable is significant if its p-value is less than 0.05). In this example, t8 and t10 are significant features at the 0.05 significance level, because their p-values are less than 0.05.
If you use the whole data of features, then you can get the effects of the features on general satisfaction across the types of cell phone. But if you use only the data of features within a manufacturer, then you can get the effects of the features on general satisfaction within a manufacturer.
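PROC REG's stepwise option both enters and removes variables using significance levels (0.15 by default, as the output above notes). The sketch below implements only the simpler forward half of the idea, greedily adding the item that most increases R-square and stopping when nothing helps; it is meant to make the mechanism concrete, not to reproduce SAS. All item scores are made up, with "t4" constructed to track satisfaction exactly and "t6" as unrelated noise.

```python
def ols_r2(x_cols, y):
    """R-square of an ordinary least squares fit of y on x_cols."""
    n = len(y)
    cols = [[1.0] * n] + x_cols  # prepend the intercept column
    p = len(cols)
    # Build the augmented normal equations (X'X | X'y) and solve them
    # by Gaussian elimination without pivoting (fine for this toy data).
    a = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(p)]
         + [sum(cols[i][t] * y[t] for t in range(n))] for i in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            f = a[j][i] / a[i][i]
            for t in range(i, p + 1):
                a[j][t] -= f * a[i][t]
    b = [0.0] * p
    for i in range(p - 1, -1, -1):
        b[i] = (a[i][p] - sum(a[i][j] * b[j] for j in range(i + 1, p))) / a[i][i]
    fit = [sum(b[j] * cols[j][t] for j in range(p)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((y[t] - fit[t]) ** 2 for t in range(n))
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1 - ss_res / ss_tot

def forward_select(candidates, y):
    """Greedily add the item giving the largest R-square gain."""
    chosen, best_r2 = [], 0.0
    while True:
        gains = [(ols_r2([candidates[c] for c in chosen + [name]], y), name)
                 for name in candidates if name not in chosen]
        if not gains:
            return chosen, best_r2
        r2, name = max(gains)
        if r2 - best_r2 < 1e-6:  # no meaningful improvement: stop
            return chosen, best_r2
        chosen.append(name)
        best_r2 = r2

# Hypothetical item scores for eight subjects.
y = [4, 5, 6, 3, 7, 5, 4, 6]  # overall satisfaction
candidates = {
    "t4": [3, 4, 5, 2, 6, 4, 3, 5],  # tracks satisfaction exactly
    "t6": [1, 2, 1, 2, 1, 2, 1, 2],  # unrelated noise
}
print(forward_select(candidates, y))
```

Here the procedure selects only "t4", since adding "t6" gains no R-square; SAS additionally reports the partial R-square and a significance test at each step.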
Additional hints:
1. Determination of the rank order of items/variables that are significantly different depends on the appropriate choice of analysis: include a regular post-hoc test such as the SNK test or Duncan test if the item*manufacturer interaction is not significant.
2. Pre-test your data to see whether it meets the assumptions of the test: homogeneity of variance and the assumptions of parametric statistics, such as normality of distributions.
3. Consider a factor analysis to justify grouping multiple highly correlated features/items into a measure (before your test of internal consistency).
4. You may also consider drafting a tree-branch chart to show the likely necessary analyses as an if-then type of consideration (include normality, homogeneity of variance, factor analyses, ANOVAs, post-hoc tests with and without interactions, correlations, and regressions (stepwise and final form); also consider discussing the error term and coefficients, and consider revising/rerunning the final regression model/equation (rerun without stepwise), choosing only those variables included at p …