Summary: Which test should I use?



Review of Different Options

The centerpiece of the second half of the course is the basic paradigm of statistical inference, which consists of these components:

1. Examine the data and your questions: what are the explanatory and response variables?
2. Choose an appropriate statistical analysis: how can we convert the data and question to a test statistic and P-value?
3. Draw conclusions: what do the null and significant results mean, respectively?

The second step is the most important part of statistical inference, and it is what you need to spend the most time reviewing before Test #3. In fact, one part of the exam simply asks you to decide which analysis is appropriate for each given scenario, without doing any analysis.

By now, we have seen all the major distributions used in statistical inference involving two or more samples (two-sample t-test, ANOVA) or two variables (correlation and the independence test). It might be a good time to take a look at all of them together (this chart is included in the Test #3 study guide):

- Two-sample proportions
- Two independent sample means
- Difference of two dependent samples (matched pairs)
- Analysis of Variance (ANOVA)
- Correlation test
- Goodness-of-Fit test
- Independence test

As you have probably noticed already, once you identify how the P-value or critical value is obtained from the relevant distribution, the rest of the hypothesis test is pretty much the same! So for most of the problem solving, you should really just focus on reading the problem carefully (which includes your own project), and carefully consider which analyses can be applied to what type of data.
(sometimes there is more than one choice!)

Deciding Which Test to Apply

To help organize the topics covered in Chapters 10-13, the following chart can be a helpful tool:

| Explanatory/Independent variable | Response/Dependent: Categorical | Response/Dependent: Quantitative |
|---|---|---|
| Categorical | C -> C. Example: two proportions, independence test | C -> Q. Two independent sample means, difference of matched pairs, ANOVA |
| Quantitative | | Q -> Q. Correlation |

Here are some scenarios based on the examples throughout this course:

On average, are male students taller than female students?
   Explanatory variable: Gender (categorical)
   Response variable: Height (quantitative)
   Appropriate analysis: two independent sample means (since the male and female students were not matched in our data)

On average, are students scoring higher on Test #2 than on Test #1?
   Explanatory variable: Test (categorical)
   Response variable: Score (quantitative)
   Appropriate analysis: difference of two dependent samples (since for each student, the scores are matched as a pair)

In the first two C -> Q scenarios above, the two samples can be either independent or dependent. Since the two cases are based on different hypotheses and different ways of calculating the t-statistic, you will need to carefully consider the design of the study: if the samples were deliberately matched, then the proper analysis is to treat them as dependent and find the difference of each pair; otherwise, you will need to treat them as independent and compare the two means.

Is there any difference in the age of students who have different attitudes towards reading the newspaper?
   Explanatory variable: Paper (used as categorical)
   Response variable: Age (quantitative)
   Appropriate analysis: ANOVA (since there are 5 categories in the Paper variable)

Do more male students prefer Blue as their favorite color than female students?
   Explanatory variable: Gender (categorical)
   Response variable: Favorite Color (binary: Blue or Not Blue)
   Appropriate analysis: two-sample proportions (since we are explicitly testing whether one proportion is higher)

Do taller students usually wear larger shoes?
   Explanatory variable: Height (quantitative)
   Response variable: Shoe Size (quantitative)
   Appropriate analysis: correlation

Does gender affect the color preferences of male and female students?
   Explanatory variable: Gender (categorical)
   Response variable: Favorite Color (categorical, many colors)
   Appropriate analysis: independence test

The only test missing from this list of examples is Goodness-of-Fit, because it only involves counting the frequencies of one variable. Suppose we had the same type of ordinal data (on a scale of 1-5) on attitudes towards Mathematics from the general public. We could then, in principle, test the following hypothesis using the Goodness-of-Fit test:

Does the attitude towards Mathematics among Math 15 students differ from the attitude of the general public?
   Categorical variable: Math
   Expected distribution: Attitudes towards Math from the general public
   Appropriate analysis: Goodness-of-Fit

Before you decide which test should be applied to your project, you should examine how your data is recorded (an exercise that I asked you to incorporate in your prelim project). For example, suppose you are looking at whether AGE has an effect on the amount of sleep people get per day. If you ask people to select their age from a list of choices (10-19, 20-29, etc.), you are coding AGE as a categorical variable, and your tests are limited to either a t-test for two-sample means (if you are comparing two groups) or ANOVA (if you have multiple groups).
However, if you ask people to enter their exact age in the survey, then you will be able to perform a correlation analysis, since now both AGE and SLEEP are coded as quantitative variables.

By identifying the nature of the independent/explanatory variable and what you are trying to explain (the response variable), you can fairly quickly narrow your choices. It's important to recognize that although people often use the term "correlation" loosely to refer to all of the various combinations listed above, correlation analysis only applies when you have two quantitative variables.

Additional Resources

This video also includes some additional tips that might be helpful (it didn't include ANOVA, however).

For those of you who may be taking more statistics in the future, it might be a good eye-opener to look at the full range of statistical analyses used in various disciplines, as well as some guidelines on which ones to use.

The best review exercise that I can recommend for Test #3 is simply looking at a data set and some relevant questions, and asking yourself how you would analyze the data. If you have data already available, I would recommend that you start there. In addition, the Heart Study data I posted in the forum, as well as the Test #3 review, are also useful for helping you clarify your thinking. If you have any questions about Test #3, please post them in the forum and we'll use them for general discussions before the exam.
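As a study aid, the chart under "Deciding Which Test to Apply" can be sketched as a small Python function. This is only an illustration of the decision logic, not part of the course materials: the function name `choose_test` and its parameters (`groups`, `response_levels`, `paired`) are hypothetical, and the rule that two groups with a binary response leads to two-sample proportions is a simplifying assumption (as noted earlier, sometimes more than one test applies).

```python
def choose_test(explanatory, response, groups=2, response_levels=2, paired=False):
    """Suggest an analysis from the variable types, following the course chart.

    explanatory, response: "categorical" or "quantitative"
        (pass explanatory=None when there is only one categorical variable
        being compared with an expected distribution).
    groups: number of categories in the explanatory variable (assumed name).
    response_levels: number of categories in a categorical response (assumed name).
    paired: True if the two samples were deliberately matched.
    """
    # One categorical variable compared with an expected distribution.
    if explanatory is None:
        if response == "categorical":
            return "goodness-of-fit test"
        raise ValueError("goodness-of-fit needs a categorical variable")

    # C -> C: two proportions or an independence test.
    if explanatory == "categorical" and response == "categorical":
        if groups == 2 and response_levels == 2:
            return "two-sample proportions"
        return "independence test"

    # C -> Q: ANOVA for more than two groups; otherwise check the study design.
    if explanatory == "categorical" and response == "quantitative":
        if groups > 2:
            return "ANOVA"
        if paired:
            return "difference of matched pairs"
        return "two independent sample means"

    # Q -> Q: correlation.
    if explanatory == "quantitative" and response == "quantitative":
        return "correlation"

    # Q -> C is not covered by the chart.
    raise ValueError("combination not covered by the chart")


if __name__ == "__main__":
    # Gender -> Height, students not matched
    print(choose_test("categorical", "quantitative"))
    # Test -> Score, each student's scores form a pair
    print(choose_test("categorical", "quantitative", paired=True))
    # Paper (5 categories) -> Age
    print(choose_test("categorical", "quantitative", groups=5))
    # Height -> Shoe Size
    print(choose_test("quantitative", "quantitative"))
```

Recoding a variable changes the answer, just as in the AGE and SLEEP example: AGE recorded in brackets goes through the categorical branch (t-test or ANOVA), while exact AGE goes through the quantitative branch (correlation).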