Testing group differences with regression analysis



Group Differences in Prediction in Aging Research:

Statistical, Methodological, and Conceptual Issues

Jason T. Newsom

Holly G. Prigerson

Richard Schulz

Charles F. Reynolds III

Draft: 7/2/01

Jason T. Newsom, Institute on Aging, Portland State University; Richard Schulz, Department of Psychiatry and University Center for Social and Urban Research, University of Pittsburgh; Holly G. Prigerson, Yale University School of Medicine; and Charles F. Reynolds III, Department of Psychiatry, University of Pittsburgh School of Medicine.

Address correspondence to Jason T. Newsom, Institute on Aging, Portland State University, P.O. Box 751, Portland, OR 97207-0751, newsomj@pdx.edu.

We thank Steve West, Scott Beach, and Beth Green for comments on an earlier draft of this manuscript. Support for this paper was provided in part by MH52247, MH37869, MH00295, MH01100, and MH46014 from the National Institute of Mental Health and AG13305 and AG15159 from the National Institute on Aging.

Abstract

This article discusses the methodological and statistical issues relevant to the two common approaches to analyzing group differences in prediction in gerontology research. Two gerontological studies, one focusing on caregiving and one focusing on sleep and depression, are used to contrast the two analytic approaches and illustrate interpretive and statistical problems with some currently published work. One method, the subgroup regression approach, which involves splitting the study sample into two or more groups and conducting separate regression analyses, is found to have several important limitations. The moderated regression approach, which involves testing the significance of a product term, is described and recommended as a solution to the problems with the subgroup approach. It is argued that all hypotheses about group differences can be conceptualized as interaction hypotheses and that the most appropriate method of testing these hypotheses is the moderated regression approach.

Gerontological researchers often ask questions about whether certain variables differentially predict an outcome in two groups. Some examples of such hypotheses might include the following: "Does perceived control predict well-being for those with osteoarthritis but not for controls?", "Are the risk factors for depression the same for Blacks and Whites?", "Is health a more important predictor of life satisfaction for those under 65 than for those over 65?", and "Does social support have a stronger relationship to depression for caregivers versus noncaregivers?". Although it is common to think of these types of hypotheses as pertaining to group differences, each posits an interaction or "moderator" effect (Saunders, 1955, 1956). Baron and Kenny (1986) define a moderator variable as one that "affects the direction or strength of the relation between an independent or predictor variable and a dependent or criterion variable" (p. 1174). In the osteoarthritis hypothesis, for example, the arthritis group variable moderates the effect of perceived control on well-being, because the effect of perceived control depends on whether participants have arthritis.

Hypotheses that take this general form are analyzed by researchers in a number of different ways. One of the more common analytic approaches to these hypotheses is Analysis of Variance (ANOVA). ANOVA is only appropriate, however, when both independent variables are naturally categorical (e.g., gender or marital status) or manipulated experimentally (e.g., interference vs. non-interference conditions in a memory experiment). Many times, researchers modify naturally continuous independent variables by splitting them at the median or other cutoff points in order to use this more familiar analytic approach. Although this approach may be more convenient, there are a number of serious difficulties with ANOVA under these circumstances including important losses of power and the possibility of overlooking curvilinear effects. Because other authors have focused specifically on the appropriate circumstances for applying regression versus ANOVA (West, Aiken, & Krull, 1996), however, these issues are not the primary focus of the present article.

A second approach to testing hypotheses about group differences in prediction, which we refer to as “subgroup regression analysis,” is also commonly applied in the gerontology literature but has received little or no attention from gerontologists or methodologists (for an exception, see Hardy, 1993). In the subgroup regression approach, separate regression analyses are conducted for each group using the continuous independent variable as a predictor. Researchers are often interested in the risk factors that may be different for subgroups in the study, but these hypotheses can be conceptualized and analyzed in terms of moderator or interaction effects.

The purpose of the present article is to consider the statistical, methodological, and conceptual issues involved in studying group differences in prediction. Although a number of authors have discussed issues relevant to moderator effects, to date, no authors have examined issues related to the commonly used subgroup regression approach. Moreover, a review of published articles in gerontology suggests that recommendations by statisticians regarding tests of interactions are only sometimes adopted by researchers publishing in gerontological journals.

The Prevalence of the Subgroup Regression Approach

The subgroup regression approach is not exclusively applied in gerontology research, but the approach seems especially common in our field. One reason for its popularity in our discipline is the focus on health conditions, age, and chronic stressors (e.g., caregiving) which are often hypothesized to moderate the effects of other variables. To investigate how frequently the subgroup regression approach is used, we reviewed articles in several prominent journals in gerontology. Of a total of 416 empirical articles published in 1995 and 1997 in The Gerontologist, Journals of Gerontology: Series B (Psychological and Social Sciences), Psychology and Aging, and International Journal of Aging and Human Development, 11.7% (49) used regression analysis to investigate moderator hypotheses of one form or another.[i] Of these articles, 57.1% (28) tested multiplicative interaction terms between continuous variables,[ii] and 42.8% (21) of authors split their sample into two or more groups and conducted separate regressions. It is clear from this review that the subgroup approach is commonly used in gerontological research and that a careful consideration of the potential problems is warranted.

Research Examples

To illustrate, we will draw on two research examples. One example is based on a study of gender differences in predictors of depression in widowed elders. Fifty-four spousally-bereaved participants with a mean age of 68 were recruited through a local psychiatric hospital for participation in a study of physiological aspects of sleep, depression, and bereavement (Beery et al., 1997; Prigerson, et al., 1998). Eighteen of the participants were male and 36 were female. All participants were given a battery of psychological measures including depression, anxiety, medical burden, activity level, sleep quality, and social support. Our examples, however, will only involve a subset of these measures: anxiety (Brief Symptom Inventory-Anxiety subscale: BSI; Derogatis, 1983), Activity Level Index (ALI: Monk, Flaherty, Frank, Hoskinson, & Kupfer, 1990), sleep quality (Pittsburgh Sleep Quality Index: PSQI; Buysee, Reynolds, Monk, Berman, & Kupfer, 1989) and depression (Hamilton Rating Scale for Depression; Hamilton, 1960).

The second research example involves data from the National Caregiver Health Effects Study (CHES; Schulz, Newsom, Mittelmark, Burton, Hirsch, & Jackson, 1997), an ancillary study of participants in the Cardiovascular Health Study (CHS; Fried et al., 1991). The CHES is a multi-site study examining psychiatric and physical health effects of caregiving. Eight hundred nineteen married couples, with an average age of 77, were drawn from the larger population-based CHS in four states in the US. Participants completed a broad range of measures, such as depression, service utilization, spouse's need for assistance, and amount of assistance provided. In this article, we will focus on a subset of these variables, including the amount of assistance caregivers provided with activities of daily living, the average amount of emotional strain experienced in providing that assistance, the number of hours spent providing care (Gilhooly, 1984), neuroticism (Eysenck, 1956), and caregivers' depression symptoms as measured by the Center for Epidemiologic Studies Depression scale (CES-D: Radloff, 1977).

Before beginning, it is important to note that our examples and discussions operate under several assumptions. First, although interaction tests with regression can incorporate curvilinear effects, our discussions and examples are restricted to linear relationships. Second, we assume that categorical variables are truly dichotomous and do not represent an underlying continuous variable. Third, for simplicity, we limit discussion to categorical variables with two levels (i.e., the two-group case) and interactions involving only two variables, although the concepts and statistical issues are applicable to studies with more than two groups.

Subgroup Approach

Researchers frequently examine separate regression equations for two subgroups of their sample. For instance, a researcher might be interested in how traditional predictors are related to depression for women and men. Indeed, this method is commonly used by gerontology researchers who have questions about group differences. Despite its widespread use, the subgroup approach is rarely the optimal method of testing research questions. Most often this approach provides less information and is less powerful than the moderated regression approach, and it also may lead to serious statistical or interpretation problems.

Sampling Variability

The most important limitation to the subgroup approach is often overlooked by researchers. It is common practice to run separate regressions in each group and look for differences in the size or significance of the coefficients. It is extremely rare, however, that between-group differences in coefficients are tested for significance. Even if a given predictor is significant in one group and nonsignificant in the other, the coefficients may not differ significantly between groups. An observed difference between the size of the coefficients is not reliable without taking sampling error into account. In other words, a single predictor may differ in size and/or significance between two groups simply as a result of chance.

In a more realistic illustration, we examined the hypothesis that activity level (ALI) would be a stronger predictor of depressive symptoms for women than for men. We began by regressing Hamilton Rating Scale for Depression (HRS) scores on activity level separately for women and men. We also included age, education, sleep quality (PSQI), and anxiety as predictors. Thus, we were interested in how activity level predicted non-anxiety depression symptoms six months after the loss of a spouse. As can be seen in Table 1, activity was a significant predictor of depressive symptoms for females (b = -.114, t = 2.00, p = .05). This relation did not hold for males, however (b = -.071, t < 1, ns). Given this outcome, one might assume that activity level is a stronger predictor of depression in women than in men, but, as we will see below, this is not necessarily the case. This analysis provides no information about whether the activity-depression relationship might differ simply due to sampling error. In other words, we do not know if b = -.114 is significantly different from b = -.071.

Confounds with Group Membership

When regression analyses are compared in two different groups, variables associated with group membership are not adequately controlled. Differences between groups may not be a function of group membership, but of some variable that is partially or fully confounded with the grouping variable. Even if confound variables are included in the separate regression analyses, only their variation within each group is controlled. Thus, researchers may conclude that a difference is a function of group membership when, in fact, it is actually due to a variable correlated with group membership. A simple example would be concluding that differences exist between two racial groups without controlling for differences in income.

Figure 1 illustrates the general point. In the Figure, the variable Z is completely confounded with group membership (distinguished by filled and unfilled circles). Within each group, there is no relationship between the confound variable, Z, and the outcome, Y (depicted by the solid lines). However, there appears to be a strong relationship between Z and Y when both groups are included as depicted by the dotted line. This example illustrates that regardless of the relationship between a covariate and the outcome variable, an entirely different relationship may exist between the covariate and the outcome variable when looking across the groups.

As an illustration, we used the CHES data to examine gender differences in the relationship between the amount of assistance the caregiver provides to his or her spouse and the caregiver's depression level. Separate regressions for male (n = 389) and female (n = 409) caregivers predicting CES-D scores were conducted. To control for personality differences in caregivers' reports of the amount of assistance provided, we also included neuroticism in each model. Results suggested that the relationship between caregiving assistance and depression was stronger for females (b = .174, t = 3.793, p < .01) than for males. When the gender difference is tested as an interaction, however, the interaction is considerably reduced and no longer reaches conventional significance levels once neuroticism is included as a covariate (b = -.111, t = 1.865, p = .06). The apparent gender difference in the relationship between the amount of assistance and depression was due to a partial confounding of neuroticism with gender. Only after controlling for neuroticism across groups, however, was it possible to discover the confounding of gender and neuroticism.

Dichotomization

In many cases the subgroup approach involves a comparison of groups that are artificially created. It is customary, for instance, to split a sample at the median or a theoretical cutpoint of a measure. This type of sample splitting results in a substantial loss of information as well as an important loss of power, and, therefore, researchers should avoid artificial groupings in the analyses.

A number of authors have pointed out the loss of power that stems from artificially dichotomizing variables in the context of correlation analysis (Bissonnette, Ickes, Bernstein, & Knowles, 1990a, 1990b; Bollen & Barb, 1981; Cohen & Cohen, 1983, 1990; Humphreys, 1978; Humphreys & Fleishman, 1974; Maxwell & Delaney, 1993; Maxwell, Delaney, & Dill, 1984; McNemar, 1969; Peters & Van Voorhis, 1940). Assuming normally distributed variables initially, splitting one of the variables at the median (or mean) will result in an r2 that is 64% of the value that would be obtained if both variables remain continuous (Cohen, 1983; Peters & Van Voorhis, 1940). For example, an original r2 of .4 will be reduced to .26. If both variables are dichotomized, the original r2 value will be reduced to approximately 40% of its original value. If one creates unequal groups by using a cutpoint other than the median (e.g., using the mean in a non-normal sample or a theoretical cutpoint that does not fall on the median), the reduction in r2 is greater. These reductions reflect the important loss of information when all subjects on each side of the cutpoint are treated as equivalent.
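This attenuation is easy to verify directly. The following minimal sketch, written in Python, is not part of the original analyses; the population correlation and variable names are illustrative assumptions chosen so that the continuous r2 is roughly .40. It simulates bivariate normal data, splits one or both variables at the median, and compares the resulting r2 values with the continuous-variable r2.

import numpy as np

rng = np.random.default_rng(42)
n = 100_000
rho = 0.632  # population correlation chosen so that r2 is roughly .40

# Draw bivariate normal data with correlation rho.
x, y = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n).T

def r2(a, b):
    return np.corrcoef(a, b)[0, 1] ** 2

r2_cont = r2(x, y)
r2_one = r2(np.where(x > np.median(x), 1, 0), y)
r2_both = r2(np.where(x > np.median(x), 1, 0),
             np.where(y > np.median(y), 1, 0))

print(f"continuous r2:   {r2_cont:.3f}")  # roughly .40
print(f"one median split: {r2_one:.3f}")  # roughly 64% of the continuous value
print(f"both split:       {r2_both:.3f}")  # well under half of the continuous value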

The CHES caregiver study provides a good example of this phenomenon. Caregiving is often defined as an all-or-none condition, in which those who provide care are treated equivalently in analyses. Caregivers, for example, may be compared to non-caregivers on depression scores (e.g., Russo, Vitaliano, Brewer, Katon, & Becker, 1995). Schulz et al. (1997) have argued that caregiving should not be considered a simple dichotomy, because the amount of care provided and the amount of emotional strain reported varies greatly across individuals. To illustrate this point, we computed the correlation between the number of hours per day that participants helped their spouse with daily activities (i.e., IADL or ADL difficulties) and depression (CES-D scores). The correlation between hours helping and depression was .24 (r2 = .058) when both measures were considered to be continuous. When both measures were dichotomized (CES-D score of 16 or greater and caregiving divided into non-helping and helping groups), the correlation between the measures was .16 (r2 = .025). The percentage of variance accounted for between the dichotomized variables was approximately 43% of that accounted for when the variables were treated as continuous.

In univariate analyses, dichotomization represents a threat to power, but when such variables are used as covariates in multivariate analyses, an overestimate of the strength of predictors and spurious significant effects may occur (Maxwell & Delaney, 1993). Because a dichotomized variable will be a less precise measure of its underlying construct, it will not serve as an adequate control. Provided the covariate is also correlated with the dependent variable, predictors with which the covariate is correlated will have inflated estimates and higher rates of Type I error. This phenomenon occurs to a greater extent as the number of subjects increases and as the correlation between the dichotomized covariate and the predictor of interest increases. These problems suggest that artificial dichotomization leads to lower power and potential problems when searching for group differences.

In the context of the subgroup approach, splitting the sample is also likely to lead to a considerable loss of power due to the reduced sample size in each subgroup. To illustrate, we estimated power to detect significance in a series of hypothetical tests when a sample is split in half (Borenstein & Cohen, 1988). Each estimate was based on the significance test of a single slope when six independent variables were in the equation (i.e., five covariates). Assuming an increase in R2 of .1 (i.e., a standardized regression coefficient of approximately .33), the decrease in power when an original sample of 100 is split into two equal groups ranges from 8% to 29%. This range depends on the overall amount of variance accounted for by the model.[iii] Smaller samples can result in even more dramatic decreases in power. Assuming a total sample size of 50 and a cumulative R2 of .2, for instance, power will be reduced by 50%. When a standardized cutpoint is used or natural groups exist, sample sizes in the subgroups will often be unequal. In such cases, power also will be unequal in the two groups and differences in significance in the two groups may reflect sample size differences as well as effect size differences.
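The general form of such a power calculation can be reproduced without specialized software. The brief Python sketch below is a rough approximation rather than the Borenstein and Cohen (1988) program we used; it assumes the standard noncentrality approximation lambda = f2(u + v + 1), where f2 is the effect size for a single slope (u = 1 numerator degree of freedom) with six predictors in the equation and v denominator degrees of freedom.

from scipy.stats import f as f_dist, ncf

def slope_power(n, k, delta_r2, r2_full, alpha=0.05):
    # Approximate power for the test of one slope among k predictors,
    # using f2 = delta_r2 / (1 - r2_full) and the noncentral F distribution.
    f2 = delta_r2 / (1.0 - r2_full)
    u, v = 1, n - k - 1
    lam = f2 * (u + v + 1)
    f_crit = f_dist.ppf(1 - alpha, u, v)
    return 1 - ncf.cdf(f_crit, u, v, lam)

# Values from the example in the text: an increase in R2 of .1, six predictors,
# cumulative R2 of .2, and a total sample of 50 split into two groups of 25.
print(round(slope_power(n=50, k=6, delta_r2=0.10, r2_full=0.20), 2))  # full sample
print(round(slope_power(n=25, k=6, delta_r2=0.10, r2_full=0.20), 2))  # one subgroup

Under these assumptions the subgroup power is roughly half of the full-sample power, consistent with the 50% reduction noted above.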

Comparing Groups of Unequal Sample Size

Frequently the sample is divided into groups based on a theoretical cutpoint, such as a clinical depression threshold. Such cutpoints most often lead to unequal sample sizes, particularly in community populations. With the subgroup approach, conclusions about group differences based on differences in significance or "significance levels" may be erroneous unless sample size is also taken into account. Because tests of coefficients will be more powerful in the larger group, one is more likely to find significant effects (and smaller p values) in the larger group.

Sample size also affects the stability of the coefficients, and the stability of the coefficients will be unequal if the sample sizes in the two groups are unequal. Regression coefficients are consistent, meaning that sample slopes are better estimates of the slope in the population when the sample is large. Conversely, regression slopes derived from smaller samples will be poorer estimates of the population slope. Because the subgroup method will often involve subsamples of different sizes, regression slopes in the two subsamples will differ in how well they represent the slopes in the population. In a sense, the relationship between a particular predictor and an outcome will be more "generalizable" to the population in the larger subsample than in the smaller subsample.

This difference in generalizability can also be characterized by the degree to which the coefficients are expected to fluctuate from sample to sample. The gender and activity example, in which there are twice as many females as males, illustrates how sample size plays a part in the sampling variability of the regression coefficient. As can be seen in Table 1, the standard error of the coefficient (i.e., SEb), which estimates sampling variability, is lower in the female group for every predictor. Similarly, differences in variability are reflected in the confidence intervals, which are narrower for all predictors in the female group.

Predictor Intercorrelation

There are additional difficulties in interpreting subgroup results when more than one predictor is included in the analysis. The value of a particular regression coefficient depends, in part, on the relation of the predictor to other predictors (i.e., covariates) in the equation and their relation to the dependent variable. A predictor variable may have an equivalent association with the dependent variable in the two groups, but if the relation between a covariate and the dependent variable differs in the two groups, the predictor variable may appear to be different in the two groups. In such cases, group differences in the covariate are actually responsible for the apparent group differences in the predictor variable.

Other illusory relations may also occur. Although two variables may predict the dependent variable equally in the two subgroups, the correlation between the two predictors may differ in the two subgroups. In this case, the intercorrelation between the predictors may produce apparent group differences in one or both of the regression coefficients. Especially when analyses involve more than two predictors, interpreting group differences with the subgroup approach is quite complex, because the value of a predictor depends on its association with all other predictors and their associations with the criterion.

Another problem results from the interdependence of regression coefficients when the subgroup method is used. Separate regression analyses involve within-group, predictor intercorrelations that reflect both sampling error and true relations in the population. When a study sample is divided into two groups, the pattern of correlations among predictors will differ somewhat as a result of sampling error. The sampling error in the predictor intercorrelations that results from creating two subgroups will add sampling error to the regression coefficients, because each coefficient depends on these intercorrelations. Similarly, subsample fluctuations in the correlations between predictors and the criterion will add sampling variability to any individual predictor. In general, with the subgroup method, these problems are only avoided to the extent that predictor intercorrelations consist of true differences, the predictor correlations are equal to zero, or the predictor-criterion relations are zero. In practice, this will rarely be the case for all the predictors in the equations.

Standardized Solutions

Standardized solutions are convenient because they provide a “common metric,” allowing rough comparisons of measures that have very different scales. The standardized solution is commonly used in conjunction with the subgroup approach to gauge the relative effects of a set of predictors within each subgroup. Between-group comparisons based on the standardized solution, however, are problematic whenever subsample variances are unequal (e.g., Kim & Mueller, 1976). Even with nonsignificant differences in variance for a particular variable, the influence of the within-group variances of a variable on the within-group standardized coefficients may lead to incorrect conclusions about the differences in the importance of that predictor in the two groups.

As demonstrated by Kim and Ferree (1981), the true relation between predictor and criterion may be identical in two groups, but if the variance of the predictor differs in the two groups, standardized coefficients will differ. Under certain conditions standardized and unstandardized solutions may lead to different conclusions when comparing the relative size of coefficients. This can be illustrated by examining the standardized and unstandardized coefficients for sleep quality (PSQI) in the bereavement sample (Table 1). If one compares the values of the unstandardized coefficients, PSQI appears to be an equal predictor of depression for males and females (b = .915 and b = .918, respectively). A glance at the standardized solution, however, suggests that sleep disturbance may be more strongly related to depression for females (β = .644) than males (β = .526). In other instances, the standardized and unstandardized coefficients may even lead to opposing conclusions about the relative strength of prediction in the subgroups.

In practice, it will rarely be the case that variables included in the analysis will all have identical variances in the subgroups. Because the variances of the variables are not typically taken into account when standardized solutions are evaluated, researchers may be misled when making inferences about whether standardized coefficients differ across groups.

In summary, the subgroup approach to exploring differential prediction is laden with a number of statistical and interpretational limitations of which many researchers may be unaware. The limitations of the approach include: the tendency to neglect sampling error in group comparisons of regression coefficients, the inability to distinguish differences due to group membership from differences due to confounds with group membership, the loss of information that results from artificial dichotomization of continuous measures, the loss of power within groups due to sample size reduction, difficulty interpreting the effects of covariates on partial regression coefficients, and difficulty interpreting standardized solutions. These limitations suggest that the subgroup approach is not appropriate for answering questions about group differences. If the researcher's goal is to make comparisons about the relative magnitude or significance of predictors across subgroups, the subgroup regression approach does not provide the statistical tests needed to support such comparisons.

Moderated Regression Approach: A Recommended Solution

All of the limitations of the subgroup approach to examining group differences which were noted previously can be overcome by explicitly testing interactions using moderated regression. This approach involves exploring interactions among two or more variables. A significant interaction suggests that the effect of one predictor on the criterion variable depends on the effect of a second predictor. A two-way interaction can involve two continuous variables, a continuous and a dichotomous variable, or two dichotomous variables. A significant interaction involving a continuous and a dichotomous variable, which we focus on here, suggests that the direction or magnitude of the relation between the continuous variable and the outcome is different in the two levels of the dichotomous variable. In other words, there is a significant group difference in the effect of the continuous predictor.

In the bereavement study, we might test the hypothesis that activity level is a more important predictor of depression for women than it is for men. The moderated regression approach tests whether the difference between the two regression slopes is due to sampling error. Recall that when males and females were examined separately (Table 1), activity was a significant predictor of depressive symptoms for females (b = -.114, t = 2.00, p = .05), but did not reach significance for males (b = -.071, t < 1, ns). That analysis, however, did not provide any information on whether females and males differed significantly from each other.

The first step in moderated regression analysis is to “center” the continuous predictor used in the interaction test by subtracting the sample mean from the variable for each subject. In the bereavement example, the activity variable was centered by subtracting the mean for the entire sample (i.e., M = 82.596) from each individual’s score.[iv] Then, one should ensure that the dichotomous variable is coded as 0 and 1 (here, females = 0, and males = 1).[v] A product term variable is then computed by multiplying the two predictor variables together. The interaction is tested by running a regression model that includes the two predictor variables and the new product term.
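To make these steps concrete, a minimal sketch of the procedure is given below in Python using the statsmodels package (rather than the SPSS syntax we used for our analyses). The file name and variable names are placeholders and do not correspond to the actual study data file.

import pandas as pd
import statsmodels.api as sm

# df is assumed to hold the analysis variables; names are placeholders.
df = pd.read_csv("bereavement.csv")  # hypothetical file

# Step 1: center the continuous predictor at the full-sample mean.
df["activity_c"] = df["activity"] - df["activity"].mean()

# Step 2: make sure the dichotomous variable is coded 0/1 (0 = female, 1 = male).
df["male"] = df["male"].astype(int)

# Step 3: form the product term from the centered predictor and the group code.
df["act_x_male"] = df["activity_c"] * df["male"]

# Step 4: regress the outcome on both first order effects, the covariates,
# and the product term; the t-test of act_x_male is the interaction test.
X = sm.add_constant(df[["activity_c", "male", "act_x_male",
                        "anxiety", "sleep_quality", "age", "education"]])
model = sm.OLS(df["depression"], X).fit()
print(model.summary())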

Evaluating the interaction is only considered appropriate when the two “main effects” are included in the equation (Cronbach, 1987; Stone & Hollenbeck, 1984). This parallels the ANOVA approach that simultaneously tests the main effects and interaction (Cohen & Cohen, 1983, pp. 308-311). Although the “main effects” in moderated regression are similar to those in ANOVA, we follow others (e.g., Aiken & West, 1991; Darlington, 1990) in referring to them more generally as “first order effects.” As is the case with ANOVA, caution is typically recommended when interpreting the first order effects in the presence of an interaction.

“Centering” the variables has the effect of reducing multicollinearity between the first order terms and the product term (Marquardt & Snee, 1975; Marquardt, 1980; see also Aiken & West, 1991; Cronbach, 1987). When the centering method is used, the interaction coefficient will be identical to that from an analysis that does not center the first order variables (Aiken & West, 1991). The first order coefficients will differ, however. The difference in the first order effects for the centered and uncentered approaches may be quite dramatic, differing in sign and/or in significance (West et al., 1996). Although the scaling of the x-axis will be changed, the appearance of the plotted interaction will not differ for centered and uncentered solutions.

After centering the activity variable, a product term for activity and gender was calculated, and activity level, anxiety, sleep quality, age, education, and the interaction term were entered into the equation together. Results indicated there was no significant interaction between gender and activity level on depression (b = .063, t < 1, ns). Thus, the separate regression slopes for females and males (b = -.114 and b = -.071, respectively) were not significantly different from one another.

If the interaction had been present, significance tests of “simple slopes” and corresponding plots would have been conducted to probe the nature of the interaction. A simple slope involves the relation between one of the predictors and the criterion at a particular value of the other predictor. The slopes of one predictor variable plotted at particular values of the second predictor (or moderator variable) will be nonparallel if the interaction is significant. If the interaction involves a continuous predictor and a dichotomous predictor, as in the gender-activity example, one set of simple slopes examines the relation between the predictor and the dependent variable separately for each of the groups. For instance, had the interaction between gender and activity level been significant, the simple slopes would be the test of whether activity was a significant predictor for men and for women.

Choosing a Coding Scheme

When investigating the interaction of a continuous and a dichotomous variable, the continuous variable should always be centered, but there are three possible ways to code the dichotomous variable. The choice of coding scheme depends on the researcher’s hypotheses, nature of the comparison groups, their sample sizes, and the sampling method of the study.[vi] Results from the gender-anxiety interaction are found in Table 3, using each of the three coding options.

Dummy. The most commonly used approach is dummy coding. This method recodes the dichotomous predictor to 0 and 1.[vii] The group coded zero represents the base group for the comparison. Dummy coding is most useful when a clear comparison group exists. For example, if a group receiving cognitive therapy is compared to a waiting list control group, the wait-list control group most naturally serves as the basis of comparison and is coded 0. Notice that the interaction test is identical to that obtained with the weighted effects coding scheme, because dummy and weighted effects codes are linear transformations of one another. The coefficient for the continuous variable represents the effect when the dichotomous variable is equal to zero, whereas the coefficient for the dichotomous variable represents the effect at zero (the mean) of the continuous variable.

Weighted effects. The second coding scheme, weighted effects coding, involves proportional coding based on the number of cases in each of the groups. When there are only two groups, a simple method can be used. The group originally coded 1 with dummy coding remains the same. The code for the group originally coded 0 (i.e., the referent group) is computed by taking the ratio of the two sample sizes and multiplying by negative one (i.e., -n1/n0, where n1 is the number of cases in the original 1 group and n0 is the number of cases in the original 0 group). For example, assume the original codes used are 0 for females and 1 for males. Because there are 34 females and 18 males, the code for the male group remains equal to 1, and the code for the female group is equal to negative one times 18 divided by 34, or -.529. This coding scheme weights the groups proportionally so that the codes for the total sample sum to 0.[viii]

With the weighted effects coding method, the regression coefficient represents the deviation of the group mean from the weighted grand mean. As with unweighted effects coding, weighted effects coding is used when there is no clear comparison group, but, as West et al. (1996) argue, weighted effects coding is most appropriate when random sampling is used and the researcher wishes to generalize to groups in the population with sizes proportionate to those obtained in the sample. [ix]

Unweighted effects. The third method of coding, referred to as unweighted effects coding, uses -1 and +1 instead of 0 and 1. Each group is weighted equally with respect to sample size. When sample sizes are equal, results with unweighted and weighted effects coding are equivalent. Unweighted effects codes should be used when there is no clear comparison group and, as West et al. (1996) argue, when the researcher assumes that the sample sizes in the population are approximately equal.
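For readers who want to see the three schemes side by side, the short sketch below (again in Python, using the group sizes from the bereavement example; the variable name is ours and illustrative only) constructs all three sets of codes from a 0/1 gender variable.

import numpy as np

# 0 = female, 1 = male; group sizes from the bereavement example.
male = np.array([1] * 18 + [0] * 34)
n1, n0 = (male == 1).sum(), (male == 0).sum()

dummy = male                                    # 0 / 1
unweighted = np.where(male == 1, 1.0, -1.0)     # +1 / -1
weighted = np.where(male == 1, 1.0, -n1 / n0)   # +1 / -18/34 = -.529

# Weighted effects codes sum to zero over the whole sample.
print(dummy.sum(), unweighted.sum(), weighted.sum())  # 18, -16, and (approximately) 0

Whichever scheme is chosen, the product term must be recomputed from the chosen codes before the interaction model is estimated.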

Comparing results from the three coding schemes. To illustrate the use of the various coding methods, we tested whether there was a significant interaction between anxiety and gender--whether the anxiety coefficient for males was significantly different from the anxiety coefficient for females. The full results of the test of the interaction, which involved age, education, sleep, and activity level as covariates, are presented in Table 2. As observed in the table, the interaction between gender and anxiety was significant. A comparison of the dummy, weighted, and unweighted effects tests can be found in Table 3. In all cases, results indicated a significant gender by anxiety interaction. As observed in the table, however, the unstandardized interaction coefficients differ across the three methods. This difference is a reflection of the different meaning given to the interaction by the choice of coding scheme when the group sample sizes are unequal.

An examination of the first order effects for anxiety also illustrates the differences in interpretation of the coefficients that result from the coding scheme used. Note that the first order effects for the dichotomous variable, gender, all have identical t-values (i.e., the significance tests are identical) regardless of the coding scheme used, but the t-values differ for the continuous variable. The first order effect for the dichotomous variable has different interpretations for the three different coding schemes. For dummy coding, the slope represents the difference between the two group means. For weighted and unweighted effects coding, the slope represents the average difference of the groups from the grand mean when the grand mean is calculated by weighting or not weighting by sample size, respectively.

The coefficient for the continuous predictor, anxiety, differs across the three coding schemes because of the differences in the interaction terms and the dichotomous predictor. Anxiety partials out the variance of the other covariates in each case, but its interpretation as a partial coefficient depends on the interpretation given to the other two predictors as a function of the coding scheme used. With dummy coding, the coefficient for the continuous predictor reflects the prediction of the dependent variable when the dichotomous variable equals zero. With weighted and unweighted effects coding, the coefficient for the continuous predictor reflects the average effect for men and women, where the "average" is weighted or unweighted according to sample size.

In sum, the choice of coding scheme primarily reflects the type of research question the researcher wishes to investigate. Although this choice has no effect on the significance test for the interaction, it can affect the interpretation of the coefficients. Dummy codes are most appropriate when the researcher wishes to make a comparison to a clear baseline or control group. Unweighted effects codes would be most appropriate when no such comparison group exists and the researcher wishes to generalize to group populations of equal proportion. Weighted effects codes are used when no clear comparison group exists and the sample sizes are thought to reflect the distribution in the population.

Exploring Significant Interactions by Testing Simple Slopes

When a significant interaction between a dichotomous and continuous predictor is obtained, the interaction indicates that the slopes of the continuous variable in the two groups are significantly different from one another. As is the case with ANOVA, a significant interaction alone does not provide enough information about the nature of the interaction (Mossholder, Kemery, & Bedeian, 1990). There are a number of forms that the interaction may take. The relation between the continuous predictor and the criterion may be positive in one group and negative in another, the continuous predictor may only be related to the criterion in one group but not the other, or the association between the predictor and outcome may be similar in the two groups but just different in magnitude. In order to explore the nature of the interaction, one needs to conduct simple effects tests.

Rather than splitting the sample and conducting separate regressions after finding a significant interaction, it is most appropriate and most powerful to examine simple effects in a way that takes the sample size and variability of the overall sample into account. Tests of simple slopes with moderated regression, then, follow the same logic as simple effects tests that follow a significant interaction with ANOVA. Just as conducting separate t-tests after a significant interaction with ANOVA is not appropriate because the error term is not based on the full sample (e.g., Keppel, 1991), conducting separate follow-up regressions is also not appropriate.

There are several reasons for conducting a simple effects analysis instead of separate regressions. First, simple effects tests should be more powerful because the full sample size is utilized in the significance test (i.e., df = N - k -1, where N is the total sample size and k is the number of predictors). Second, the simple slope tests provide accurate estimates of the standard errors for each group coefficient. Third, because group differences may not exist with all covariates, separate regression runs for each group would result in partial regression estimates that control only for the variability of covariates as they occur within each group. As discussed above, sampling variability in covariation among predictors and the covariate-criterion association will produce poorer estimates of the predictor slope when separate, within-group regressions are performed.

Simple slope estimates can be computed by hand, but can also be conveniently obtained from a simple computer program. The Appendix contains an example of a program written in SPSS syntax (SPSS for Windows, Release 9.0, 1997) that will compute raw simple slope estimates, standard errors, standardized simple slopes, t-tests, and plot points.[x] The plot points make plotting the interaction a simple procedure that quickly illuminates the nature of the interaction.

To use the program, one needs to obtain the following information from the regression run testing the interaction: the unstandardized regression coefficients for x, the continuous variable, z, the dichotomous variable, and xz, the interaction; the means of x and z (both should be zero or very near zero if the continuous variable is centered and weighted effects coding is used for the dichotomous variable); the standard deviations for x, z, and y, the dependent variable; the regression intercept (i.e., the constant); and the variance estimates of bx and bxz and the covariance estimate of bx with bxz. Here, bx, bz, and bxz refer to the unstandardized coefficients for x, z, and the interaction term, respectively. The variance estimates of bx and bxz and the covariance estimate of bx with bxz are obtained from the variance-covariance matrix of the unstandardized regression weights. Although not provided in the results by default, the variance-covariances of the estimates are available in most standard statistical packages if specified. It is important to note that values should be carried to as many decimal places as possible. The values of 0 and 1 are used because dummy coding was used for this analysis. If weighted or unweighted effects coding is to be used, 0 and 1 should be replaced by the unweighted (i.e., -1, +1) or weighted effects codes. If weighted effects codes are used, the codes will vary depending on the sample sizes of the two groups. If weighted effects codes were used for this example, 0 would be replaced with -.529 and 1 would remain the same. The program can also be used with continuous moderators, but the values 0 and 1 would be replaced by specific values of z, such as -1 sd and +1 sd (Cohen & Cohen, 1983), to examine simple effects of x at particular points of z.
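The arithmetic the program carries out can also be reproduced directly in other software. The sketch below is a Python translation of the same computations; the numeric values in the example call are illustrative placeholders, not estimates from the bereavement data, and should be replaced with values read from one's own output.

from math import sqrt
from scipy.stats import t as t_dist

def simple_slope(b_x, b_xz, var_bx, var_bxz, cov_bx_bxz, z0, df):
    # Simple slope of x at moderator value z0 and its significance test:
    #   slope = b_x + b_xz * z0
    #   var(slope) = var(b_x) + z0**2 * var(b_xz) + 2 * z0 * cov(b_x, b_xz)
    slope = b_x + b_xz * z0
    se = sqrt(var_bx + z0 ** 2 * var_bxz + 2 * z0 * cov_bx_bxz)
    t_value = slope / se
    p_value = 2 * t_dist.sf(abs(t_value), df)
    return slope, se, t_value, p_value

# Placeholder values; take the b's and (co)variances from your own output.
for z0 in (0, 1):  # dummy codes; substitute -1/+1 or weighted codes as appropriate
    print(z0, simple_slope(b_x=2.0, b_xz=5.0, var_bx=1.0, var_bxz=4.0,
                           cov_bx_bxz=-1.0, z0=z0, df=44))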

To determine whether anxiety was a stronger predictor for males or females and whether anxiety was a significant predictor in each group, the simple slopes for the interaction were tested using the program described above and then plotted in Figure 2. The tests of the simple slopes indicated that anxiety was a significant predictor of depression among males (b = 7.26, t(44) = 4.02, p < .001) but was not a significant predictor of depression among females (b = 2.28, t(44) = 2.01, p < .10). For both simple slopes, the degrees of freedom are df = N - k - 1 = 52 - 7 - 1 = 44, where N is the total sample size and k is the number of predictors in the interaction equation. As can be seen in the figure, plotting the points results in nonparallel simple regression lines. Although the simple slopes equal those obtained in the subgroup analysis, the error variances differ from those in Table 1. For males, the standard error for the anxiety coefficient is 1.81 in the simple slope analysis as opposed to 2.15 obtained with the subgroup analysis. For females, the standard error for the anxiety coefficient is 1.14 in the simple slope analysis as opposed to 1.23 in the subgroup analysis.

An alternative procedure to the one presented here (see Aiken & West, 1991; Darlington, 1990; Jaccard et al., 1990, for a detailed description), capitalizes on the fact that, with dummy coding, the coefficient for the continuous predictor is equal to the simple slope of the continuous predictor when the dichotomous variable equals zero. Note that the simple slope for females is equal to the first order effect for anxiety in the interaction test, because the coefficient for the continuous variable represents the effect of that variable when the categorical variable equals zero. Roughly described, this procedure involves recoding the dichotomous predictor, recomputing the interaction term, and rerunning the regression analysis, to obtain the second simple slope estimate. A similar procedure can be used with weighted effects or continuous variables by subtracting particular values (e.g., -1 sd, +1 sd) instead of the mean when variables are centered.

Most often researchers will be interested in probing the interaction by testing for the significance of the slope of the continuous variable in each group. On occasion, however, the researcher may be interested in whether there is a significant difference between two points on the regression lines that fall at a specific value of the continuous variable (Johnson & Neyman, 1936). Because this approach to understanding the nature of a significant interaction between a continuous and a dichotomous predictor is relatively rare and excellent discussions of the appropriate analytic methods are available elsewhere (Aiken & West, 1991; Huitema, 1980), we will not consider it further here.

Standardized solution

Although the unstandardized solution obtained after centering variables is correct, one needs to conduct a separate regression run which uses a pre-standardized method to obtain the correct standardized solution for the interaction term (Aiken & West, 1991; Darlington, 1990; Friedrich, 1982). Although first order effects will be identical in the standardized solution output and the pre-standardized method, the coefficient for the interaction produced in the standardized solution will be incorrect. In many cases, the degree of inaccuracy will be small, but as the correlation between the predictors increases the inaccuracy increases. To obtain the correct standardized slope for the interaction term, all variables in the regression equation, except the interaction term, and the dependent variable should be standardized prior to the second regression run. The interaction term should then be recomputed as the product of the two standardized scores (but the product term is not standardized after it is computed). The unstandardized solution then provides the correct standardized coefficient for the interaction.
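A minimal sketch of this pre-standardization procedure, continuing the Python/statsmodels illustration with placeholder file and variable names, is shown below; the essential point is that the product term is formed from already-standardized scores and is not itself re-standardized.

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("bereavement.csv")  # hypothetical file with placeholder names

def zscore(s):
    return (s - s.mean()) / s.std()

# Standardize the outcome and every predictor except the product term.
zd = pd.DataFrame({col: zscore(df[col])
                   for col in ["depression", "anxiety", "male",
                               "sleep_quality", "age", "education", "activity"]})

# Form the product from the standardized components; do NOT re-standardize it.
zd["anx_x_male"] = zd["anxiety"] * zd["male"]

X = sm.add_constant(zd[["anxiety", "male", "anx_x_male",
                        "sleep_quality", "age", "education", "activity"]])
fit = sm.OLS(zd["depression"], X).fit()
# The "unstandardized" coefficient reported for anx_x_male in this run is the
# correct standardized coefficient for the interaction term.
print(fit.params["anx_x_male"])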

Assumptions

A few assumptions about regression analysis deserve special mention in the context of moderated regression. First, the interaction test assumes that curvilinear effects are not present. It has been shown that the presence of curvilinear effects may lead to spurious interaction effects (Busemeyer & Jones, 1983; Darlington, 1990; Lubinski & Humphreys, 1990). Second, homogeneity of error variances is assumed across groups (Alexander & DeShon, 1994; Cochran, 1957; Kendall & Stuart, 1979). Although small differences in error variances are not likely to have important effects in moderated regression, conclusions may be affected if the basis of comparison is visual inspection, as when the subgroup method is used. Researchers are encouraged to plot residuals and test for variance differences (DeShon & Alexander, 1996). Third, as with all regression analyses, it is assumed that predictor variables are measured without error. Particular attention to measurement error is recommended when testing for interactions, because measurement error in individual predictors may be compounded in the product term (Aiken & West, 1991; Cohen & Cohen, 1983; Bohrnstedt & Marwell, 1978). Recommendations regarding the correction of measurement error can be found elsewhere (Aiken & West, 1991; Kenny & Judd, 1984; Evans, 1985; Jaccard & Wan, 1996; Ping, 1996). Fourth, when larger numbers of predictors are required, it is especially important that a systematic approach to testing the possible interactions be taken. Several approaches to prioritizing these tests have been recommended (Aiken & West, 1991; Cohen & Cohen, 1983; Darlington, 1990). Faced with a multitude of possible tests, researchers are urged to plan interaction tests according to theory.

Comments on higher order interactions

We have not discussed the testing of higher order interactions (i.e., those involving the product of more than 2 predictors), but these tests are also easily conducted. In testing higher order interactions, all lower order interactions and first order effects terms should be included in the analysis. Testing a three-way interaction, for instance, involves also testing three two-way interactions and three first order effects. Because tests for higher order interactions follow the general procedures and rationale provided here, and because adequate discussions of these issues exist elsewhere (Aiken & West, 1991; West et al., 1996), this topic will not be reviewed here. There are, however, a few brief points that should be made.

Choice of coding for categorical variables becomes even more crucial when higher order interactions are tested. Although dummy coding and weighted effects coding will have identical coefficients for the highest order term (i.e., the three-way interaction) in the two-group case, coefficients for the two-way interactions and the first order effects will not be identical across coding methods. If the categorical variable represents more than two categories, the choice of coding scheme will also affect the highest order terms (Aiken & Molina, 1992; West et al., 1996).

Outlining the advantages and disadvantages of the subgroup analysis and the moderator approach becomes more involved when higher order interactions are compared. Without a detailed discussion, however, it should suffice to point out that similar disadvantages apply when higher order interaction hypotheses are pursued with the subgroup approach. As is the case with two-way interactions, subgroup analysis used to test three-way interactions will lead to lower statistical power within groups, a compounding of the covariance problem, and greater difficulty drawing conclusions about differences based on visual inspection of the coefficients. Moreover, procedures for probing the nature of three-way interactions exist for the moderator approach (Aiken & West, 1991; West et al., 1996), but no clear approach to comparisons among groups exists for the subgroup method. Comparisons between individual predictors are restricted to visual inspection of within-group coefficients and pairwise comparisons. With the moderator approach, one is able to test all first order effects, all possible two-way effects, and the three-way interaction together. Examining individual, within-group regression analyses in the three-way case may not enable the researcher to detect two-way interactions that may exist.

More than two groups

Although our examples have only included categorical variables in the two-group case, interaction tests are also possible with more than two groups. In general, one must create g - 1 dummy variables, where g is the number of groups or the number of levels of the categorical predictor. Thus, there are also g - 1 possible interaction terms, because each interaction term is a product of one of the dummy variables and the continuous variable. This can be done using dummy, unweighted effects, or weighted effects coding schemes. Choice of coding has a greater impact on the interactions tested with multiple-category variables, because each of the coding schemes produces different values for the highest order term (West et al., 1996). Thus, one must think carefully about the choice of coding scheme and the interpretation of the interactions tested.
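As a brief sketch of the multiple-group case (again in Python with placeholder file and variable names), a three-category grouping variable would be handled with two dummy codes and two corresponding product terms. The joint increment-to-R2 F test shown at the end is one reasonable way, under these assumptions, to evaluate the set of product terms together.

import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("study.csv")  # hypothetical file; 'group' has three categories

df["x_c"] = df["x"] - df["x"].mean()  # center the continuous predictor

# g - 1 = 2 dummy variables (the omitted category is the reference group).
dummies = pd.get_dummies(df["group"], prefix="g", drop_first=True).astype(float)

# g - 1 = 2 product terms, one per dummy variable.
products = dummies.multiply(df["x_c"], axis=0).add_suffix("_x")

X = sm.add_constant(pd.concat([df[["x_c"]], dummies, products], axis=1))
fit = sm.OLS(df["y"], X).fit()

# Test the set of product terms jointly against the model without them.
reduced = sm.OLS(df["y"], sm.add_constant(
    pd.concat([df[["x_c"]], dummies], axis=1))).fit()
print(fit.compare_f_test(reduced))  # (F statistic, p value, df difference)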

Summary and Conclusion

Many scientific questions about gerontology are questions about group differences in prediction. Such questions can be viewed in terms of interaction or moderator effects. Testing interactions within the context of experimental designs usually poses few problems because the independent variables are categorical and the ANOVA methods typically used in this context are well understood. Because applied research in this area is mostly comprised of quasi-experimental designs or a combination of experimental and non-experimental design components, however, the independent variables involved are usually a mixture of continuous and categorical variables. In such circumstances, researchers use a variety of approaches to analyzing their data. One common approach, the subgroup approach, in which the researcher divides the sample into two or more groups and conducts separate regression analyses, leads to a number of statistical and interpretational problems. The most important problems with the subgroup method include the failure to take into account group differences due to sampling error, the inability to adequately control for variables confounded with group membership, a loss of power due to reduced sample size and/or artificial dichotomization, less stable regression estimates, and difficulties with standardization. The moderated regression approach, which comes from a general approach to testing interactions, has none of these shortcomings.

Although the present treatment of moderated regression is an important starting point for those wishing to adopt the approach, researchers are urged to learn more about the approach by consulting other sources. More detailed descriptions of the statistical underpinnings of the general moderated regression approach are available elsewhere (Aiken & West, 1991; Cohen & Cohen, 1983; Darlington, 1990; Jaccard, Turrisi, & Wan, 1990). In addition, an extension of moderated regression methods is available within a structural equation modeling context (Jaccard & Wan, 1996; Ping, 1996; Kenny & Judd, 1984). The use of structural equation modeling to test hypotheses about group differences offers additional flexibility and is often more powerful because of the ability to estimate measurement error. Finally, examples of the use and presentation of new analyses are often helpful to researchers who are presenting such findings for the first time. Although moderated regression analysis is not as commonly found as it should be, the gerontology literature does provide a number of examples (Krause, 1995; Miller, Campbell, Farran, Kaufman, & Davis, 1995; Zautra, Reich, & Newsom, 1995).

Our purpose in this article is to highlight the limitations of the subgroup approach so that researchers working in the area of gerontology can begin to adopt more powerful and flexible data analytic methods. The implementation of such new techniques often lags well behind their development (West, Newsom, & Fenaughty, 1991). Although articles such as the present one will help inform researchers of new analytic techniques, pressure from editors and reviewers is usually required before the adoption of such methods is ubiquitous. By incorporating the moderated regression approach and other new techniques in training for graduate students in gerontology, the methodological and statistical rigor of the field will also be advanced.

References

Aiken, L.S., & Molina, B. (1992, August). Interactions between categorical and continuous variables in multiple regression: Some clarifications. Paper presented at the annual meeting of the American Psychological Association. Washington, D.C.

Aiken, L.S., & West, S.G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.

Alexander, R.A., & DeShon, R.P. (1994). Effect of error variance heterogeneity on the power of tests for regression slope differences. Psychological Bulletin, 115, 308-314.

Baron, R.M., & Kenny, D.A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.

Beery, L.C., Prigerson, H.G., Bierhals, A.J., Santucci, L.M., Newsom, J.T., Maciejewski, P.K., Rapp, S.R., Fasiczka, A., & Reynolds, C.F., III (1997). Traumatic grief, depression and caregiving in elderly spouses of the terminally ill. Omega - Journal of Death & Dying, 35, 261-279.

Bissonnette, V., Ickes, W., Bernstein, I., & Knowles, E. (1990a). Personality moderating variables: A warning about statistical artifact and a comparison of analytic techniques. Journal of Personality, 58, 567-587.

Bissonnette, V., Ickes, W., Bernstein, I., & Knowles, E. (1990b). Item variances and median splits: Some discouraging and disquieting findings. Journal of Personality, 58, 595-601.

Bohrnstedt, G.W., & Marwell, G. (1978). The reliability of products of two random variables. In K.F. Schuessler (Ed.), Sociological methodology (pp. 254-273). San Francisco: Jossey-Bass.

Bollen, K.A., & Barb, K.H. (1981). Pearson’s R and coarsely categorized measures. American Sociological Review, 46, 232-239.

Borenstein, M. & Cohen, J. (1988). Statistical power analysis: A computer program. Hillsdale, NJ: Erlbaum.

Busemeyer, J.R., & Jones, L. (1983). Analysis of multiplicative combination rules when the causal variables are measured with error. Psychological Bulletin, 93, 549-562.

Buysse, D.J., Reynolds, C.F., Monk, T.H., Berman, S.R., & Kupfer, D.J. (1989). The Pittsburgh Sleep Quality Index: A new instrument for psychiatric practice and research. Psychiatry Research, 28, 193-213.

Cochran, W.G. (1957). Analysis of covariance: Its nature and uses. Biometrics, 13, 261-281.

Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7, 249-253.

Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.

Cronbach, L. J. (1987). Statistical tests for moderator variables: Flaws in analyses recently proposed. Psychological Bulletin, 102, 414-417.

Darlington, R.B. (1990). Regression and linear models. New York: McGraw-Hill.

Derogatis, L.R. (1983). SCL-90-R Manual II. Towson, MD: Clinical Psychometric Research.

DeShon, R.P., & Alexander, R.A. (1996). Alternative procedures for testing regression slope homogeneity when group error variances are unequal. Psychological Methods, 1, 261-277.

Evans, M.G. (1985). A Monte Carlo study of the effects of correlated method variance in moderated multiple regression analysis. Organizational Behavior and Human Decision Processes, 36, 305-323.

Eysenck, H.J. (1956). The inheritance of extraversion-introversion. Acta Psychologica, 12, 95-110.

Fried, L. P., Borhani, N. O., Enright, P., Furberg, C. D., Gardin, J. M., Kronmal, R.A., Kuller, L. H., Manolio, T. A., Mittelmark, M. G., Newman, A., O'Leary, D. H., Psaty, B., Rautaharju, P., Tracy, R. P., & Weiler, P.G. (1991). The cardiovascular health study: Design and rationale. Annals of Epidemiology, 1, 263-276.

Friedrich, R. J. (1982). In defense of multiplicative terms in multiple regression equations. American Journal of Political Science, 26, 797-833.

Gilhooly, M.L.M. (1984). The impact of care-giving on care-givers: Factors associated with the psychological well-being of people supporting a dementing relative in the community. British Journal of Medical Psychology, 57, 35-44.

Hamilton, M. (1960). A rating scale for depression. Journal of Neurology, Neurosurgery, and Psychiatry, 23, 56-62.

Hardy, M. A. (1993). Regression with dummy variables. Newbury Park, CA: Sage.

Hargens, L.L. (1976). A note on standardized coefficients as structural parameters. Sociological Methods and Research, 5, 247-256.

Huitema, B.E. (1980). The analysis of covariance and alternatives. New York: John Wiley.

Humphreys, L.G. (1978). Research on individual differences requires correlational analysis, not ANOVA. Intelligence, 2, 1-5.

Humphreys, L.G., & Fleishman, A. (1974). Pseudo-orthogonal and other analysis of variance designs involving individual-differences variables. Journal of Educational Psychology, 66, 464-472.

Jaccard, J., Turrisi, R., & Wan, C.K. (1990). Interaction effects in multiple regression. Newbury Park, CA: Sage.

Jaccard, J., & Wan, C.K. (1996). LISREL approaches to interaction effects in multiple regression. Thousand Oaks, CA: Sage.

Johnson, P.O., & Neyman, J. (1936). Tests of certain linear hypotheses and their applications to some educational problems. Statistical Research Memoirs, 1, 57-93.

Kendall, M., & Stuart, A. (1979). The advanced theory of statistics (Vol. 2, 4th Ed.). London: Charles Griffin and Company.

Kenny, D.A., & Judd, C.M. (1984). Estimating the nonlinear and interactive effects of latent variables. Psychological Bulletin, 96, 201-210.

Keppel, G. (1991). Design and analysis: A researcher's handbook (3rd ed.). Englewood Cliffs, NJ: Prentice Hall.

Kim, J. O., & Ferree, G. D., Jr. (1981). Standardization in causal analysis. Sociological Methods and Research, 10, 187-210.

Kim, J. O., & Mueller, C. W. (1976). Standardized and unstandardized coefficients in causal analysis: An expository note. Sociological Methods and Research, 4, 428-438.

Krause, N. (1995). Assessing stress-buffering effects: A cautionary note. Psychology and Aging, 10, 518-526.

Lubinski, D., & Humphreys, L.G. (1990). Assessing spurious “moderator effects”: Illustrated substantively with the hypothesized (“synergistic”) relation between spatial and mathematical ability. Psychological Bulletin, 107, 385-393.

Marquardt, D. W. (1980). Comment: You should standardize the predictor variables in your regression models. Journal of the American Statistical Association, 75, 87-91.

Marquardt, D. W., & Snee, R. D. (1975). Ridge regression in practice. American Statistician, 29, 3-20.

Maxwell, S.E., & Delaney, H.D. (1993). Bivariate median splits and spurious significance. Psychological Bulletin, 113, 181-190.

Maxwell, S.E., Delaney, H.D., & Dill, C.A. (1984). Another look at ANCOVA versus blocking. Psychological Bulletin, 95, 136-147.

McNemar, Q. (1969). Psychological statistics (4th ed.). New York: Wiley.

Miller, B., Campbell, R.T., Farran, C.J., Kaufman, J.F., & Davis, L. (1995). Race, control, mastery, and caregiver distress. Journal of Gerontology: Social Sciences, 10, S374-S382.

Monk, T.H., Flaherty, J.F., Frank, E., Hoskinson, K., & Kupfer, D.J. (1990). The social rhythm metric: An instrument to quantify the daily rhythm of life. Journal of Nervous and Mental Disease, 178, 120-126.

Mossholder, K.W., Kemery, E.R., & Bedeian, A.G. (1990). On using regression coefficients to interpret moderator effects. Educational and Psychological Measurement, 50, 255-263.

Peters, C. C., & Van Voorhis, W. R. (1940) Statistical procedures and their mathematical bases. New York: McGraw-Hill.

Ping, R. A. (1996). Latent variable interaction and quadratic effect estimation: A two-step technique using structural equation analysis. Psychological Bulletin, 119, 166-175.

Prigerson, H.G., Bierhals, A. J., Maciejewski, P. K., Newsom, J.T., Frank, E., Shear, K. M., Houck, P. R., Diebold, J., Miller, M., & Reynolds, C. F. (1998). Predictors of depression among aged widows and widowers: A literature review and preliminary results. Unpublished manuscript, Yale University.

Radloff, L. (1977). The CES-D Scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385-401.

Russo, J., Vitaliano, P.P., Brewer, D.D., Katon, W., & Becker, J. (1995). Psychiatric disorders in spouse caregivers of care-recipients with Alzheimer's disease and matched controls: A diathesis-stress model of psychopathology. Journal of Abnormal Psychology, 104, 197-204.

Saunders, D. R. (1955). The “moderator variable” as a useful tool in prediction. In Proceedings of the 1954 Invitational Conference on Testing Problems (pp. 54-58). Princeton, NJ: Educational Testing Service.

Saunders, D. R. (1956). Moderator variables in prediction. Educational and Psychological Measurement, 16, 209-222.

Schulz, R., Newsom, J.T., Burton, L., Hirsch, C., Jackson, S., & Mittelmark, M. (in press). Health effects of caregiving: The Caregiver Health Effects Study. Annals of Behavioral Medicine.

Serlin, R.C., & Levin, J.R. (1985). Teaching how to derive directly interpretable coding schemes for multiple regression analysis. Journal of Educational Statistics, 10, 223-238.

SPSS reference guide. (1990). Chicago, IL: SPSS, Inc.

Stone, E.F., & Hollenbeck, J.R. (1984). Some issues associated with the use of moderated regression. Organizational Behavior and Human Performance, 34, 195-213.

West, S.G., Aiken, L.S., & Krull, J.L. (1996). Experimental personality designs: Analyzing categorical by continuous variable interactions. Journal of Personality, 64, 1-48.

West, S. G., Newsom, J. T., & Fenaughty, A. M. (1992). Publication trends in JPSP: Stability and change in the topics, methods, and theories across two decades. Personality and Social Psychology Bulletin, 18, 473-484.

Winer, B.J., Brown, D.R., & Michels, K.M. (1991). Statistical principles in experimental design (3rd ed.). New York: McGraw-Hill.

Zautra, A. J., Reich, J. W., & Newsom, J. T. (1995). Autonomy and sense of control among older adults: An examination of their effects on mental health. In L. Bond, S. Cutler, & A. Grams (Eds.), Promoting successful and productive aging (pp. 153-170). Newbury Park, CA: Sage.

Table 1. Subgroup regression analysis of gender differences in the prediction of depression symptomatology.

                      Males (n = 18)                                Females (n = 34)
            b       β      SEb     CIb             p        b       β      SEb     CIb             p
Age      -.202   -.153    .201   -.630, .226      ns     -.070   -.062    .127   -.329, .189      ns
Educ      .824    .292    .495   -.220, 1.869     .12     .593    .223    .318   -.053, 1.240     .07
ALI      -.071   -.134    .108   -.299, .156      ns     -.114   -.211    .057   -.229, .002      .05
PSQI      .915    .526    .319    .242, 1.588     ...      ...     ...     ...    ...             ...

Note. SEb = standard error of b; CIb = confidence interval for b; PSQI = Pittsburgh Sleep Quality Index.
