


International Validity Generalization of GMA and Cognitive Abilities:

A European Community Meta-analysis

Jesús F. Salgado

University of Santiago de Compostela, Spain

Neil Anderson

University of Amsterdam, The Netherlands

Silvia Moscoso

University of Santiago de Compostela, Spain

Cristina Bertua

Goldsmiths College, University of London, Great Britain

Filip de Fruyt

Ghent University, Belgium

In press – Personnel Psychology

Abstract

This article reports a series of meta-analyses of the criterion-related validity of GMA and specific cognitive ability tests for predicting job performance ratings and training success in the European Community (EC). Meta-analyses were computed on a large European Community database examining the operational validity of GMA and of specific cognitive abilities, including verbal, numerical, spatial/mechanical, perceptual, and memory abilities (N ranged from 946 to 16,065), across ten EC member countries. The results showed that tests of GMA and specific cognitive abilities are very good predictors of job performance and training success across the EC, providing evidence for the international validity generalization of these measures. The European Community meta-analyses showed a larger operational validity than previous meta-analyses in the USA for predicting job performance; for training success, the European and American results are very similar. Implications for the international generalizability of GMA test validities, the practical use of cognitive ability tests for personnel selection, and directions for future research are discussed.

International Validity Generalization of GMA and Cognitive Abilities as Predictors of Work Behaviors: A European Community Meta-analysis

Tests of general mental ability (GMA) and specific cognitive ability are popular elements in organizational selection procedures in both the USA and the European Community (EC). Over the years many meta-analyses of the predictive or criterion-related validity of GMA tests have been performed in the USA (e.g. Hunter & Hunter, 1984; Hunter & Hirsh, 1987; Pearlman, Schmidt & Hunter, 1980; Levine, Spector, Menon, Narayanan & Cannon-Bowers, 1996; Schmitt, Gooding, Noe & Kirsch, 1984). These meta-analyses have found that GMA and cognitive ability tests are valid predictors of job performance and training success and, furthermore, that validity generalizes across samples, criteria, and occupations (see Schmidt, 2002 for an extensive review of all major studies). For example, in a seminal article, Hunter and Hunter (1984) re-analyzed the results of previous reviews consisting of 515 primary studies, and also reported the main results of a series of meta-analyses carried out with the USES database of General Aptitude Test Battery (GATB) validity studies. The average corrected predictive validity of the ability composite was .53 for predicting job performance. Many other meta-analyses have been published in the US in the last 25 years, and their conclusions converge in claiming that GMA measures are the best single predictors of job performance and training success (see Schmidt, 2002 for a review of these findings; Schmidt & Hunter, 1998). In this sense, Murphy (2002) has suggested that GMA measures are likely to be correlated with performance in virtually any job.

The criterion-related validity of specific cognitive abilities (i.e. verbal, numerical, spatial, and memory) was a further issue examined in previous meta-analyses of validity studies conducted in civilian settings and in large-sample validity studies conducted with military samples in the US. For example, meta-analyses by Schmidt, Hunter, and their colleagues (Hunter & Hunter, 1984; Pearlman, Schmidt, & Hunter, 1980; Schmidt, Hunter, & Caplan, 1981; Schmidt, Hunter, & Pearlman, 1981; Schmidt, Hunter, Pearlman, & Shane, 1981) demonstrated that specific abilities showed little incremental validity over and beyond GMA measures or cognitive composites. Ree and his colleagues examined the validity of specific abilities for predicting job performance and training success using large military samples (Carretta & Ree, 1997, 2000, 2001; Olea & Ree, 1994; Ree & Earles, 1991; Ree, Earles, & Teachout, 1994). They found that specific abilities made very small contributions over GMA for predicting these two criteria. However, some authors have suggested that the use of specific abilities can have important implications for personnel selection. For example, Kehoe (2002) has suggested that if selection decisions were based on specific abilities rather than on GMA measures, specific abilities could reduce group differences and might result in more positive applicant reactions. Murphy (2002) has suggested that different applicants may be selected depending on the specific abilities emphasized. Consequently, there are compelling reasons for examining the magnitude of specific-ability validities in the European Community.

Even in light of these important findings, the meta-analyses share several characteristics that limit the generalized applicability of their results: (a) they included only American primary studies, so international validity generalization was not examined; and (b) although they did examine the predictive validity of specific cognitive abilities (e.g. verbal, numerical, spatial), the distinctive characteristics of the EC mean that the magnitude of the relations between specific cognitive abilities and performance might differ from the relations found in the US. These two characteristics are particularly relevant, as the international validity generalization of these findings cannot be taken for granted on the basis of these data sets. Therefore, it is particularly important to establish the criterion-related validity of GMA tests in countries other than the USA in order to demonstrate international validity generalization regardless of cultural, economic, social, and legislative differences between the US and European Community countries (Herriot & Anderson, 1997). As Newell and Tansley (2001) point out, societal effects, institutional networks, the national regulatory environment, economic factors, and national culture may all have an impact upon the use and criterion-related validity of personnel selection methods.

GMA and Cognitive Ability Testing in the EC

The international validity generalization of GMA and cognitive ability tests for predicting job performance and training success is an area of particular relevance for EC countries. According to different surveys (see Salgado & Anderson, 2002 for a review), cognitive ability tests are, in general, more prevalently used for selection purposes in the EC than in the USA. The European Community is a large multi-national and multi-cultural community consisting at present of 15 member countries (Austria, Belgium, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Luxembourg, Portugal, Spain, Sweden, the United Kingdom, and The Netherlands), with a large number of other countries waiting to become members in the near future. The current population is over 400 million people. An illustration of the diversity within the EC is the co-existence of individualistic cultures (e.g. the United Kingdom, Germany, The Netherlands) and collectivist cultures (e.g. France, Italy, Spain, Portugal, Greece) (see Hofstede, 1980; Spector et al., 2001). In some countries (e.g. the United Kingdom, The Netherlands) the Protestant religion is dominant, while in others the Catholic religion (e.g. Austria, Italy, Spain, Portugal) or other religions (e.g. the Orthodox religion in Greece) are more prevalent. Consequently, such a mix of typologies could have a differential effect on the validity of GMA and other cognitive abilities (e.g. verbal, numerical, spatial), because different systems of values, approaches to power, supervision styles, and styles of interpersonal communication may be operating simultaneously within the EC.

According to Levy-Leboyer (1994), there are important differences between the US and the European Community in personnel selection practices. For example, American selection psychologists “have been strongly influenced by lawsuits requiring proofs of the validity of decisions that were made based on test results, so that work psychologists in the United States have concentrated on legitimizing their selection procedures by showing non-discrimination. European work psychologists were (and still are) much more concerned than US psychologists with the need to protect applicants by defending their privacy; they see selection as a participative process, where both the applicant and the organization must have their interests represented. In Europe, public action is focused on limiting an organization’s right to access an applicant’s privacy” (Levy-Leboyer, 1994, pp. 183-184). Other researchers have also shown that the meaning of selection differs between the European Community and the United States. For example, Roe (1989) pointed out that the European view is not limited to the classical psychometric model of prediction (i.e. the validity of the procedure): the whole selection process is characterized as a negotiation between employers and applicants, in which both parties must make a decision weighing costs and benefits and taking into account the societal context of selection, such as legal regulations and labor market conditions (e.g. unemployment rates; degree of diversity in the countries). Indeed, the 15 current members of the European Community are not completely homogeneous in their personnel selection practices but, as Viswesvaran and Ones (2002) also pointed out, the European Community countries individually considered are relatively homogeneous in comparison with the US, as they have less within-country diversity.
Other relevant differences between the US and the European Community include higher unemployment rates in the EC than in the US, large variability in educational systems, and demographic differences. For example, demographic differences within countries can also affect responses to test items, reflecting such socio-demographic differences.

Why EC validities may be lower?

There are also differences between the US and the EC with regard to the meaning of job performance. Companies in the European Community may be more interested in the contextual dimensions of job performance than in task performance dimensions (Borman et al., 2001). This may be attributed to the effects of the collectivistic values of some national cultures (e.g. Spain, France, Belgium, Portugal) and to the European view of the personnel selection process as one of negotiation in which personal characteristics are particularly relevant. Murphy and Shiarella (1997) have suggested that organizations may differ in their definitions of what constitutes good job performance, and that these differences in definitions have implications for the validity of selection tests. For example, Borman et al. (2001) speculated that GMA measures would show lower validity for predicting contextual performance than for predicting task performance. Therefore, if the European conceptualization of good job performance includes more contextual connotations than the US conceptualization, one would expect GMA and cognitive ability measures to show lower predictive validity in Europe than in the US.

With regard to the measurement of training performance, there are also remarkable differences. In the European Community the majority of studies assess training success using supervisory ratings rather than objective tests, as is typical in US studies. The objective measures used in training studies are often assessments of job knowledge, whereas the ratings are assessments of job performance. This is a relevant difference between the EC and the US, because these two types of measures are known to differ considerably in reliability. Furthermore, American studies may capture the maximal performance of individuals, as assessed with objective measures, while European studies may capture typical performance, as assessed with ratings (Ackerman, 1994; Ackerman & Humphreys, 1991). Consequently, one would expect the validity for training success to be lower in the EC than in the US.

Moreover, all these contextual and conceptual differences between the US and the European view of the personnel selection process allow one to speculate that other factors may produce differences in the magnitude of the GMA-work performance relations in the European Community. For example, there could be differences in the reliability of performance measures, and these differences would affect the criterion-related validity. Also, differences in unemployment rates could produce differences in selection ratios, which in turn would produce differences in range restriction in the GMA measures; range restriction would then underestimate the criterion-related validity differentially. Therefore, the magnitude of GMA validity could be lower in the EC than in the US. Furthermore, as the European approach is more concerned with societal effects than with the predictive validity of personnel selection processes, fewer validity studies should exist in the European Community than in the US.

Why EC validities may be higher?

Currently, over 20 different languages are spoken throughout the member countries, and this number may double, or even triple, with the imminent accession of new member states. This phenomenon, together with the powerful cultural differences among EC countries relative to the smaller differences among US regions, could make working in the EC more complex, thereby increasing the demands on GMA. Furthermore, the majority of jobs included in these studies are of medium or high complexity. Hunter and Hunter (1984) demonstrated that job complexity is a powerful moderator of GMA validity, finding that GMA validity is remarkably larger for highly complex jobs than for low-complexity ones. Consequently, it is possible that these two factors (i.e. many languages and more complex jobs) operate to produce a larger operational validity in the EC than in the USA.

While it is plausible that international validity generalization exists, as Schmidt, Ones, and Hunter (1992) and Salgado (1997) first conjectured, and that the findings may transfer to other countries, there is no empirical support for this claim, because the only meta-analyses available are American ones carried out exclusively with American primary studies. Therefore, a meta-analytic review of the validity studies conducted in the European Community is called for. Such a study would make three important contributions: (a) report the results for the European Community countries, (b) compare the European Community findings with the American ones, and (c) explore whether there is international validity generalization for GMA and specific cognitive measures. In view of this, this paper reports the results of new meta-analyses conducted on a very large database of validity studies of GMA and specific cognitive abilities carried out in EC member countries. Given the considerations stated previously, our study has two main objectives: (a) to present the results of a European quantitative synthesis of the relationship of GMA with job performance and training success; and (b) to examine the magnitude of the predictive validity of specific abilities for predicting job performance and training success.

Method

Search for Studies

Based on the goals of this research, a database of European validity studies was developed. Studies had to meet three criteria to be included in the database: (a) they had to report validity coefficients between cognitive ability measures and measures of job performance or training success; (b) the samples had to consist of applicants, employees, or trainees, but not students; and (c) only civilian jobs were considered, excluding military occupations, in order to allow comparison of the present meta-analytic results with previous American findings.

The search was made using seven strategies. First, a computer search was conducted in the PsycLit database. Second, an article-by-article manual search was carried out in a large number of European journals. This list includes: Journal of Occupational and Organizational Psychology, Journal of the National Institute of Industrial Psychology, Human Factor, Occupational Psychology, British Journal of Psychology, The Occupational Psychologists, International Journal of Selection and Assessment, Revista de Psicología General y Aplicada, Psicotecnia, Revista de Psicología del Trabajo y las Organizaciones, Bolletino di Psicologia Applicata, Travail Humain, Bulletin du CERP, BINOP, Revue de Psychologie Appliquee, Journal of Organizational Behavior, Zeitschrift für Angewandte Psychologie, Praktische Psychologie, Industrielle Psychotechnik, Ergonomics, Applied Psychology: An International Review, Annals de l'Institut de Orientacio, L'Annee Psychologique, Irish Journal of Psychology, Reports of the Industrial Fatigue Research Board, Reports of the Industrial Health Research Board, Scandinavian Journal of Psychology, Revue Belge de Psychologie et Pedagogie, Psychologica Belgica, Nederlands Tijdschrift voor Psychologie, International Journal of Aviation Psychology, Accidents Analysis and Prevention, Transport Journal. We also reviewed the following American journals: Journal of Applied Psychology, Personnel Psychology, Journal of Personnel Research, Industrial Psychology, Psychological Bulletin. Third, we reviewed the Proceedings of the Congress of the International Association of Psychotechnique (later called the International Association of Applied Psychology) from 1920 to 1998. Fourth, test manuals were checked for criterion-related validity studies. Fifth, several test publishing companies were contacted and asked for reports in which validity studies were reported. Sixth, the reference sections of the articles obtained were checked to identify further papers.
Finally, several well-known European researchers were contacted in order to obtain additional papers and supplementary information related to published papers (e.g. complete matrices of correlation coefficients).

The literature search resulted in 102 papers, comprising 234 independent samples. These papers consisted of 80 published studies (163 samples) and 22 unpublished studies (71 samples). Combining the published and unpublished studies conducted across the European Community, the final database contained 142 independent samples with training success as the criterion and 120 samples with job performance as the criterion. Training success and job performance were simultaneously used within 14 samples. The following countries contributed studies to the database: Belgium (3), France (18), Germany (9), Ireland (1), The Netherlands (13), Portugal (1), Scandinavian countries (2), Spain (18), and the United Kingdom (37). Consequently, not all current members of the EC are represented in the database. Specifically, we found no studies from Austria, Italy, Greece, and Luxembourg. The reasons differ across countries. For example, cognitive ability tests are rarely or never used in Italy or Greece (see Ryan, McFarland, Baron, & Page, 1999), and studies are not published in Finland (Levy-Leboyer, 1994; Petteri Niitamo, personal communication). The total population of these five countries represents less than 20% of the population of the European Community.

Procedure

Two researchers served as judges, working independently to code every study. Each researcher was given a list and a definition of the abilities. The abilities used in this research were general mental ability (GMA), verbal ability, numerical ability, spatial/mechanical ability, perceptual ability, and memory. Examples of the tests included in each ability category are given in Table 1. If the two researchers agreed on the ability, the test was coded in that ability category. Disagreements were resolved through discussion until the researchers agreed on the classification. The studies conducted in the United Kingdom (36% of the total) were used to examine the reliability of the coding process. Agreement between researchers (prior to consensus) was .94, .90, .93, .87, .85, and 1.00 for general mental ability, verbal ability, numerical ability, spatial/mechanical ability, perceptual ability, and memory, respectively. Each single test was then assigned to a single ability, and only one overall validity coefficient was used per sample for each ability. When we found more than one coefficient for an ability (e.g. when several tests were used to assess the same ability), the coefficients were treated as conceptual replications and linear composites with unit weights were formed. Linear composites provide more construct-valid estimates than the average correlation. Nunnally (1978, pp. 166-168) and Hunter and Schmidt (1990, pp. 457-463) provide formulas for the correlation of variables with composites.
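The composite formula referenced above can be sketched in a few lines: the correlation of a criterion with a unit-weighted composite of k tests is the sum of the test-criterion correlations divided by the square root of the sum of all cells of the test intercorrelation matrix (with unities on the diagonal). The numeric values below are hypothetical, for illustration only:

```python
import math

def composite_validity(r_crit, r_inter):
    """Correlation of a criterion with a unit-weighted composite.

    r_crit  : list of correlations between each test and the criterion
    r_inter : matrix (list of lists) of intercorrelations among the tests
    """
    k = len(r_crit)
    total = 0.0
    for i in range(k):
        for j in range(k):
            # Unities on the diagonal, intercorrelations off-diagonal
            total += 1.0 if i == j else r_inter[i][j]
    return sum(r_crit) / math.sqrt(total)

# Hypothetical example: two verbal tests, each correlating .30 with the
# criterion, intercorrelated .50
r = composite_validity([0.30, 0.30], [[1.0, 0.5], [0.5, 1.0]])
# r ≈ .346, higher than the .30 of either single test
```

This illustrates why the composite is preferred over the simple average of the coefficients: unless the tests are perfectly intercorrelated, the composite correlation exceeds the mean of the individual validities.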

Insert Table 1 about here

After the studies were collated and their characteristics recorded, the next step was to apply the psychometric meta-analytic formulas of Hunter and Schmidt (1990). Psychometric meta-analysis estimates how much of the observed variance of findings across studies is due to artifactual errors. The artifacts considered here were sampling error, criterion reliability, predictor reliability, and range restriction in ability scores. Some of these artifacts reduce the correlation below its operational value (e.g. criterion unreliability and range restriction), and all of them produce artifactual variability in the observed validity. In our analyses, we corrected the observed mean validity for criterion reliability and range restriction in the predictor. To correct the empirical validity for these last three artifacts, the most common strategy is to develop a specific distribution for each of them.
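As a rough sketch of the two corrections applied to the mean validity, the observed correlation can first be disattenuated for criterion unreliability and then corrected for direct range restriction (Thorndike's Case II formula). This is a simplification of the full Hunter-Schmidt procedure, and the numeric values are illustrative rather than taken from the tables:

```python
import math

def operational_validity(r_obs, r_yy, u):
    """Simplified Hunter-Schmidt style correction of an observed validity.

    r_obs : mean observed validity coefficient
    r_yy  : criterion reliability (restricted group)
    u     : range restriction ratio, restricted SD / unrestricted SD
    """
    # Step 1: disattenuate for criterion unreliability
    r1 = r_obs / math.sqrt(r_yy)
    # Step 2: Thorndike Case II correction for direct range restriction
    big_u = 1.0 / u
    return (big_u * r1) / math.sqrt(1.0 + (big_u ** 2 - 1.0) * r1 ** 2)

# Illustrative values: observed r = .29, criterion reliability = .52,
# range restriction ratio u = .62
rho = operational_validity(0.29, 0.52, 0.62)
# rho ≈ .58
```

Note that predictor unreliability is deliberately not corrected for here, consistent with the interest in operational rather than theoretical validity.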

Predictor Reliability

According to Schmidt and Hunter (1999), the ideal reliability coefficient is the coefficient of equivalence and stability, which is estimated as the correlation between two parallel forms given on different occasions. However, since this type of coefficient was not reported in any of the studies, test-retest reliability was used. This is an acceptable alternative, since predictor reliability is consequential only for the estimation of the variability in rho, if at all. The reliability of predictors was estimated (1) from the coefficients reported in the studies included in the meta-analysis and (2) from the coefficients published in the various test manuals. For each ability, an empirical distribution of test-retest reliability was developed (see Table 2). The average reliabilities were .83, .83, .85, .77, .67, and .64 for GMA, verbal ability, numerical ability, spatial/mechanical ability, perceptual ability, and memory, respectively. As the interest is in the operational validity of the cognitive abilities and not in their theoretical value, predictor reliability estimates were used only to eliminate artifactual variability in SDrho.

Insert Table 2 about here

Criterion Reliability

In the present research, only validity studies in which job performance ratings or training success were used as the criterion were considered. This choice was based on two arguments: (1) American meta-analyses have used only these two criteria, and one of our objectives is to provide a comparison with those meta-analyses; (2) other criteria, such as tenure, turnover, promotions, or output, have been used in only a very small number of studies, and not with all cognitive abilities; therefore, we would not be able to carry out meta-analyses for all abilities. Since not all studies provided information on criterion reliability, empirical distributions were developed to estimate criterion reliability for both criteria. These were developed using the reliability coefficients reported in a number of studies, and are presented in Table 3.

For job performance ratings, the inter-rater reliability coefficient is the one of interest (Hunter, 1986; Schmidt & Hunter, 1996), since this type of reliability corrects for most of the unsystematic error in supervisor ratings (Hunter & Hirsh, 1987). We found 19 studies reporting inter-rater reliability coefficients for job performance ratings, producing a sample-size-weighted average coefficient of .52. This coefficient is slightly lower than the coefficient used by Hunter and Hunter (1984) but, interestingly, it is exactly the same as the coefficient found by Viswesvaran, Ones, and Schmidt (1996) in their meta-analysis of the inter-rater reliability of job performance ratings. All the coefficients in Viswesvaran et al.'s database were computed from American studies; consequently, it is interesting that we arrived at the same value using an independent data set. Additionally, Rothstein (1990) showed that the asymptotic value for job performance ratings was .52.

In the case of training success, ratings by trainers or supervisors were used in the majority of primary studies included in our meta-analyses. A small number of primary studies used pass/fail qualifications given by the supervisor, or ratings given by trainers or external examiners in both theoretical and practical examinations. Therefore, all training success scores used in the present meta-analysis can be considered a form of ratings. In fact, when both a theoretical and a practical examination were given to trainees, the correlation between examination ratings ranged from .46 to .73. In total, 15 studies reporting training success reliability were found, leading to a sample-size-weighted average reliability of .56. This figure is remarkably lower than the value of .81 used by Hunter and Hunter (1984). However, we consider our estimate to be representative of training success reliability in the EC. The difference from Hunter and Hunter's estimate is due to three reasons: (1) our estimate was empirically developed using civilian studies, whereas Hunter and Hunter's estimate was an assumed value; (2) Hunter's estimate was derived from training success measures in the US Navy (Hunter, 1986), which typically consisted of objective examinations (Vineberg & Joyner, 1989); and (3) all the coefficients found in this research were calculated from ratings given by trainers or supervisors, while objective examinations may be more frequent in the USA.

Insert Table 3 about here

Range Restriction Distribution

The distribution for range restriction was based on two strategies: (a) some range restriction coefficients were obtained from the studies that reported both restricted and unrestricted standard deviation data, and (b) another group of range restriction coefficients was obtained using the reported selection ratio. In order to use the reported selection ratio, we used the formula reported by Schmidt, Hunter and Urry (1976). This double strategy produced a large number of range restriction estimates. We coded these estimates according to the criterion used in the study and used only one coefficient for each sample. In cases where we had several estimates for the same sample, the average was calculated and this figure was used for computing the empirical distributions. The descriptive statistics of the two distributions appear in Table 4. We found a range restriction ratio (u) of .67 for training success and .62 for job performance. These figures are very similar to those found by Hunter and Hunter (1984) in the USA studies, since they found .60 for training and .67 for job performance ratings. In order to be sure that the two empirical distributions are representative of the restriction in the abilities, we also developed specific empirical distributions for each ability, and the results were very similar (although calculated on a smaller number of estimates).
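The second strategy above (deriving u from a reported selection ratio) can be illustrated without reproducing the Schmidt, Hunter, and Urry (1976) procedure itself: under direct top-down selection on a normally distributed predictor, the restricted-to-unrestricted SD ratio follows from the variance of a truncated normal distribution. The sketch below is our simplification under that assumption, not necessarily the exact formula used:

```python
import math

def std_normal_pdf(z):
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def u_from_selection_ratio(p, tol=1e-10):
    """Range restriction ratio u (restricted SD / unrestricted SD) implied
    by hiring the top proportion p on a standard normal predictor:
    the SD of a normal distribution truncated below at the cutoff."""
    # Find cutoff z with P(Z > z) = p by bisection
    lo, hi = -8.0, 8.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if 1.0 - std_normal_cdf(mid) > p:
            lo = mid
        else:
            hi = mid
    z = (lo + hi) / 2.0
    lam = std_normal_pdf(z) / p          # mean of the selected group
    var = 1.0 + z * lam - lam * lam      # variance of the selected group
    return math.sqrt(var)

# A selection ratio of .50 implies u ≈ .60, close to the empirical
# values of .62-.67 reported in Table 4
u = u_from_selection_ratio(0.50)
```

As expected, more stringent selection (smaller p) yields smaller u and therefore more severe range restriction.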

Insert Table 4 about here

Results

Validity of GMA and Cognitive Abilities to Predict Job Performance Ratings

The meta-analytic results for the relationship of general mental ability and other cognitive abilities with job performance appear in Table 5. From left to right, the first two columns show the number of validity coefficients and the total sample size. The next four columns list the average observed validity weighted by sample size, the observed variance weighted by sample size, the sampling error variance, and the observed standard deviation weighted by sample size. The following two columns show the operational validity (the observed validity corrected for criterion unreliability and predictor range restriction) and the standard deviation of the operational validity. The next two columns indicate the percentage of variance accounted for by artifactual errors (i.e. predictor and criterion reliability, predictor range restriction, and sampling error) and the 90% credibility value (i.e. the minimum value expected for 90% of the coefficients in the distribution).
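The 90% credibility value reported in the tables is the 10th percentile of the estimated true-validity distribution; under a normality assumption it is simply the operational validity minus 1.28 times its standard deviation. A quick sketch with hypothetical numbers:

```python
def credibility_value_90(rho, sd_rho):
    """10th percentile of the true validity distribution, assuming
    normality: 90% of true validities are expected to exceed this value."""
    return rho - 1.28 * sd_rho

# Hypothetical example: operational validity .62 with SDrho = .15
cv = credibility_value_90(0.62, 0.15)
# cv = .428: even the lower end of the distribution remains clearly positive
```

A positive credibility value is the usual evidence cited for validity generalization across samples, occupations, and countries.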

A large number of studies was found for each ability examined except memory, for which only 14 studies were found. Even so, this is still an acceptable number of coefficients upon which to carry out a meta-analysis. With regard to this point, Ashworth, Osburn, Callender, and Boyle (1992) developed a method for assessing the vulnerability of validity generalization results to unrepresented or missing studies. Ashworth et al. (1992) suggested calculating the effect on validity if 10% of studies were missing and their validity were zero. Therefore, we calculated additional estimates representing what the validity would be if we had been unable to locate 10% of the studies carried out and these studies showed zero validity. The last three columns report these new estimates: the lowest rho value, its standard deviation, and the 90% credibility value.
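The Ashworth et al. (1992) adjustment is arithmetically simple under one reading of the 10% rule: if the located studies represent only 90% of those actually conducted and the missing 10% had zero validity, then (assuming, for simplicity here, equal sample sizes across studies) the mean validity shrinks proportionally. A sketch under that equal-N assumption:

```python
def lowest_rho(rho, missing_fraction=0.10):
    """Mean validity if a fraction of all studies is missing with zero
    validity, assuming equal sample sizes: the located studies carry
    (1 - missing_fraction) of the total weight."""
    return rho * (1.0 - missing_fraction)

# If the GMA operational validity of .62 rested on only 90% of the
# studies conducted, and the missing 10% had zero validity:
adjusted = lowest_rho(0.62)
# adjusted ≈ .56
```

In practice the published estimates weight by sample size, so this equal-N version is only a first approximation to the values in the last three columns of Table 5.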

Insert Table 5 about here

The results for general mental ability were exceptionally good. The operational validity was .62, which means that GMA is an excellent predictor of job performance ratings. To our knowledge, no meta-analysis reported in the scientific literature for any single personnel selection procedure shows an operational validity of this magnitude. This finding confirms Schmidt and Hunter's (1998) suggestion that GMA is the best predictor of job performance ratings. The 90% credibility value found in the European Community indicates that GMA has generalized validity across samples, occupations, measures, and European countries for the job performance criterion. With regard to the percentage of explained variance, we found that 75% of the observed variance is accounted for by the four artifactual errors considered here. This estimate is very close to the figure of 75% initially suggested by Schmidt and Hunter (1977) as the percentage of variance explained by artifactual errors in the case of cognitive tests. Moreover, this magnitude of explained variance suggests that the remaining variability may be accounted for by other artifactual errors not considered here (e.g. range restriction in the criterion, imperfect construct measurement of X and Y, typographical and clerical errors, and so on; see Hunter and Schmidt (1990) for a list of these possible errors).

The second meta-analysis was carried out for verbal ability. Here, the results showed an operational validity of .35, which is remarkably lower than the operational validity of GMA. The 90% credibility value was .04 and the explained variance was 53%. These last two results indicate that validity may be moderated by other variables. The results may partially reflect the effect of an outlier with a large sample size: we found a study of managers reporting a value of -.02 with a sample size of 437. When this coefficient is removed, the new operational validity is .39, the 90% CV is .11, and the explained variance is 62%. An anonymous reviewer suggested another explanation for the lower validity of verbal ability: tests of verbal ability are in different languages across the countries included in the meta-analysis, and different language-based test complexity across countries may be responsible for these results. Other tests, such as numerical, spatial, or perceptual tests, are probably less prone to this effect because of their better equivalence across languages. Because GMA is defined as the higher-order factor underlying all specific abilities, the 'equivalence' issue would affect GMA tests less than purely verbal tests.

For numerical ability, the operational validity was .52, and all the observed variance was explained by artifactual errors. These results indicate that numerical ability is a good predictor of job performance and that it has generalized validity across samples and countries. In this case, the predicted variance exceeds the observed variance due to second-order sampling error (Hunter & Schmidt, 1990), and the percentage of variance accounted for was rounded to 100%. Second-order sampling error arises because the available studies are not a completely random sample of the research domain and, consequently, the average validity and standard deviation may by chance differ somewhat from the average effect size (and SD) for the entire research domain.

Spatial/mechanical ability was the next ability analyzed. We found an operational validity of .51. This figure shows that spatial/mechanical aptitude predicts job performance very well, to a similar extent as numerical ability. The 90% credibility value indicates that spatial/mechanical ability has generalized validity but its magnitude is small. This result, together with that of the percentage of explained variance (52%), suggests that some moderator may affect the operational validity.

Perceptual ability showed an operational validity of similar magnitude to those of numerical ability and spatial/mechanical ability. The operational validity was .52 and the 90% credibility value was .28. These two results indicate that perceptual ability is a good predictor of job performance and has generalized validity across samples, occupations and countries. The magnitude of the explained variance was 73%, similar to the one found for GMA, thus suggesting that the remaining variability may be due to other artifactual errors not considered here.

The results for memory suggest that this ability is the second best cognitive predictor of job performance. The operational validity found was .56 and all the variability was accounted for by artifactual errors. Therefore, memory demonstrated generalized validity across samples, occupations and countries. However, as in the case of numerical ability, there is evidence of second-order sampling error, suggesting that the studies located may not be completely representative of the total population. Additionally, it should be noted that the total sample size for this meta-analysis is the lowest, being only one-tenth the size of the GMA sample. Nevertheless, Verive and McDaniel (1996) have also shown that memory tests are valid predictors of job performance.

As a whole, the results of the meta-analyses carried out for GMA and cognitive abilities for predicting job performance ratings showed that GMA is the best predictor and has generalized validity across samples, occupations and European Community countries. The other cognitive abilities also predict job performance, showing large operational validities, although lower than that of GMA. Finally, the magnitude of the operational validity in the European studies is considerably larger than the magnitudes reported in the American meta-analyses (e.g. Hartigan & Wigdor, 1989; Hunter & Hunter, 1984; Levine et al., 1996; Schmitt et al., 1984).

Despite the relevance of the latter findings, the overriding point is that our results suggest that GMA and specific cognitive abilities show international validity generalization for predicting job performance. Indeed, our results indicate that GMA and cognitive abilities have generalized validity across the European Community countries, and the minimum magnitude of validity generalization is considerable, since the 90%CV for GMA was .37. This value is larger than the operational validity found for the majority of personnel selection procedures. For example, according to Schmidt and Hunter (1998), in American studies work-sample tests and GMA tests are the predictors with the largest operational validities, showing average validities of .54 and .51, respectively (see Salgado, 1999; Salgado, Ones, & Viswesvaran, 2001; and Schmidt & Hunter, 1998, for reviews). This finding, together with the validity generalization findings of the American meta-analyses, strongly supports the hypothesis that there is international validity generalization for GMA in Europe and America. Our results also show that validity generalization is evident for specific cognitive abilities across European Community countries, although of a lower magnitude.

Validity of GMA and Cognitive Abilities for Predicting Training Success

The results of the meta-analyses of validity studies of GMA and cognitive abilities for predicting training success as the criterion appear in Table 6. As in the case of the job performance criterion, the meta-analyses for training success were carried out with a large number of studies and large total sample sizes in all cases.

Insert Table 6 about here

GMA showed an operational validity of .54 and was the best cognitive predictor of training success. This figure is the same as that found by Hunter and Hunter (1984) for the USES database. Therefore, GMA proved to be an excellent predictor of training success. The 90%CV was .29 and, consequently, GMA showed generalized validity across samples, occupations and European Community countries.

Verbal ability showed an operational validity of .44 and a 90%CV of .20. These two findings indicate that verbal ability is a good predictor of training success and that its validity generalizes in the European Community. The percentage of explained variance is similar to that found for GMA. Numerical ability showed an operational validity of .48 and was the second best cognitive predictor of training success. According to its 90%CV, which was the second largest, numerical ability also demonstrated generalized validity across studies. Spatial/mechanical ability showed an operational validity of .40 and it also showed validity generalization, since its 90%CV was .16. The four artifacts considered here accounted for 43% to 47% of the observed variance for these four abilities.

The last two cognitive abilities, perceptual and memory, showed the lowest operational validities. The validity for perceptual ability was .25, a value very similar to the figure found by Hunter and Hunter (1984) and lower than the values found by Hartigan and Wigdor (1989) and Levine et al. (1996). Furthermore, the 90%CV for perceptual ability was 0, indicating that its validity for training success does not generalize in the European Community. Memory showed an operational validity of .34 and its 90%CV was only .08. This last value indicates that the validity of memory tests generalizes to some extent. The explained variance for perceptual ability and memory was very similar, 34% and 35%, respectively.

In summary, the results of the European Community meta-analysis of GMA and cognitive abilities as predictors of training success showed that GMA is the best cognitive predictor for this criterion. Its operational validity is similar to the value found in the American meta-analyses, and the variance accounted for by artifactual errors is also similar in American and European studies. The results for perceptual ability are likewise similar on both continents. However, no comparisons are possible for verbal, numerical, spatial/mechanical, or memory abilities.

As with job performance, GMA showed validity generalization for training success across the European Community countries and across the American studies. Therefore, the hypothesis of international validity generalization is also well supported for this criterion. International validity generalization is likewise supported for some specific abilities (i.e. verbal, numerical, and spatial/mechanical), although not for perceptual ability. In the case of memory, international validity generalization also exists, but it is of very small magnitude.

Discussion

The major finding of the present meta-analytic investigation is that the criterion-related validity of GMA tests in the USA and the EC is notably similar for different criteria, and that the reported validity coefficients generalize internationally. This has important implications for the use of such tests in employee selection on both continents, and also for the ongoing debate between validity generalization and situational specificity alluded to earlier in this paper. As evidenced here, validity generalized internationally for GMA and cognitive ability tests predicting job performance and training success, disconfirming the situational specificity hypothesis.

Validity Generalization of GMA and Specific Cognitive Abilities

The European Community data set collected here contained validity studies carried out in ten member countries of the European Community, all conducted using civilian samples. Regarding the operational validity of GMA, we found it to be .62 for predicting job performance and .54 for predicting training success. These values are among the largest reported in the meta-analytic literature on personnel selection. These results lead us to conclude that, in comparison with other personnel selection procedures, GMA measures are the best predictors of job performance and training success in the European Community, regardless of the country in question (Barrick, Mount, & Judge, 2001; Salgado, 1997, 2002; Salgado & Anderson, 2002; Salgado, Viswesvaran, & Ones, 2001). The validity of the specific cognitive abilities was also large for both criteria. For job performance, validity ranged from .35 (.39 if we exclude the possible outlier) for verbal ability to .56 for memory. For training success, validity ranged from .25 for perceptual ability to .48 for numerical ability. Consequently, our second conclusion is that specific cognitive abilities are good predictors of job performance and training success, although the magnitude of their validities is lower than that of GMA.

Regarding the validity generalization of GMA and specific cognitive abilities, our findings fully support the hypothesis that these measures have generalized validity across settings. For job performance, the 90%CV of GMA was .37, a very large value, and the 90%CVs for numerical ability and memory tests were even larger, .52 and .56, respectively. However, these last two estimates may have been affected by second-order sampling error, in which case they could vary slightly. Similarly, for training success, all cognitive measures showed validity generalization with the exception of perceptual ability. The 90%CVs ranged from .29 for GMA to .08 for memory. The 90%CV for perceptual ability was 0 and, consequently, measures of perceptual ability did not show validity generalization.

Taken together with the American results, these findings suggest our third and most relevant and robust conclusion: there is international validity generalization for GMA and specific cognitive abilities for predicting job performance and training success. In other words, the criterion validity of cognitive measures generalizes across different conceptualizations of job performance and training, differences in unemployment rates, differences in the tests used, and differences in cultural values, demographics, and languages. The only exception found within the current investigation was for perceptual ability tests as predictors of training success.

Similarities and Differences Between American and European Community Findings

It is also of interest to comment on the similarities and differences between the American and the European Community findings. There is very close agreement for training success, as the magnitude of the operational validity is very similar on both continents. There is also a similarity between our findings and the results of a recent meta-analysis by Kuncel, Hezlett and Ones (2001) on the validity of cognitive abilities for predicting educational criteria. Specifically, the results of the two meta-analyses are similar when we compare our results for training success with Kuncel et al.'s results for the faculty rating criterion. For example, we found operational validities of .44 and .48 for verbal and numerical abilities, respectively, and Kuncel et al. (2001) found operational validities of .42 and .47 for the same abilities.

However, there are important differences in the case of job performance. The magnitude of the validity for GMA in the European Community is larger than the validity found in the American studies. Furthermore, GMA predicts job performance better than training success in the European Community, while GMA predicts training success better than job performance in the USA. These differences suggest that job performance may be predicted differentially in the European Community compared to the USA. A possible explanation is that training success is assessed as typical performance in the EC while it is assessed as maximal performance in the US (Ackerman, 1994; Ackerman & Humphreys, 1991). It is also possible that objective measures of training success and ratings assess different constructs. In the case of the European studies, as the majority of them used ratings, the construct assessed may be job performance, while in the USA studies the construct examined could be job knowledge (i.e. declarative knowledge). A third possible explanation is methodological. Many European studies used a dichotomous criterion for assessing training. This is an artifactual error that produces an underestimation of the real validity, the magnitude of which depends on the proportion of cases in the successful and unsuccessful groups. For example, if the proportion is .50, the underestimation yields a figure that is 80% of the real validity; however, the underestimation can be as dramatic as recovering only 25% of the real value. A final explanation for these different findings is that the European Community data set may have contained a larger proportion of studies of more complex jobs than the American data sets.
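The attenuation produced by dichotomizing a continuous criterion can be sketched with the standard point-biserial formula (a worked illustration under the usual normality assumption; our notation, not the paper's own computation):

```latex
% Dichotomizing a continuous criterion at proportion p attenuates
% a true correlation r to the point-biserial value
r_{pb} = r \cdot \frac{\varphi(z_{p})}{\sqrt{p\,(1-p)}}
% where \varphi(z_{p}) is the standard normal density at the cut point z_{p}.

% Median split (p = .50):
%   \varphi(0)/\sqrt{.25} = .3989/.50 \approx .80,
%   i.e. about 80% of the real validity is recovered.
% Extreme split (p = .99):
%   \varphi(2.33)/\sqrt{.0099} \approx .027/.099 \approx .27,
%   i.e. only about a quarter of the real validity survives.
```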

Other relevant similarities were found for job performance reliability and the range restriction of scores on the cognitive predictors. We found that the average inter-rater reliability for job performance ratings was .52. This is exactly the estimate found in the large-scale meta-analysis carried out by Viswesvaran et al. (1996), using studies mainly conducted in the USA. This convergence in the inter-rater reliability of job performance ratings shows that reliability is independent of the type of scale and is not nationally limited. Our results, like those of Viswesvaran et al. (1996), support the value used by Hunter and Hunter (1984) in their meta-analysis of GATB validity and contradict the position of the National Research Council panel (Hartigan & Wigdor, 1989), which considered .60 (i.e. Hunter's value) an unrealistic value for job performance ratings. In fact, our results suggest that Hunter's estimate was a conservative one. Another similarity was found in the magnitude of the range restriction estimates. Hunter (1986; Hunter & Hunter, 1984) used a value of .67 as the range restriction ratio of GMA scores in studies carried out in civilian occupations when predicting job performance, and .60 for studies using training success. In our data set we found a range restriction ratio of .62 for studies using job performance and .67 for studies using training success. Thus, there is only a small difference between Hunter's estimates and ours.
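The two artifact corrections discussed here can be sketched with the classical disattenuation formula and Thorndike's Case II range restriction correction (a hedged illustration in our notation; the paper's exact correction sequence is the one described in its Method section):

```latex
% Criterion unreliability: divide the observed validity r by \sqrt{r_{yy}};
% with r_{yy} = .52 this means dividing by roughly .72.
r_{1} = \frac{r}{\sqrt{r_{yy}}}

% Direct range restriction (Thorndike Case II), where u is the ratio of
% restricted to unrestricted predictor SDs and U = 1/u:
\rho = \frac{U \, r_{1}}{\sqrt{1 + r_{1}^{2}\,(U^{2} - 1)}}
% e.g. u = .62 for the job performance studies gives U \approx 1.61.
```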

Implications for the Theory and Practice of Personnel Selection

The findings of this research have important implications for the theory and practice of personnel selection. Crucially, our findings contradict the view that the criterion-related validity of GMA tests is moderated by differences in country culture, religion, language, socio-economic conditions, or employment legislation (e.g. Herriot & Anderson, 1997; Newell & Tansley, 2001). At least across EC countries, these results demonstrate unequivocal international generalizability for cognitive ability tests. By implication, these findings hint at the scientific feasibility of a general theory of personnel selection, applicable at least to the USA and EC countries and probably worldwide. In such a theory, GMA would have a core position, as the magnitude of its validity appears to be the largest of all personnel selection procedures. However, caution is warranted against claiming wider, global generalizability, as Europe and the US share aspects of cultural similarity that may not be shared by all cultures.

From a practical point of view, our findings scientifically support the use of GMA and cognitive measures for selection purposes for all kinds of occupations in the EC and USA. They also suggest that the wide use of cognitive measures in personnel selection for all jobs in all countries, as documented in European surveys, has a scientific basis. There is now unequivocal evidence that GMA tests are very good predictors of job performance and training success across the US and the EC, and therefore the transfer of these findings to organizational practices in employee selection is important (Anderson, Herriot & Hodgkinson, 2001). While some selection practitioners may believe that tests of specific cognitive abilities produce higher criterion-related validity than GMA, our results refute this notion: tests of specific abilities such as verbal, numerical, spatial/mechanical, perceptual and memory failed to demonstrate higher validity than GMA measures. It is thus prudent to reiterate the main practical implication of this finding: GMA tests predicted these two criteria most successfully.

Some Limitations of the Present Research

This research has some limitations. First, only two criteria were considered, yet job activity is a multi-dimensional construct that incorporates both actions and omissions related to work goals. In this sense, job performance and training success are only facets of job activity, which includes both productive actions (e.g. job performance, training success, outputs, career advancement, citizenship behavior, knowledge sharing) and counter-productive actions (e.g. thefts, accidents, absences, norm violations) (see Campbell, McHenry, & Wise, 1990; Sackett, 2002; Viswesvaran, 2002; Viswesvaran & Ones, 2000). A second limitation concerns possible group differences in GMA. In the European Community context, little research has addressed this issue, although some studies have suggested that there is no differential prediction for minority groups (te Nijenhuis & van der Flier, 1997). A third limitation is that we have not examined whether the criterion-related validity of GMA and specific cognitive abilities is moderated by occupation type. Despite these three limitations, the present research was designed as an initial investigation into the more restricted question of the international validity generalization of GMA tests in personnel selection in the countries of the European Community. A call for future research to address these limitations is therefore warranted. In addition, similar meta-analytic validity studies should be conducted on data sets obtained from Africa and Asia, and from the American countries not represented in the US meta-analyses.

Authors’ note

The authors wish to thank three anonymous reviewers for their comments on a previous version of this article. The preparation of this manuscript was supported by the Ministerio de Ciencia y Tecnología grant no. BSO2001-3070 to Jesús F. Salgado, and by a review grant from the British Army Research Establishment to Neil Anderson. Correspondence should be sent to Jesús F. Salgado, Departamento de Psicología Social, Universidad de Santiago de Compostela, 15782 Santiago de Compostela, Spain (E-mail: psjesal@usc.es)

References

References marked with an asterisk indicate studies included in the meta-analysis.

Ackerman, P.L. (1994). Intelligence, attention, and learning: Maximal and typical performance. In D.K. Detterman (Ed.), Current topics in human intelligence: Vol. 4. Theories of intelligence (pp. 1-27). Norwood, NJ: Ablex.

Ackerman, P.L. & Humphreys, L.G. (1991). Individual differences theory in industrial and organizational psychology. In M.D. Dunnette & L.M. Hough (Eds). Handbook of Industrial and Organizational Psychology. Vol 1. (pp. 223-282). Palo Alto, CA: Consulting Psychologists Press.

*Amthauer, R. (1973). I-S-T 70. Intelligenz-Struktur-Test [Intelligence-Structure-Test]. Göttingen: Hogrefe.

*Amthauer, R., Brocke, B., Liepmann, D. & Beauducel, A. (1999). I-S-T 2000. Intelligenz-Struktur-Test 2000 [I-S-T 2000. Intelligence-Structure-Test 2000]. Göttingen: Hogrefe.

*Anderberg, R. (1936). Psychotechnische Rekrutierungsmethoden bei den schwedischen Staatsbahnen [Psychotechnical methods of recruitment in the Swedish railways]. Industrielle Psychotechnik, 13, 353-383.

Anderson, N., Herriot, P., & Hodgkinson, G.P. (2001). The practitioner-researcher divide in Industrial, Work and Organizational (IWO) psychology: Where are we now, and where do we go from here? Journal of Occupational and Organizational Psychology, 74, 391-441.

*Anstey, E. (1963). D-48. Manual. Madrid: Tea.

Ashworth, S.D., Osburn, H.G., Callender, J.C., & Boyle, K.A. (1992). The effects of unrepresented studies on the robustness of validity generalization results. Personnel Psychology, 45, 341-361.

*Bacqueyrisse, L. (1935). Psychological tests in the Paris tramway and omnibus services. Human Factor, 9, 231-234.

*Banister, D., Slater, P., & Radzan, M. (1962). The use of cognitive tests in nursing candidate selection. Occupational Psychologist, 36, 75-78.

Barrick, M.R., Mount, M.K., & Judge, T. (2001). Personality and performance at the beginning of the new millennium: What do we know and where do we go next. International Journal of Selection and Assessment, 9, 9-30.

*Bartram, D., & Dale, H.C.A. (1982). The Eysenck Personality Inventory as a selection test for military pilots. Journal of Occupational Psychology, 55, 287-296.

*Beaton, R. (1982). Use of tests in a national retail company. In K. Miller (Ed.). Psychological testing in personnel assessment (pp. 137-148). Epping, UK: Gower Press.

*Blanco, M.J. & Salgado, J.F. (1992). Diseño y experimentación de un modelo de intervención de la psicología en el reconocimiento y selección de conductores profesionales [Design and testing of a model of psychological intervention in the assessment and selection of professional drivers]. In AA.VV. (Eds.), Conducta y Seguridad Vial (pp. 5-57). Madrid: Fundación Mapfre.

*Boerkamp, R.G. (1974). Een criteriuminstrument voor de bedrijfsselectie-situatie. [A criterion measure for applicant-selection in companies]. Unpublished Doctoral Dissertation, University of Amsterdam, The Netherlands.

*Bonnardel, R. (1949). L’emploi des méthodes psychométriques pour le contrôle des conditions psychologiques du travail dans les ateliers [Use of psychometric methods for monitoring the psychological conditions of work in workshops]. Le Travail Humain, 12, 75-85.

*Bonnardel, R. (1949). Examens psychométriques et promotion ouvrière (étude portant sur un groupe d’ouvriers électriciens en cours de perfectionnement) [Psychometric tests and worker promotion (a study conducted with a group of electrician workers in a training course)]. Le Travail Humain, 12, 113-117.

*Bonnardel, R. (1949). Recherche sur la promotion des ouvriers dans le cadres de maitrise [Research on the promotion of workers to industry supervisor]. Le Travail Humain, 12, 245-256.

*Bonnardel, R. (1954). Appréciations professionnelles et notations psychométriques: étude portant sur un groupe de jeunes ouvriers [Professional appraisals and psychometric scores: A study conducted with a group of young workers]. Le Travail Humain, 17, 119-125.

*Bonnardel, R. (1954). Examen de chauffeurs de camions au moyen de tests de réactions [Examination of truck drivers using reaction tests]. Le Travail Humain, 17, 272-281.

*Bonnardel, R. (1956). Un exemple des difficultés soulevées par la question des critères professionnels [An example of the difficulties raised by the question of professional criteria]. Le Travail Humain, 19, 234-237.

*Bonnardel, R. (1958). Comparaison d’examens psychométriques de jeunes ingénieurs et de cadres administratifs [Comparison of psychometric tests between young engineers and administrative supervisors]. Le Travail Humain, 21, 248-253.

*Bonnardel, R. (1959). Liaisons entre résultats d’examens psychométriques et réussite dans le perfectionnement professionnel ouvrier [Relations between psychometric test results and success in workers’ professional development]. Le Travail Humain, 22, 239-246.

*Bonnardel, R. (1961). Recherche sur la promotion des ouvriers dans la maitrise [Research on the promotion of workers in industry]. Le Travail Humain, 24, 21-34.

*Bonnardel, R. (1962). Recherche sur le recrutement à des cours d’étude du travail au moyen de techniques psychométriques [Research on recruitment to work-study courses by means of psychometric techniques]. Le Travail Humain, 25, 73-78.

Borman, W., Penner, L.A., Allen, T.D., & Motowidlo, S.J. (2001). Personality predictors of citizenship performance. International Journal of Selection and Assessment, 9, 52-69.

*Bruyere, M.J. (1937). Quelques données sur l’intelligence logico-verbale et les aptitudes techniques pour l’orientation vers la carrière d’ingénieur [Some data on logico-verbal intelligence and technical aptitudes for guidance toward the engineering career]. Bulletin de l’Institut National d’Orientation Professionnelle, 9, 141-147.

*Burt, C. (1922). Test for clerical occupations. Journal of the National Institute of Industrial Psychology, 1, 23-27 & 79-82.

Campbell, J.P., McHenry, J.J. & Wise, L.L. (1990). Modeling job performance in a population of jobs. Personnel Psychology, 43, 313-333.

*Campos, F. (1947). Selección de aprendices. [Apprentice selection] Revista de Psicología General y Aplicada, 2,

Carretta, T.R. & Ree, M.J. (1997). Expanding the nexus of cognitive and psychomotor abilities. International Journal of Selection and Assessment, 5, 149-158.

Carretta, T. & Ree, M.J. (2000). General and specific cognitive and psychomotor abilities in personnel selection: The prediction of training and job performance. International Journal of Selection and Assessment, 8, 227-236.

Carretta, T. & Ree, M.J. (2001). Pitfalls of ability research. International Journal of Selection and Assessment, 9, 325-335.

*Castillo-Martin, J. (1963). Estudio del Otis en sujetos de la industria eléctrica [A study of the Otis test with electrical industry workers]. Revista de Psicología General y Aplicada, 18, 1015-1020.

*Castillo, J., Bueno, R., Moldes, F., Fernández, J., & Barras, M. (1969). Estudio estadístico del test de matrices progresivas de Raven, escalas general y especial. [A statistical study of Raven’s Progressive Matrices Tests, general and special scales]. Revista de Psicología General y Aplicada, 24, 1004-1009.

*Castle, P.F.C., & Garforth, F.I. (1951). Selection, training and status of supervisors. Occupational Psychologist, 25, 109-123.

*Cattell, R.B. (1981). Manual del test Factor G. Escalas 2 y 3. [Manual for the Factor G Test, Scales 2 and 3]. Madrid: Tea.

*Childs, R. (1990). Graduate and managerial assessment data supplement. Windsor, Berkshire, UK: ASE, NFER-Nelson.

*Chleusebairgue, A. (1939). Industrial psychology in Spain. Occupational Psychology, 13, 33-41.

*Chleusebairgue, A. (1940). The selection of drivers in Barcelona. Occupational Psychology, 14, 146-161.

*Cleary, T.A. (1968). Test bias: Prediction of grades of Negro and white students in integrated colleges. Journal of Educational Measurement, 5, 115-124.

*de Buvry de Mauregnault, D. (1998). Evaluatie selectieprocedure automatiseringsmedewerkers, validatieonderzoek aan de hand van task performance en contextual performance. [Evaluation of the selection procedure of automation employees: validation research of task and contextual performance]. Major paper of Work and Organizational Psychology, University of Amsterdam.

*Decroly, O. (1926). Étude sur les aptitudes nécessaires au relieur [Study of the aptitudes needed for bookbinding]. Bulletin de l’Office d’Orientation Professionnelle, 26, 1-31.

*Farmer, E. (1930). A note on the relation of certain aspects of character to industrial proficiency. British Journal of Psychology, 21, 46-49.

*Evers, A. (1977). De constructie van een testbatterij voor de selectie van leerling-verplegenden aan de `De Tjonsgerschans´ [The construction of a test battery for the selection of student nurses at De Tjonsgerschans]. Unpublished first report.

*Farmer, E. (1933). The reliability of the criteria used for assessing the value of vocational tests. British Journal of Psychology, 24, 109-119.

*Farmer, E. & Chambers, E.G. (1936). The prognostic value of some psychological tests. Industrial Health Research Board Report No. 74. London: HM Stationery Office.

*Feltham, R. (1988). Validity of a police assessment centre: A 1-19 year follow-up. Journal of Occupational Psychology, 61, 129-144.

*Fokkema, S.D. (1958). Aviation psychology in The Netherlands. In J.B. Parry & S.D. Fokkema (Eds.), Aviation psychology in Western Europe and a report on studies of pilot proficiency measurement (pp. 58-69). Amsterdam: Swets & Zeitlinger.

*Foster, W.R. (1924). Vocational selection in a chocolate factory. Journal of the National Institute of Industrial Psychology, 3, 159-163.

*Frankford, A.A. (1932). Intelligence tests in nursing. Human Factor, 6, 451-453.

*Frisby, C.B. (1962). The use of cognitive tests in nursing candidate selection: A comment. Occupational Psychology, 36, 79-81.

*García-Izquierdo, A. (1998). Validación de un proceso de selección de personal en un centro de formación del sector de la construcción [Validity of a personnel selection process in a training center of the construction industry]. Unpublished doctoral dissertation, University of Oviedo, Spain.

*Germain, J., Pinillos, J.L., Garcia-Moreno, E., & Aberasturi, N.L. (1969). La validez de unas pruebas selectivas para conductores. [Validity of drivers’ selection tests]. Revista de Psicología General y Aplicada, 24, 1067-1114.

*Goguelin, P. (1950). Recherches sur la sélection des conducteurs de véhicules [Research on the selection of vehicle drivers]. Le Travail Humain, 13, 9-35.

*Goguelin, P. (1951). Étude du poste d’électricien de tableau et examen de sélection pour le poste [Study of the switchboard electrician job and selection examination for the job]. Le Travail Humain, 14, 15-65.

*Goguelin, P. (1953). Étude du poste de dispatcher dans l’industrie électrique et de la sélection pour ce poste [Study of the dispatcher job in the electrical industry and selection for this job]. Le Travail Humain, 16, 197-205.

*Gonzalez-Peiró, A., Sanz-Cid, A., & Ferrer-Martin, A. (1984). Prevención de accidentes mediante una batería psicofisiológica [Prevention of accidents using a psychophysiological battery]. I International Meeting on Psychology and Traffic Security, Valencia, Spain.

*Greuter, M.A.M., & Smit-Voskuyl, O.F. (1983). Psychologisch onderzoek geëvalueerd, een valideringsonderzoek naar de psychologische selectie bij de VNG [Psychological assessment evaluated: A validation study of the psychological selection process at the VNG]. Research report, University of Amsterdam.

*Handyside, J. D., & Duncan, D.C. (1954). Four Years Later: A follow-up of an experiment in selecting supervisors. Occupational Psychology, 28, 9-23.

*Hänsgen, K.D. (2000). Evaluation des Eignungstests für das Medizinstudium in der Schweiz: Zuverlässigkeit der Vorhersage von Studienerfolg [Evaluation of the aptitude test for medical studies in Switzerland: Reliability of the prediction of study success]. CTD Centre for Test Development and Diagnostics, Department of Psychology, University of Fribourg, Switzerland.

Hartigan, J.A., & Wigdor, A.K. (Eds.) (1989). Fairness in employment testing: Validity generalization, minority issues, and the General Aptitude Test Battery. Washington, DC: National Academy Press.

*Hebling, H. (1987). The self in career development: Theory, measurement and counseling. Unpublished doctoral dissertation. University of Amsterdam.

*Hebling, J.C. (1964). Results of the selection of R.N.A.F. air traffic and fighter control officers. In A.Cassie, S.D. Fokkema & J.B. Parry (Eds.). Aviation psychology. Studies on accident liability, proficiency criteria, and personnel selection (pp 71-76). The Hague: Mouton.

*Heim, A.W. (1946). An attempt to test high-grade intelligence. British Journal of Psychology, 37-38, 71-81.

*Henderson, P., & Boohan, M. (1987). SHL’s technical test battery: Validation. Guidance and Assessment Review, 3 (6), 3-4.

*Henderson, P., & Hopper, F. (1987). SHL’s TTB: Development of norms. Guidance and Assessment Review, 3 (6), 5-6.

*Henderson, P., Lockhart, H., & O’Reilly, S. (1987). SHL’s technical test battery: Test-retest reliability. Guidance and Assessment Review, 3, (3), 4-5.

*Henderson, P. & O’Hara, R. (1990). Norming and validation of SHL’s personnel test battery in Northern Ireland. Guidance and Assessment Review, 6 (4), 2-3.

Herriot, P. & Anderson, N.R. (1997). Selecting for change: How will personnel selection psychology survive? In N. Anderson and P. Herriot (Eds.), International handbook of selection and assessment (pp. 1-34). London, UK: Wiley.

Hofstede, G. (1980). Culture’s consequences: International differences in work-related values. Beverly Hills, CA: Sage.

Hough, L.M., Oswald, F.L., & Ployhart, R.E. (2001). Determinants, detection, and amelioration of adverse impact in personnel selection procedures: Issues, evidence, and lessons learned. International Journal of Selection and Assessment, 9, 152-193

Hunter, J.E. (1986). Cognitive ability, cognitive aptitudes, job knowledge, and job performance. Journal of Vocational Behavior, 29, 340-362.

Hunter, J.E. & Hirsh, H.R. (1987). Applications of meta-analysis. In C.L. Cooper & I.T. Robertson (Eds.), International Review of Industrial and Organizational Psychology, Vol. 2 (pp. 321-357). Chichester, UK: Wiley.

Hunter, J.E. & Hunter, R.F. (1984). Validity and utility of alternative predictors of job performance. Psychological Bulletin, 96, 72-98.

Hunter, J.E. & Schmidt, F.L. (1990). Methods of meta-analysis. Correcting error and bias in research findings. Newbury Park, CA: Sage.

*Jäger, A.O. & Althoff, K. (1994). Der WILDE-Intelligenz Test (WIT). [WILDE Intelligence Test]. Göttingen: Hogrefe.

*James, D.J. (1964). Prediction of performance in the early stages of flying-training. In A.Cassie, S.D. Fokkema & J.B. Parry (Eds.). Aviation psychology. Studies on accident liability, proficiency criteria, and personnel selection (pp 78-82). The Hague: Mouton.

*Jones, E.S. (1917). The Woolley-test series applied to the detection of ability in telegraphy. Journal of Educational Psychology, 8, 27-34.

Kehoe, J. (2002). General mental ability and selection in private sector organizations: a commentary. Human Performance, 15, 97-106.

*Keizer, L. & Elshout, J.J. (1969). Een onderzoek naar de validiteit van selectierapporten en enkele andere voorspellers. [A study of the validity of selection protocols and other predictors]. Unpublished Report.

*Kokorian, A. & Valser, C. (1999). Computer-based assessment for aircraft pilots: The pilot aptitude tester (PILAPT). 41st Annual Conference of the International Military Testing Association (IMTA). Monterey, CA.

*Kragh, U. (1960). The defense mechanism test: a new method for diagnosis and personnel selection. Journal of Applied Psychology, 44, 303-309.

*Krielen, F. (1975). De validiteit van de testbatterijen `Middelbaar Algemeen´ en `Middelbaar Administratief´. [The validity of the test batteries `Middelbaar Algemeen´ and `Middelbaar Administratief´]. Unpublished doctoral dissertation. University of Amsterdam.

Kuncel, N., Hezlett, S.A., & Ones, D.S. (2001). A comprehensive meta-analysis of the predictive validity of the Graduate Record Examinations: Implications for graduate student selection and performance. Psychological Bulletin, 127, 162-181.

*Lahy, J.M. (1933). Sur la validité des tests exprimée en « pourcent » d’échecs. Le Travail Humain, 1, 24-31.

*Lahy, J.M. (1934). La sélection professionnelle des aiguilleurs. Le Travail Humain, 2, 15-38.

*Lahy, J.M. & Korngold, S. (1936). Recherches experimentales sur les causes Psychologiques des accidents du travail. Le Travail Humain, 4, 21-59.

*Lahy, J.M. & Korngold, S. (1936). Sélection des opératrices des machines a perforer « samas » et « hollerith ». Le Travail Humain, 4, 280-290.

*Ledent, R. & Wellens, L. (1935). La selection et la surveillance des conducteurs des tramsways unifiés de Liége et extensions. Le Travail Humain, 3, 401-405.

Levine, E.L., Spector, P.E., Menon, S., Narayanan, L., & Cannon-Bowers, J. (1996). Validity generalization for cognitive, psychomotor, and perceptual tests for craft jobs in the utility industry. Human Performance, 9, 1-22.

Levy-Leboyer, C. (1994). Selection and assessment in Europe. In H.C. Triandis, M.D. Dunnette & L.M. Hough (Eds). Handbook of Industrial and Organizational Psychology. Vol 4. (pp. 173-190). Palo Alto, CA: Consulting Psychologists Press.

*Mace, C.A. (1950). The human problems of the building industry: Guidance, selection and training. Occupational Psychology, 24, 96-104.

*McKenna, F.P., Duncan, J., & Brown, I.D. (1986). Cognitive abilities and safety on the road: a re-examination of individual differences in dichotic listening and search for embedded figures. Ergonomics, 29, 649-663.

*Meili, R. (1951). Experiences et observations sur l’intelligence pratique-technique [Experiences and observations on technical-practical intelligence]. L´Année Psychologique, 50, 557-574.

*Miles, G.H. (1924-25). Economy and safety in transport. Journal of the National Institute of Industrial Psychology, 2, 192-197.

*Miller, K. (1982). Choosing tests for clerical selection. In K. Miller (Ed.). Psychological testing in personnel assessment (pp. 109-121). Epping, UK: Gower Press.

*Miret y Alsina, F.J. (1964). Aviation psychology in the Sabena. In J.D. Parry & S.D. Fokkema (Eds.). Aviation psychology in western-Europe and a report on studies of pilot proficiency measurement (pp. 22-30). Amsterdam: Swets & Zeitlinger.

*Mira, E. (1922-23). La selecció del xòfers de la companyia general d’autòmnibus. [Selection of drivers for the general company of buses]. Annals de L´Institut d’Orientació Professional, 3-4, 60-71.

*Montgomery, G.W.G. (1962). Predicting success in engineering. Occupational Psychologist, 36, 59-80.

*Moran, A. (1986). The reliability and validity of Raven’s standard progressive matrices for Irish apprentices. International Review of Applied Psychology, 35, 533-538.

*Morin, J. (1954). Notation psychométrique et professionnelle de jeunes ingénieurs. Le Travail Humain, 20, 201-211.

*Mulder, F. (1974). Characteristics of violators of formal company rules. Journal of Applied Psychology, 55, 500-502.

*Munro, M.S. & Raphael, W. (1930). Selection tests for clerical occupations. Journal of the National Institute of Industrial Psychology, 5, 127-137.

Murphy, K.R. (2002). Can conflicting perspectives on the role of g in personnel selection be resolved? Human Performance, 15, 173-186.

Murphy, K.R. & Shiarella, A. (1997). Implications of the multidimensional nature of job performance for the validity of selection tests: Multivariate frameworks for studying test validity. Personnel Psychology, 50, 823-854.

*National Institute of Industrial Psychology (1932). The selection of salesmen. Human Factor, 6, 26-29.

*Nelson, A., Robertson, I.T., Walley, L. & Smith, M. (1998). Personality and work performance: some evidence from small and medium-sized firms. Occupational Psychologist, 12, 28-36.

*Neuman, E. (1938). Psychotechnische Eignungsprüfung und Anlernung im Flugmotorenbau. [Psychotechnical aptitude testing and training in aircraft engine construction]. Industrielle Psychotechnik, 15, 111-162.

Newell, S. & Tansley, C. (2001). International uses of selection methods. In C.L. Cooper & I.T. Robertson (Eds.), International Review of Industrial and Organizational Psychology, Vol. 21 (pp. 195-213). Chichester, UK: Wiley.

*Newman, S. H., & Howell, M. A. (1961). Validity of forced choice items for obtaining references on physicians. Psychological Reports, 8, 367.

*Notenboom, C.G.M. & van Leest, P.F. (1993). Psychologische selectie werkt! [The psychological selection works!]. Tijdschrift voor de Politie, 3, 65-67.

Nunnally, J. (1978). Psychometric theory. 2nd edition. New York: McGraw-Hill.

*Nyfield, G., Gibbons, P.J., Baron, H., & Robertson, I. (1995). The cross-cultural validity of management assessment methods. Paper presented at the 10th Annual SIOP Conference, May, 1995. Orlando, USA.

Olea, M.M. & Ree, M.J. (1994). Predicting pilot and navigator criteria: Not much more than g. Journal of Applied Psychology, 79, 845-851.

*Pacaud, S. (1946). Recherches sur la sélection psychotechnique des agents de gare dits « facteurs-enregistrants ». Le Travail Humain, 9, 23-73.

*Pacaud, S. (1946). Recherches sur la sélection professionnelle des opératrices de machines a perforer et de machines comptables. La « sélection omnibus » est-elle possible en mécanographie? Le Travail Humain, 9, 74-86.

*Pacaud, S. (1947). Sélection des mécaniciens et des chauffeurs de locomotive. Étude de la validité des tests employés et composition des batteries sélectives. Le Travail Humain, 10, 180-253.

*Patin, J. & Vinatier, H. (1962). Un test verbal: Le CERP 15. Bulletin du CERP, 11, 327-342.

*Patin, J. & Vinatier, H. (1963). Un test d’intelligence : Le CERP 2. Bulletin du CERP, 12, 187-201.

*Patin, J. & Vinatier, H. (1963). Le test CERP 14 du docteur Morali-Daninos. Bulletin du CERP, 12, 381-390.

*Patin, J. & Nodiot, S. (1962). Le test mécanique de P. Rennes. Bulletin du CERP, 11, 61-97.

*Patin, J. & Panchout, M.F. (1964). L’admission des candidats à une formation professionnelle de techniciens. Approche docimologique, validités du concours et de l’examen psychotechnique. Bulletin du C.E.R.P., 13, 147-165.

Pearlman, K., Schmidt, F.L., & Hunter, J.E. (1980). Validity generalization results for tests used to predict job proficiency and training success in clerical occupations. Journal of Applied Psychology, 65, 373-406.

*Pérez-Herrera, J. & Perez-Muñoz, J. (1984). Validez de la batería Belrampa. [Validity of the Belrampa Battery]. II Congreso de Psicología del Trabajo. Valencia, Spain.

*Petrie, A., & Powell, M.B. (1951). The selection of nurses in England. Journal of Applied Psychology, 35, 281-286.

Ree, M.J. & Carretta, T.R. (1994). The correlation of general cognitive ability and psychomotor tracking tests. International Journal of Selection and Assessment, 2, 209-216.

Ree, M.J. & Earles, J.A. (1991). Predicting training success: Not much more than g. Personnel Psychology, 44, 321-332.

Ree, M.J., Earles, J.A., & Teachout, M.S. (1994). Predicting job performance: Not much more than g. Journal of Applied Psychology, 79, 518-524.

*Roe, R.A. (1973). Een onderzoek naar de validiteit van een selectieprocedure voor lager en uitgebreid lager administratieve functies. [A validity study of a selection procedure for lower level administrative jobs]. Unpublished report. University of Amsterdam.

Roe, R.A. (1989). Designing selection procedures. In P. Herriot (Ed.), Assessment and selection in organizations (pp. xx-xx). Chichester, UK: Wiley.

*Roe, R.A. & Boerkamp, R.G. (1971). De selectie vooor functies uit de groep Uitgebreid Algemen. Een validatiestudie. [The selection for functions of the cluster `Uitgebreid Algemen´]. Technical report, University of Amsterdam, The Netherlands.

*Roloff, H.P. (1928). Über Eignung und Bewährung. [On aptitude and validity]. Beihefte zur Zeitschrift für Angewandte Psychologie, 148. (Abstract in L’Année Psychologique, 1928, 29, 897-899).

*Ross, J. (1962). Predicting practical skill in engineering apprentices. Occupational Psychologist, 36, 69-74.

Rothstein, H.R. (1990). Interrater reliability of job performance ratings: Growth to asymptote level with increasing opportunity to observe. Journal of Applied Psychology, 75, 322-327.

Ryan, A.M., McFarland, L., Baron, H. & Page, R. (1999). An international look at selection practices: Nation and culture as explanations for variability in practice. Personnel Psychology, 52, 359-391.

Sackett, P.R. (2002). The structure of counterproductive work behaviors: Dimensionality and relationships with facets of job performance. International Journal of Selection and Assessment, 10, 5-11.

*Salgado, J.F. (1993). Validación sintética y utilidad de pruebas de habilidades cognitivas por ordenador. [Synthetic validity and utility of computerized cognitive ability tests]. Revista de Psicología del Trabajo y las Organizaciones, 11, 79-92.

Salgado, J.F. (1997). The Five Factor Model of personality and job performance in the European Community. Journal of Applied Psychology, 82, 30-43.

Salgado, J.F. (1999). Personnel selection methods. In C.L. Cooper & I.T. Robertson (Eds.), International Review of Industrial and Organizational Psychology, Vol. 14 (pp. 1-53). Chichester, UK: Wiley.

Salgado, J.F. (2002). The Big Five personality dimensions and counterproductive behaviors. International Journal of Selection and Assessment, 10, 117-125.

Salgado, J.F. & Anderson, N. (2002). Cognitive and GMA testing in the European Community: Issues and Evidence. Human Performance, 15, 75-96.

*Salgado, J.F. & Blanco, M.J. (1990). Validez de las pruebas de aptitudes cognitivas en la selección de oficiales de mantenimiento de la Universidad de Santiago de Compostela. [Validity of cognitive ability tests for selecting maintenance workers at the University of Santiago de Compostela]. Paper presented at the Third National Congress of Social Psychology, Santiago de Compostela, Spain.

Salgado, J.F., Viswesvaran, C., & Ones, D.S. (2001). Predictors used for personnel selection. In N. Anderson, D.S. Ones, H.K. Sinangil, & C. Viswesvaran (Eds.), Handbook of Industrial, Work, & Organizational Psychology, Vol. 1 (pp. 165-199). London, UK: Sage.

*Samuel, J.A. (1970). The effectiveness of aptitude tests in the selection of postmen. Studies in Personnel Psychology, 2, 65-73.

*Sanchez, J. (1969). Validación de tests con monitores de formación. [Test validation with training monitors]. Revista de Psicología General y Aplicada, 24, 301-313.

*Schmidt-Atzert, L. & Deter, B. (1993). Intelligenz und ausbildungserfolg: Eine untersuchung zur prognostischen validität des I-S-T 70. [Intelligence and training success: A study of the predictive validity of the I-S-T 70]. Zeitschrift für Arbeits und Organisationspsychologie, 37, 52-63.

*Schmidt-Atzert, L. & Deter, B. (1993). Die vorhersage des ausbildungserfolgs bei verschiedenen berufsgruppen durch leistungstests. [The prediction of training success in different jobs by achievement tests]. Zeitschrift für Arbeits und Organisationspsychologie, 37, 191-196.

Schmidt, F.L. (2002). The role of general cognitive ability and job performance: Why there cannot be a debate. Human Performance, 15, 187-211.

Schmidt, F.L. & Hunter, J.E. (1977). Development of a general solution to the problem of validity generalization. Journal of Applied Psychology, 62, 529-540.

Schmidt, F.L. & Hunter, J.E. (1996). Measurement error in psychological research: Lessons from 26 research scenarios. Psychological Methods, 1, 199-223.

Schmidt, F.L. & Hunter, J.E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124, 262-274

Schmidt, F.L. & Hunter, J.E. (1999). Theory testing and measurement error. Intelligence, 27, 183-198

Schmidt, F.L., Hunter, J.E., & Caplan, J.R. (1981). Validity generalization results for two job groups in the petroleum industry. Journal of Applied Psychology, 66, 261-273.

Schmidt, F.L., Hunter, J.E, & Pearlman, K. (1981). Task differences and the validity of aptitude tests in selection: A red herring. Journal of Applied Psychology, 66, 166-185.

Schmidt, F.L., Hunter, J.E., Pearlman, K., & Shane, G.S. (1979). Further tests of the Schmidt-Hunter Bayesian validity generalization model. Personnel Psychology, 32, 257-281.

Schmidt, F.L., Ones, D.S., & Hunter, J.E. (1992). Personnel selection. Annual Review of Psychology, 43, 627-670.

Schmidt, F.L., Hunter, J.E., & Urry, V.W. (1976). Statistical power in criterion-related validation studies. Journal of Applied Psychology, 61, 473-485.

Schmitt, N., Gooding, R.Z., Noe, R.D., & Kirsch, M. (1984). Meta-analyses of validity studies published between 1964 and 1982 and the investigation of study characteristics. Personnel Psychology, 37, 407-422.

*Schuler, H. (1993). Social validity of selection situations: A concept and some empirical results. In H. Schuler, J.L. Farr, & M. Smith (Eds.), Personnel selection and assessment: Individual and organizational perspectives (pp xx-xx). Hillsdale, N.J.: Erlbaum.

*Schuler, H., Moser, K., Diemand, A. & Funke, U. (1995). Validität eines einstellungsinterviews zur prognose des ausbildungserfolgs. [Validity of an employment interview for the prediction of training success]. Zeitschrift für Pädagogische Psychologie, 9, 45-54.

*Schuler, H., Moser, K., & Funke, U. (1994). The moderating effect of rater-ratee acquaintance on the validity of an assessment center. Paper presented at the 23rd Congress of the International Association of Applied Psychology, July, Madrid.

*Serrano, P., Garcia-Sevilla, L., Perez, J.L., Pina, M. & Ruiz, J.R. (1986). Municipal police evaluation: psychometric versus behavioral assessment. In J.C. Yuille (Ed.), Police selection and training: The role of Psychology (pp. 257-265). Dordrecht: Martinus Nijhoff Publishers.

*SHL (1989). Validation Review. Surrey, UK: Saville, & Holdsworth Ltd.

*SHL (1996). Validation Review II. Surrey, UK: Saville & Holdsworth, Ltd.

*Smith, M.C. (1976). A comparison of the trainability assessments and other tests for predicting the practical performance of dental students. International Review of Applied Psychology, 25-26, 125-130.

*Smith-Voskuyl, O.F. (1980). De constructie van een testbatterij voor de selectie van leerling-verplegenden aan de `De Tjonsgerschans´. [The construction of a test battery for the selection of student nurses at De Tjonsgerschans]. Unpublished final report.

*Sneath, F., Thakor, M., & Medjuck, B. (1976). Testing of People at Work. London, UK: Institute of Personnel Management.

Spector, P.E., Cooper, C.L., Sparks, K., Bernin, P., Dewe, P., Lu, L., Miller, K., Moraes, L.R., O’Driscoll, M., Pagon, M., Pitariu, H., Poelmans, S., Radhakrishnan, P., Russinova, V., Salamatov, V., Salgado, J.F., Sanchez, J.I., Shima, S., Siu, O.L., Stora, J.B., Teichmann, M., Theorell, T., Vlerick, P., Westman, M., Widerszal-Bazyl, M., Wong, P., & Yu, S. (2001). An International Study of the Psychometric Properties of the Hofstede Values Survey Module 1994: A comparison of individual and country/province level results. Applied Psychology: An International Review, 50, 269-281.

*Spengler, G. (1971). Die praxis der auswahl von führungskräften in der Glanzstoff AG. [The practice in executive selection in Glanzstoff A.G.]. Proceedings of the 17th Congress of the International Association of Applied Psychology. (R. Piret, Editor). Belgique, 25-30 July.

*Spielman, W. (1923). Vocational tests for dressmakers’ apprentices. Journal of the National Institute of Industrial Psychology, 1, 277-282.

*Spielman, W. (1924). The vocational selection of weavers. Journal of the National Institute of Industrial Psychology, 2, 256-261.

*Spielman, W. (1924). Vocational tests for selecting packers and pipers. Journal of the National Institute of Industrial Psychology, 2, 365-373.

*Srinivasan, V., & Weinstein, A. G. (1973). Effects of curtailment on an admissions model for a graduate management program. Journal of Applied Psychology, 58 (3), 339-346.

*Stanbridge, R.H. (1936). The occupational selection of aircraft apprentices of the Royal Air Force. The Lancet, 230, 1426-1430.

*Starren, A.M.L. (1996). De selectie van instroomkandidaten binnen het HR-beleid van een onderneming in de informatie Technologie. Een evaluatie onderzoek. [Applicant selection and the HR-policy of an IT company: an evaluation study]. Paper developed for the major Work and Organizational Psychology, University of Amsterdam, The Netherlands.

*Stevenson, M. (1942). Summaries of researches reported in degree theses. British Journal of Psychology, 12, 182.

*Stratton, G.M., McComas, H.C., Coover, J. E., & Bagby, E. (1920). Psychological tests for selecting aviators. Journal of Experimental Psychology, 3, (6), 405-423.

*Tagg, M. (1924). Vocational tests in the engineering trade. Journal of the National Institute of Industrial Psychology, 2, 313-323.

*te Nijenhuis, J. (1997). Comparability of test scores for immigrants and majority group members in The Netherlands. Unpublished doctoral dissertation. University of Amsterdam.

Te Nijenhuis, J. & van der Flier, H. (1997). Comparability of GATB scores for immigrants and majority group members: Some Dutch findings. Journal of Applied Psychology, 82, 675-687.

*Thurstone, L.L. (1976). Manual del test PMA. [Manual for the PMA test]. Madrid: TEA.

*Timpany, N. (1947). Assessment for foremanship. British Journal of Psychology, 38, 23–28.

*Trost, G. & Kirchenkamp, T. (1993). Predictive validity of cognitive and noncognitive variables with respect to choice of occupation and job success. In H. Schuler, J.L. Farr, & M. Smith (Eds.), Personnel selection and assessment (pp. 303-314). Hillsdale, NJ: Erlbaum.

*van Amstel, B. & Roe, R.A. (1974). Verslag van twee studies m.b.t. de part-time opleiding kinderbescherming B aan de Sociale Academie te Twente. [Report on two studies of the part-time major `Kinderbescherming B´ of the Social Academy of Twente]. Unpublished report. University of Amsterdam.

*van der Maesen de Sombreff, P.E.A.M. & Zaal, J.N. (1986). Incremental utility of personality questionnaires in selection by the Dutch government. Unpublished paper.

*van Leest, P.F. & Olsder, H.E. (1991). Psychologisch selectieonderzoek en praktijkbeoordelingen van agenten bij de gemeentepolitie Amsterdam. [Psychological selection assessment and practical performance ratings of officers of the Amsterdam municipal police]. Research report. Amsterdam: RPD Advies.

*Van Lierde, A.M. (1960). Essai de validation d’un test de surveillance sur une population des chauffeurs d’autobus. [Validation of a vigilance test in a population of bus drivers]. Bulletin du C.E.R.P., 10, 193-205.

Vineberg, R. & Joyner, J.N. (1989). Evaluation of individual enlisted performance. In Wiskoff, M.F. & G.M. Rampton (Eds). Military personnel measurement. Testing, assignment, evaluation (pp. 169-200). New York: Praeger.

Viswesvaran, C. (2002). Absenteeism and measures of job performance: A meta-analysis. International Journal of Selection and Assessment, 10, 12-17.

Viswesvaran, C. & Ones, D.S. (2000). Perspectives on models of job performance. International Journal of Selection and Assessment, 8, 216-226.

Viswesvaran, C. & Ones, D.S. (2002). Agreements and disagreements on the role of general mental ability (GMA) in Industrial, Work, and Organizational Psychology. Human Performance, 15, 211-231.

Viswesvaran, C., Ones, D.S., & Schmidt, F.L. (1996). Comparative analysis of the reliability of job performance ratings. Journal of Applied Psychology, 81, 557-574.

Verive, J.M. & McDaniel, M. (1996). Short-term memory tests in personnel selection: low adverse impact and high validity. Intelligence, 23, 15-32.

*Vernon, P.E. (1950). The validation of Civil Service selection board procedures. Occupational Psychology, 24, 75-95.

*Vlug, T. (1988). De selectie van aspirant verkeersvliegers. Een evaluatie-onderzoek. Werkstuk voor de afstudeerrichting Arbeids- en Organisatiepsychologie. [The selection of applicant airline pilots: An evaluation study]. Paper developed for the major Work and Organizational Psychology, University of Amsterdam, The Netherlands.

*Vogelaar, F.J. (1972). Een validatie onderzoek van een selectieprocedure voor politiefunctionarissen. [A validation study of the selection procedure for policemen]. Technical report, University of Amsterdam, The Netherlands.

*Vogue, F. & Darmon, E.G. (1966). Validation d’examens psychotechniques de conducteurs de chariots automoteurs de manutention. [Validity of psychotechnic examinations of truck drivers]. Bulletin du CERP, 15, 183-187.

*Weiss, R.H. (1980). Grundintelligenztest skala 3 – CFT 3. [Group intelligence test scale 3 – CFT 3]. Braunschweig: Georg Westermann.

*Wickham, M. (1949). Follow-up of personnel selection in the A.T.S. Occupational Psychology, 23, 153-168.

*Wittersheim, J.C. & Schlegel, J. (1970). Essai de validation d’une batterie de securité. Le Travail Humain, 33, 281-294.

*Yela, M. (1956). Selección profesional de especialistas mecánicos. [Professional selection of mechanics]. Revista de Psicología General y Aplicada, 13, 717-719.

*Yela, M. (1968). Manual del Test de Rotación de figuras macizas. [Manual for the Solid Figures Rotation Test]. Madrid: Tea.

Table 1. Examples of Tests Contributing Validity Coefficients in each Ability Category.

______________________________________________________________________

Ability Test

______________________________________________________________________

GMA Definition: A general capacity of an individual consciously to adjust his thinking to new requirements; it is general mental adaptability to new problems and conditions of life.

Examples: DAT (USA), GATB (USA), T2 (UK), ASDIC (UK), IST-70 (D), WIT (D), GVK (UK), PMA (USA), AMPE (E), Matrix (UK), Factor G (UK), Otis (USA), Alpha Test (USA), Intelligence Logique (F), CERP-14 (F), Domino (UK), NIIP33 (UK)

Verbal Definition: Ability to understand the meaning of words and to use them effectively; ability to comprehend language, to understand relationships between words, and to understand the meaning of whole sentences and paragraphs.

Examples: IST-70 WA (D), DAT-VR (USA), GATB-Vocabulary (USA), Reading Comprehension Test (UK), Mill-Hill Vocabulary Test (UK)

Numerical Definition: Ability to understand numerical relations and to use numbers effectively; ability to comprehend quantitative material.

Examples: IST-70 RA (D), DAT-NR (USA), ATS Arithmetic (UK), Mathematical Test (UK), Arithmetic Test (UK), Arithmetic Reasoning (UK), Vernon's Arithmetic Test (UK)

Spatial/Mechanical Definition: Ability to understand and manage objects in two-dimensional and three-dimensional space; ability to comprehend relations between objects.

Spatial Tests Examples: IST-70 FA (D), Coordinate Reading (USA), Dial Reading (USA), WAIS Blocks (USA), DAT-SR (USA), Squares (UK), Spatial Intelligence (UK), Figures Rotation (E), Judgment of Distance (USA), Visualization Test (UK)

Mechanical Tests Examples: P. Rennes' Test of Mechanics (F), Bennett's Test (USA), DAT-MR (USA), TTB-MT4 (UK), Test of Lievers (F), APU Mechanical Comprehension (UK)

Perceptual Definition: Ability to perceive stimuli quickly and to give fast and accurate responses.

Perceptual Tests Examples: DAT-CSA (USA), Toulouse-Pieron Test (F), Dichotic Attention Test (Israel), Stroop Test (USA), Cancelation Test (F), Caras (E), Selective Attention Test (F), Diffused Attention Test (F), Forster-Germain Test (E), Perceptotaquimetro (E), Instrument Reading (USA), D2 (D)

Memory Definition: Ability to remember information presented in different perceptual modalities (e.g. visual, auditory).

Examples: IST-70 ME (D), Visual Memory Test (UK), Memory of Words (F), Topographic Memory Test (F), Associative Memory Test (F), Recognition Memory Test (USA)

______________________________________________________________________

Note. D= Germany; E=Spain; F=France; USA= United States of America; UK= United Kingdom.

Table 2. Empirical Distributions of Test-Retest Reliability of General Mental Ability

and Specific Cognitive Abilities in European Studies of Criterion Validity.

______________________________________________________________________

Ability K rxx SDrxx Max Min Time interval (weeks)

______________________________________________________________________

GMA (+Battery) 31 .83 .09 .95 .65 24

Verbal 15 .83 .15 .97 .50 25

Numerical 22 .85 .10 .95 .64 24

Spatial 13 .77 .10 .88 .60 24

Perceptual 5 .67 .18 .90 .52 37

Mechanical 14 .77 .07 .88 .64 24

Spatial+Mechanical 27 .77 .08 .88 .60 24

Memory 1 .64 -- -- -- 78

______________________________________________________________________

Table 3. Empirical Distributions of Criterion Reliability in European Studies of Criterion

Validity for General Mental Ability and Cognitive Abilities.

______________________________________________________________________

Criterion K N ryy SDryy

______________________________________________________________________

Job Performance Ratings 19 1,900 .52 .19

Training Success 15 2,897 .56 .09

______________________________________________________________________

Table 4. Empirical Distributions of Range Restriction Ratio for General Mental Ability

and Cognitive Abilities (u = SDsample/SDpopulation).

______________________________________________________________________

Type K N uw SDw

______________________________________________________________________

Ability-JPR Studies 20 1,795 .62 .25

Ability-Training Studies 12 2,717 .67 .14

______________________________________________________________________

Note. JPR = Job Performance Ratings
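The artifact distributions in Tables 2-4 are the inputs to the operational validities reported in Tables 5 and 6. As a rough illustration only (the article's exact artifact-distribution algorithm is not reproduced here, and the function name and order of corrections are assumptions), the sketch below corrects an observed mean validity for direct range restriction (Thorndike's Case II) and then for criterion unreliability, using the GMA job-performance values from Tables 3-5:

```python
import math

def operational_validity(r_obs, u, ryy):
    """Illustrative Hunter-Schmidt-style correction of an observed mean
    validity. Predictor unreliability is deliberately left uncorrected,
    which is what makes the result an 'operational' validity."""
    # Thorndike Case II correction for direct range restriction
    r_unrestricted = (r_obs / u) / math.sqrt(1 + r_obs**2 * (1 / u**2 - 1))
    # Disattenuate for unreliability in the criterion only
    return r_unrestricted / math.sqrt(ryy)

# GMA predicting job performance ratings:
# r = .29 (Table 5), u = .62 (Table 4), ryy = .52 (Table 3)
rho = operational_validity(0.29, 0.62, 0.52)
print(round(rho, 2))  # close to the .62 reported for GMA in Table 5
```

The sketch recovers the tabled value to within about .01; the published figure comes from the full artifact-distribution procedure, so a small discrepancy is expected.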

Table 5. Meta-analysis of General Mental Ability and other Cognitive Ability Tests for Predicting Job Performance Ratings.

_______________________________________________________________________________________________________________________________

Source K N r S2r S2e SDr Rho SDrho %VE 90%CV LRHO NSD LCV

_______________________________________________________________________________________________________________________________

GMA 93 9,554 .29 .031 .008 .176 .62 .19 75 .37 .56 .25 .24

Verbal 44 4,781 .16 .026 .009 .161 .35 .24 53 .04 .32 .25 .00

Numerical 48 5,241 .24 .014 .008 .118 .52 .00 100 .52 .47 .15 .28

Spatial-Mechanical 40 3,750 .23 .038 .010 .195 .51 .29 52 .13 .46 .31 .06

Perceptual 38 3,798 .24 .028 .009 .167 .52 .19 73 .28 .47 .23 .17

Memory 14 946 .26 .017 .013 .130 .56 .00 100 .56 .51 .16 .30

______________________________________________________________________________________________________________________________

Note. K=number of studies; N=total sample; r=observed mean validity; S2r=observed variance; S2e=sampling error variance; SDr= observed standard deviation; Rho=operational validity; SDrho=standard deviation of the operational validity; %VE = variance accounted for by artifactual errors;

90%CV=credibility value; LRHO= lowest (hypothetical) rho; NSD= hypothetical standard deviation of LRHO; LCV= lowest (hypothetical) credibility value.
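Two quantities in the note above can be reconstructed from the tabled values under the standard Hunter-Schmidt definitions. The sketch below is an illustrative reconstruction (the function names are mine, and this is not the code behind the published analyses), checked against the GMA row of Table 5:

```python
def sampling_error_variance(r_mean, n_total, k):
    """S2e: expected variance of observed correlations due to sampling
    error alone, based on the average sample size per study."""
    n_bar = n_total / k  # average N per study
    return (1 - r_mean**2) ** 2 / (n_bar - 1)

def credibility_value_90(rho, sd_rho):
    """90%CV: lower bound of the one-tailed 90% credibility interval."""
    return rho - 1.28 * sd_rho

# GMA row of Table 5: K = 93, N = 9,554, r = .29, Rho = .62, SDrho = .19
print(round(sampling_error_variance(0.29, 9554, 93), 3))  # .008, as tabled
print(round(credibility_value_90(0.62, 0.19), 2))  # near the tabled .37
```

The %VE column cannot be recovered this simply, since it also credits variance to differences in reliability and range restriction across studies, not just sampling error.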

Table 6. Meta-analysis of General Mental Ability and Cognitive Ability Tests for Predicting Training Success.

_______________________________________________________________________________________________________________________________

Source K N r S2r S2e SDr Rho SDrho %VE 90%CV LRHO NSD LCV

_______________________________________________________________________________________________________________________________

GMA 97 16,065 .28 .020 .005 .141 .54 .19 47 .29 .49 .24 .19

Verbal 58 11,123 .23 .017 .005 .130 .44 .19 45 .20 .40 .22 .12

Numerical 58 10,860 .25 .017 .005 .130 .48 .18 46 .24 .44 .22 .15

Spatial-Mechanical 84 15,834 .20 .017 .015 .130 .40 .19 43 .16 .36 .21 .09

Perceptual 17 3,935 .13 .015 .004 .122 .25 .20 34 .00 .23 .20 -.03

Memory 15 3,323 .17 .017 .004 .130 .34 .20 35 .08 .31 .21 .03

_______________________________________________________________________________________________________________________________

Note. K=number of studies; N=total sample; r=observed mean validity; S2r=observed variance; S2e=sampling error variance; SDr= observed standard deviation; Rho=operational validity; SDrho=standard deviation of the operational validity; %VE = variance accounted for by artifactual errors;

90%CV=credibility value; LRHO= lowest (hypothetical) rho; NSD= hypothetical standard deviation of LRHO; LCV= lowest (hypothetical) credibility value.
