Validity & Generalisability



VALIDITY, GENERALISABILITY AND RELIABILITY


VALIDITY (INTERNAL VALIDITY) When we ask whether a piece of research is valid or not we are essentially asking "is this true?" Answering this question will inevitably involve a degree of subjective judgement, but by managing the various threats to the validity of our research we can improve our chances of producing valid work. A study is internally valid if it describes the true state of affairs within its own setting.

GENERALISABILITY (EXTERNAL VALIDITY) A research finding may be entirely valid in one setting but not in another. Generalisability describes the extent to which research findings can be applied to settings other than that in which they were originally tested. A study is externally valid if it describes the true state of affairs outside its own setting.

WHICH IS MORE IMPORTANT? Both are important, but a study that is valid but not generalisable is at least useful for informing practice in the setting in which it was carried out, whereas there is no point trying to generalise the findings of an invalid study.

Validity (accuracy)

❑ 'An agreement between two efforts to measure the same thing with different methods' -- Campbell and Fisk (as cited in Hammersley, 1987)

❑ 'The measure that an instrument measures what it is supposed to' -- Black and Champion (1976, pp. 232-234)

❑ 'Accuracy' -- Lehner (1979, p. 130)

❑ 'Degree of approximation of 'reality' -- Johnston and Pennypacker (1980, pp. 190-191)

❑ 'Are we measuring what we think we are?' -- Kerlinger (1964, pp. 430, 444-445)

❑ 'to the extent that differences in scores yielded…reflect actual differences' -- Medley and Mitzel (as cited in Hammersley, 1987, p. 150)

In other words, are the means of measurement accurate? Are they measuring what they claim to measure? For example, if you design a questionnaire to examine girls’ attitudes to PE, does it accurately capture those attitudes?

THREATS TO VALIDITY There are broadly three reasons why findings may not be valid:

1) CHANCE The measurements we make while doing research are nearly always subject to random variation. Determining whether findings are due to chance is a key feature of statistical analysis. Check our statistics links to find out more about hypothesis testing and estimation. The best way to avoid error due to random variation is to ensure your sample size is adequate.
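To illustrate the idea of testing whether a difference is due to chance (this example is not from the original text; the data and the permutation-test approach are hypothetical), here is a minimal Python sketch. It draws two samples from the same population, so any observed difference between them is purely random variation, and then estimates a p-value by shuffling the group labels:

```python
import random
import statistics

random.seed(1)

def permutation_test(a, b, n_iter=5000):
    """Two-sample permutation test: how often does shuffling the group
    labels produce a mean difference at least as large as observed?"""
    observed = abs(statistics.mean(a) - statistics.mean(b))
    pooled = a + b
    count = 0
    for _ in range(n_iter):
        random.shuffle(pooled)
        perm_a, perm_b = pooled[:len(a)], pooled[len(a):]
        if abs(statistics.mean(perm_a) - statistics.mean(perm_b)) >= observed:
            count += 1
    return count / n_iter  # p-value: chance of a gap this big by luck alone

# Two samples drawn from the SAME population, so any gap is due to chance
group1 = [random.gauss(50, 10) for _ in range(30)]
group2 = [random.gauss(50, 10) for _ in range(30)]
p = permutation_test(group1, group2)
```

A large p-value here would correctly suggest the observed difference is compatible with chance; with a bigger sample size, random variation has less influence on the result.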

2) BIAS Whereas chance is caused by random variation, bias is caused by systematic variation. A systematic error in the way we select our subjects (the people we wish to study), measure our outcomes, or analyse our data will lead to results that are inaccurate. There are numerous types of bias that may affect a study; understanding how bias occurs is more important than remembering the names of the different types. The sample will be biased if it is not representative of the population, or if there is a consistent inaccuracy in measurement or method.

3) CONFOUNDING This is similar to bias, and the two are often confused. However, whereas bias involves error in the measurement of a variable, confounding involves error in the interpretation of what may be an accurate measurement. A classic example of confounding is to interpret the finding that people who carry matches are more likely to develop lung cancer as evidence of an association between carrying matches and lung cancer. Smoking is the confounding factor in this relationship: smokers are more likely to carry matches, and they are also more likely to develop lung cancer.
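The matches-and-lung-cancer example can be made concrete with a small simulation (the population and the probabilities below are invented for illustration). Smoking drives both match-carrying and cancer risk, so a crude comparison shows a large spurious association that shrinks once we stratify by the confounder:

```python
import random

random.seed(0)

# Hypothetical population: smoking raises BOTH the chance of carrying
# matches and the chance of lung cancer, creating a spurious association.
people = []
for _ in range(10000):
    smoker = random.random() < 0.3
    carries_matches = random.random() < (0.8 if smoker else 0.1)
    cancer = random.random() < (0.15 if smoker else 0.01)
    people.append((smoker, carries_matches, cancer))

def cancer_rate(subset):
    return sum(c for _, _, c in subset) / len(subset)

# Crude comparison (ignoring smoking): match carriers look far more at risk
carriers = [p for p in people if p[1]]
non_carriers = [p for p in people if not p[1]]
crude_gap = cancer_rate(carriers) - cancer_rate(non_carriers)

# Stratified comparison (within smokers only): carrying matches itself
# makes little difference once the confounder is held constant
smoker_carriers = [p for p in people if p[0] and p[1]]
smoker_non_carriers = [p for p in people if p[0] and not p[1]]
stratified_gap = cancer_rate(smoker_carriers) - cancer_rate(smoker_non_carriers)
```

Stratifying (or otherwise adjusting) by the confounding variable is the standard way to separate a genuine effect from a confounded one.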

RELIABILITY (REPLICABILITY)

Reliability is essentially about how much we can trust the measure used to give the same results if used again.

❑ 'An agreement between two efforts to measure the same thing with the same methods' -- Campbell and Fisk (as cited in Hammersley, 1987)

❑ 'Ability to measure consistently' -- Black and Champion (1976, pp. 232-234)

❑ 'Reproducibility of the measurements…stability' -- Lehner (1979, p. 130)

❑ 'Capacity to yield the same measurement…stability' -- Johnston and Pennypacker (1980, pp. 190-191)

❑ 'Accuracy or precision of a measuring instrument?' -- Kerlinger (1964, pp. 430, 444-445)

❑ 'To the extent that the average difference between two measures obtained in the same classroom is smaller than…in different classrooms' -- Medley and Mitzel (as cited in Hammersley, 1987)

Are the means of measurement transferable? Do they measure consistently? So in our example above of the questionnaire exploring girls' attitudes to PE, reliability would be the extent to which the questionnaire gave consistent results if used with different groups of girls.
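One common way to quantify this kind of reliability is a test-retest correlation: administer the same instrument twice and correlate the two sets of scores. The sketch below (an illustration, not from the original text; all scores are simulated) models each respondent as having a stable 'true' attitude score plus measurement noise on each administration:

```python
import random
import statistics

random.seed(2)

def pearson_r(x, y):
    """Pearson correlation: a common index of test-retest reliability."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (len(x) * statistics.pstdev(x) * statistics.pstdev(y))

# Each respondent has a stable 'true' attitude score; each administration
# of the questionnaire adds some random measurement noise.
true_scores = [random.gauss(60, 15) for _ in range(100)]
test1 = [t + random.gauss(0, 5) for t in true_scores]  # first administration
test2 = [t + random.gauss(0, 5) for t in true_scores]  # second administration

r = pearson_r(test1, test2)  # high r suggests a consistent (reliable) instrument
```

The noisier the instrument relative to the true differences between respondents, the lower this correlation falls, which is exactly the intuition behind 'ability to measure consistently'.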

Critical Perspectives on the Concept of Validity

Because validity is concerned with TRUTH, an understanding of the nature of ‘truth’ is central to any theorisation of ‘validity’. This relates to whether your beliefs about ‘truth’ are positivist or interpretivist.

Varying epistemologies and methodologies generate varying notions of validity. Hammersley (1987, p. 69) defines validity as: "An account is valid or true if it represents accurately those features of the phenomena that it is intended to describe, explain or theorise." Although this would seem to be an all-encompassing and reasonable description, many other definitions fail to envisage such a 'realist approach' (Denzin & Lincoln, 1998, p. 282). The fact that there are so many possible definitions and replacement terms for 'validity' suggests that it is a concept entirely relative to the person and belief system from which it stems.

One of the most recurring features in critical discussions of 'validity' is the combination of 'validity' with the term 'reliability'. Yet, the definitions for 'reliability' are as varied and as complex as those for 'validity'.

Traditional criteria for “validity” arise out of the quantitative (positivist) paradigm, in which researcher involvement is seen to reduce validity. Qualitative (post-positivist) research rejects universal truth and concerns itself with relative truth: truth is negotiated, and from this perspective it is denying researcher involvement that reduces validity.

Interpretive researchers often think of reliability, validity and generalisability rather differently, often preferring to refer to the trustworthiness of the research. Questions that interpretivist researchers might ask about validity include:

❑ Can data “speak for itself”?

❑ What is a “valid” interpretation of data?

❑ How is the interpretation made?

❑ Whose interpretations are the most valid?

❑ Valid for whom?

❑ Are all interpretations equally (in)valid?

❑ What is the role of consensus?

❑ Is interpretive research inherently relativistic?

❑ Is “understanding” more important than “validity”?

Generalisability is being replaced by other approaches which are dynamic and iterative rather than static.

These approaches are iterative, process focussed, interventionist, collaborative, multileveled, utility oriented, and theory driven.

They abandon the presumption of generalisability as a desirable criterion, acknowledging that generalisability is a historically specific phenomenon: invented under particular circumstances, it became indispensable to educational research despite many rational and logical arguments against it.
