My PhD was in educational psychology, and most of my classes occurred in the mid to late 90s. The paradigm wars were winding down, but there was still a noticeable split between hermeneutic social constructionists* and psychometricians. My nature is to want to synthesize, which often leads me to walk in two worlds. To what would I be drawn? A hermeneutic account of psychometrics, of course.
I was investigating dissertation topics around disability. The split here was conveyed as one between old psychometric ways of conceiving of disability and new socially constructed accounts. An advisor made a casual comment that my concerns seemed to be about validity, and it struck me as insightful. Yes! The problem was that existing measures were validated by psychometric models that did not account for the hermeneutics of identity construction or for the consequences of the resulting identities.
I started my investigation by reading Samuel Messick’s chapter on Validity in Educational Measurement (3rd ed.). What I read was Messick’s attempt to address hermeneutic aspects of measurement from a psychometric perspective. What is important in measuring is the meaning you derive from the data and the associated implications for action. First, there are only two ways to think about invalidity:
- Construct Under-representation – The construct you are interested in is larger than what your assessment is able to measure.
- Construct Irrelevance – You are measuring things that are irrelevant to the information you need in order to take action, which leads to either false positives or false negatives.
Messick would later write about six categories of validity concerns. I take these categories to be a framework for how to think about or find meaning in measurement. They are six different ways of looking for under-representation or irrelevance:
- Content – Is there evidence that the scope of the content is appropriate and representative of the construct?
- Substantive – Is there a theory for the processes and tasks being performed, and is there empirical support for the theory?
- Structural – Is there evidence that the assessment faithfully reproduces the tasks or processes in the contexts or natural settings to which you are trying to extrapolate?
- Generalization – Has the assessment been shown to apply to many different groups and contexts, and over time? While a lack of such evidence may not reduce validity in specific situations, it would indicate a need to look much more closely at the situation you’re in.
- External – Is there convergent or divergent criterion evidence?
- Consequential – Is there evidence that your actions are improved by the assessment and that it is fair and free of bias?
* Note – I have no interest in most philosophical discussions of the beliefs of social constructionists or realists. For me, social constructionism is mostly about the ways that things and people are thoroughly effected and affected by the pervasiveness of language and its accompanying hermeneutics. Not only is there no denial of reality, but the current trend is to highlight the embodied nature of our living even as it is totally inhabited by hermeneutics. I fall back on pragmatics, not because it is defensible, but because it is a way to go on. Most other discussions are about drawing boundaries that are just too fluid to nail down in a convincing manner.