Why Be Concerned with Validity: My Personal Experience

My PhD was in educational psychology, and most of my classes took place in the mid to late 90s.  The paradigm wars were winding down, but there was still a noticeable split between hermeneutic social constructionists* and psychometricians.  My nature is to want to synthesize, which often leads me to walk in two worlds.  To what would I be drawn?  A hermeneutic account of psychometrics, of course.

I was investigating dissertation topics around disability.  The split here was conveyed as one between old psychometric ways of conceiving of disability and new socially constructed accounts.  An advisor casually commented that my concerns seemed to be about validity, and the comment struck me as insightful.  Yes!  The problem was that existing measures were validated by psychometric models that did not account for the hermeneutics of identity construction or for the consequences of the resulting identities.

I started my investigation by reading Samuel Messick’s chapter on validity in Educational Measurement (3rd ed.).  What I read was Messick’s attempt to address hermeneutic aspects of measurement from a psychometric perspective.  What is important in measuring is the meaning you derive from the data and the associated implications for action.  First, there are only two ways to think about invalidity:

  • Construct Under-representation: the construct you are interested in is larger than what your assessment is able to measure.
  • Construct Irrelevance: you are measuring things that are irrelevant to the information you need in order to take action, which leads to either false positives or false negatives.

Messick would later write about six categories of validity concerns.  I take these categories to be a framework for how to think about or find meaning in measurement.  They are six different ways of looking for under-representation or irrelevance:

  1. Content – Is there evidence that the scope of the content is appropriate and representative of the construct?
  2. Substantive – Is there a theory for the processes and tasks being performed, and is there empirical support for that theory?
  3. Structural – Is there evidence that the assessment faithfully reproduces the tasks or processes in the contexts or natural settings to which you are trying to extrapolate?
  4. Generalization – Has the assessment been shown to apply to many different groups and contexts, and over time?  While a lack of such evidence may not reduce validity in a specific situation, it does indicate a need to look much more closely at the situation you're in.
  5. External – Is there convergent or divergent criterion evidence?
  6. Consequential – Is there evidence that your actions are improved by the assessment and that it is fair and free of bias?

* Note – I have no interest in most philosophical discussions of the beliefs of social constructionists or realists.  For me, social constructionism (SC) is mostly about the ways that things and people are thoroughly effected and affected by the pervasiveness of language and its accompanying hermeneutics.  Not only is there no denial of reality, the current trend is to highlight the embodied nature of our living even as it is totally inhabited by hermeneutics.  I fall back on pragmatics, not because it is defensible, but because it is a way to go on.  Most other discussions are about drawing boundaries that are just too fluid to nail down in a convincing manner.

The Difficulty in Measuring

Measurement is, or should be, a concept at the center of most people’s practice.  It comes in many forms: “No Child Left Behind” in education, “evidence-based practice” in medicine and psychology, Six Sigma in manufacturing, the balanced scorecard in management, or performance improvement in human resources.  All of these programs have measurement at their process core, and the results of these processes begin with the quality of the measure and the ability to target measures to illuminate the intended purpose.  Yet much of the effort made to implement these programs focuses far more on the methodology that surrounds the measures than on the measures themselves.

Achieving quality measures and quality data is not that easy.  Understanding this begins with the idea of the measurement construct.

“In philosophy of science, a ‘construct’ is an ideal object (i.e., one whose existence depends on a subject’s mind), as opposed to ‘real objects’ (i.e., those whose existence is non-dependent on a subject’s mind)”

“Measurement is the process of assigning a number to an attribute (or phenomenon) according to a rule or set of rules” (Wikipedia.com).

Measurement assigns numbers to constructs: attributes that are ideal objects.  Measures do not create real objects.  Some of these objects are more problematic to define (like personality) than others (like temperature), but they can all be defined as constructs or ideal objects.

This leads to the importance of validity in measurement, which will be the next topic.