Validity: the Overlooked Issue in Big Data

Validity is an important but often overlooked issue whenever measurement and data analysis are involved, and this includes Big Data applications. In pieces such as Steve Lohr’s NY Times article on the potential pitfalls of Big Data (Do the models make sense? Are decision makers trained and using data appropriately?) or Nassim Taleb’s article, Beware the Big Errors of Big Data, validity concerns are paramount, but the nature of validity is not addressed.

Validity is an overall evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions on the basis of test scores or other modes of assessment (Messick, S., 1995, Validity of Psychological Assessments, p.741).

That is to say, when we look at data analytics, are the results justifiable? Just having data doesn’t make the conclusions right. Big Wrong Data can be a dangerous thing.

As big data becomes a larger part of our everyday life, validity must also become a critical component of analysis, especially if big data is to find success beyond the current fashion. As Samuel Messick (ibid) said:

. . . validity, reliability, comparability, and fairness are not just measurement principles, they are social values that have meaning and force outside of measurement whenever evaluative judgments and decisions are made (Messick, ibid).

This importance is not reflected in the scant treatment that validity often receives in data and measurement training or in most discussions of big data. The modern view of validity (after Samuel Messick) is about more than judging the rightness of one’s measures; it is also about the transparency of the assumptions and connections behind the measurement program and processes. I’ll propose the following (non-exhaustive) list as a place to begin when judging the use of data and measurement:

  • Content Validity – Data and measurement are used to answer questions, and the first step in quality measurement is getting the question right and clear. Measurement will not help if we’re asking the wrong questions or making the wrong inferences from ambiguous questions. When questions are clear, you can begin linking them to appropriate construct measures.
  • Structural Fidelity – Additional information should show how assessment tasks and data models relate to the underlying behavioral processes and the contexts to which they can be said to apply. Understand the processes that underlie the measures.
  • Criterion Validity – This examines convergent and discriminant empirical evidence in correlations with other pertinent and well understood criterion measures. Do your results make sense in light of previous measures?
  • Consequential Validity – Of particular importance are the observed consequences of the decisions that are being made. As Lohr’s article points out, our data-based operations do not just portray the world, but play an active role in shaping the empirical world around us. It’s important to compare results with intentions.
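
To make the criterion validity check in the list above a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the scores are made up, and the names new_measure, established_criterion and unrelated_measure are hypothetical. It only shows the basic idea that a new measure should correlate strongly with a well understood criterion it is supposed to track (convergent evidence) and weakly with a measure it is supposed to be independent of (discriminant evidence).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical, made-up scores for eight people (illustration only).
new_measure = [12, 15, 11, 18, 20, 14, 17, 13]
# An established criterion the new measure is expected to track.
established_criterion = [30, 36, 29, 41, 45, 33, 40, 31]
# A measure the new one is expected to be largely independent of.
unrelated_measure = [5, 3, 4, 4, 5, 3, 4, 4]

convergent_r = pearson(new_measure, established_criterion)
discriminant_r = pearson(new_measure, unrelated_measure)

print(f"convergent r   = {convergent_r:.2f}")
print(f"discriminant r = {discriminant_r:.2f}")
```

A high convergent correlation together with a low discriminant one is only one piece of evidence. With real data you would also want an adequate sample size and some thought about why the criterion measures themselves deserve trust.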

Good decisions are based on data and evidence, but they inevitably rely on many implicit assumptions. Validity is about making these assumptions explicit and justifiable in the decision-making process.

“The principles of validity apply not just to interpretive and action inferences derived from test scores as ordinarily conceived, but also to inferences based on any means of observing or documenting consistent behaviors or attributes. . . . Hence, the principles of validity apply to all assessments . . .” (Messick, ibid, p.741).

Reference – Messick, S. (1995). Validity of Psychological Assessments: Validation of Inferences From Persons’ Responses and Performances as Scientific Inquiry Into Score Meaning. American Psychologist, 50, 741-749.