Translating experience and empirical evidence into useful knowledge (knowledge which allows us to make reliable forecasts).
There are big problems with this approach. One obvious one is that it is often impossible or impractical to run the experiment. But even if we assume that I have done exactly this experiment, I still have the problem of measuring the causal effect of the intervention. In a complicated system, like shoe stores, I have to answer the question of how many pairs I would have sold in, say, the three months after changing my design to narrow toes if I had not made the change. I can't just assume that I would have sold the same number of wide-toed shoes that I did in the prior three months. For reasons well-known to you, and that I go through at length in the book, the best way to measure this in a complicated system is a randomized field trial (RFT) in which I randomly assign some stores to get the new shoes and others to keep selling the old shoes. In essence, random assignment allows me to roughly hold constant all of the "screwy" effects that you reference between the test and control groups.
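To make the contrast concrete, here is a minimal simulation sketch of the shoe-store case. Everything in it is an assumption made up for illustration: the function name simulate_rft, the store count, the baseline sales figures, the size of the true effect, and the "seasonal shift" that stands in for the screwy effects. It compares a naive before/after estimate with the test-versus-control difference from random assignment.

```python
import random
import statistics


def simulate_rft(n_stores=200, true_effect=5.0, seasonal_shift=-12.0, seed=0):
    """Compare a before/after estimate with a randomized test-vs-control estimate.

    All parameters are hypothetical illustration values, not real data.
    """
    rng = random.Random(seed)

    # Each store has its own typical sales level (pairs sold per period).
    baseline = [rng.gauss(100.0, 15.0) for _ in range(n_stores)]

    # Random assignment: shuffle the stores and give the first half the new design.
    order = list(range(n_stores))
    rng.shuffle(order)
    treated = set(order[: n_stores // 2])

    prior, after = [], []
    for i in range(n_stores):
        # Prior period: every store sells the old design.
        prior.append(baseline[i] + rng.gauss(0.0, 5.0))
        # Later period: a shock hits every store; only treated stores get the effect.
        effect = true_effect if i in treated else 0.0
        after.append(baseline[i] + seasonal_shift + effect + rng.gauss(0.0, 5.0))

    treated_after = [after[i] for i in range(n_stores) if i in treated]
    treated_prior = [prior[i] for i in range(n_stores) if i in treated]
    control_after = [after[i] for i in range(n_stores) if i not in treated]

    return {
        # Naive: assumes sales would otherwise have matched the prior period.
        "before_after": statistics.mean(treated_after) - statistics.mean(treated_prior),
        # Randomized: control stores absorb the same shock as test stores.
        "randomized": statistics.mean(treated_after) - statistics.mean(control_after),
    }


if __name__ == "__main__":
    print(simulate_rft())
```

In this toy setup the before/after number mixes the design effect with the shift that hit every store, while the randomized comparison lands near the assumed true effect, up to sampling noise. Note that the sketch says nothing about whether that estimate would carry over to other stores or later periods.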
But what many cheerleaders for randomized experiments gloss over is that even if I have executed a competent experiment, it is not obvious how I turn this result into a prediction rule for the future (the problem of generalization or external validity). Here's how I put this in an article a couple of years ago:
In medicine, for example, what we really know from a given clinical trial is that this particular list of patients who received this exact treatment delivered in these specific clinics on these dates by these doctors had these outcomes, as compared with a specific control group. But when we want to use the trial's results to guide future action, we must generalize them into a reliable predictive rule for as-yet-unseen situations. Even if the experiment was correctly executed, how do we know that our generalization is correct?
A physicist generally answers that question by assuming that predictive rules like the law of gravity apply everywhere, even in regions of the universe that have not been subject to experiments, and that gravity will not suddenly stop operating one second from now. No matter how many experiments we run, we can never escape the need for such assumptions. Even in classical therapeutic experiments, the assumption of uniform biological response is often a tolerable approximation that permits researchers to assert, say, that the polio vaccine that worked for a test population will also work for human beings beyond the test population.
But as we climb a ladder of phenomenological complexity from physics to biology to sociology, this problem of generalization becomes more severe. As I put it in Uncontrolled:
We can run a clinical trial in Norfolk, Virginia, and conclude with tolerable reliability that "Vaccine X prevents disease Y." We can't conclude that if literacy program X works in Norfolk, then it will work everywhere. The real predictive rule is usually closer to something like "Literacy program X is effective for children in urban areas, and who have the following range of incomes and prior test scores, when the following alternatives are not available in the school district, and the teachers have the following qualifications, and overall economic conditions in the district are within the following range." And by the way, even this predictive rule stops working ten years from now, when different background conditions obtain in the society.
We must have some model that generalizes. What we really need to do is to build a distribution of results of "experiments + model" in predicting the results of future experiments.