Thursday, December 10, 2015

A self-serving strawman argument

An interesting argument in "Why I worry experimental social science is headed in the wrong direction" by Chris Blattman.

The general issue is that in recent years, many of the softer social sciences, such as sociology and psychology, have been shown to adhere only loosely to the scientific method, and consequently most of the "results" from their research fail to replicate. In other words, the findings are not real. The social sciences are not the only fields that fall into this bad habit, but it is especially prevalent there. Other fields have related problems, such as a misplaced reliance on p-values as a threshold of validity.
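
To see why a bare p < 0.05 threshold can pollute a literature, consider a minimal simulation in Python (the sample sizes and study counts are illustrative, not drawn from any of the papers discussed): even when no effect exists anywhere, roughly one study in twenty will clear the bar by chance, and those are the ones that tend to get written up.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_studies = 1000     # hypothetical number of independent studies
n_per_group = 40     # small samples, as in many lab experiments

false_positives = 0
for _ in range(n_studies):
    # Both groups are drawn from the same distribution: the null is true.
    treated = rng.normal(0, 1, n_per_group)
    control = rng.normal(0, 1, n_per_group)
    _, p = stats.ttest_ind(treated, control)
    if p < 0.05:
        false_positives += 1

# About 5% of null studies will still look "significant"; selective
# publication of those hits fills the literature with findings that
# will not replicate.
print(f"{false_positives} of {n_studies} null studies had p < 0.05")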

The consequence is that much of what has been presented as true is actually cognitive pollution, with often negative consequences when this pollution taints policy setting.

Blattman's catalyst is a paper by Alwyn Young:
I follow R.A. Fisher’s The Design of Experiments, using randomization statistical inference to test the null hypothesis of no treatment effect in a comprehensive sample of 2003 regressions in 53 experimental papers drawn from the journals of the American Economic Association.

Randomization tests reduce the number of regression specifications with statistically significant treatment effects by 30 to 40 percent. An omnibus randomization test of overall experimental significance that incorporates all of the regressions in each paper finds that only 25 to 50 percent of experimental papers, depending upon the significance level and test, are able to reject the null of no treatment effect whatsoever. Bootstrap methods support and confirm these results.
In other words, Young finds that the reported findings in 50 to 75 percent of these experimental papers can be dismissed.
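
Young's method is Fisher-style randomization inference: under the sharp null of no treatment effect, the treatment labels are arbitrary, so one can reshuffle them many times and ask how often the shuffled estimate is at least as extreme as the observed one. A minimal sketch of the idea in Python, using a difference in means as the test statistic (the data and names here are illustrative, not Young's):

import numpy as np

rng = np.random.default_rng(0)

def randomization_test(treated, control, n_perm=10_000):
    """Randomization test of the sharp null of no treatment effect.

    Under the null, who got treated is arbitrary, so we re-randomize
    the labels and count how often the shuffled difference in means
    is at least as large as the observed one.
    """
    observed = treated.mean() - control.mean()
    pooled = np.concatenate([treated, control])
    n_treated = len(treated)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = pooled[:n_treated].mean() - pooled[n_treated:].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return count / n_perm

# Illustrative data with a modest true effect.
treated = rng.normal(0.3, 1, 50)
control = rng.normal(0.0, 1, 50)
print(f"randomization p-value: {randomization_test(treated, control):.3f}")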

But Blattman is concerned about the trend towards ever higher bars for accepting evidence.
Take experiments. Every year the technical bar gets raised. Some days my field feels like an arms race to make each experiment more thorough and technically impressive, with more and more attention to formal theories, structural models, pre-analysis plans, and (most recently) multiple hypothesis testing. The list goes on. In part we push because we want to do better work. Plus, how else to get published in the best places and earn the respect of your peers?

It seems to me that all of this is pushing social scientists to produce better quality experiments and more accurate answers. But it’s also raising the size and cost and time of any one experiment.

This should lead to fewer, better experiments. Good, right? I’m not sure. Fewer studies is a problem if you think that the generalizability of any one experiment is very small. What you want is many experiments in many places and people, which help triangulate an answer.
Blattman lists a dozen reasons why he is concerned about higher standards of evidence.
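
One of the bars Blattman mentions, multiple hypothesis testing, illustrates why the standards keep rising: a paper that tests many outcomes is nearly guaranteed a spurious hit unless each individual test is held to a stricter threshold. A back-of-the-envelope calculation (the counts are illustrative):

alpha = 0.05
for m in (1, 5, 20, 100):   # number of hypotheses tested in one paper
    # Probability of at least one false positive if all m nulls are true.
    familywise = 1 - (1 - alpha) ** m
    # A Bonferroni correction holds the familywise rate at alpha by
    # tightening the per-test threshold to alpha / m.
    print(f"m={m:3d}  P(>=1 false positive)={familywise:.2f}  "
          f"Bonferroni threshold={alpha / m:.4f}")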

My view is that Blattman is taking too binary an approach. He is arguing against a strawman: lots of small, poorly designed studies that might be suggestive versus a few big, well designed studies that can be relied upon.

I don't think anyone is actually arguing that. The innovation process certainly depends on seeing things in new ways and trying things that haven't been tried before. Because most new ideas have little prospect of being true or useful, you find ways to experiment cheaply. But once you have run 100 or 1,000 cheap experiments, you are going to find something that works. At that point, you have to test it at scale and in real-world circumstances. If it still works, then you convince the regulators that it is safe and the bankers that it is worthwhile. In other words, there is a continuum of testing from conjectural to robust.

The problem in the social sciences is that almost all of the testing is conjectural and very little is robust. Sure, it is cheap and easy to test a new idea on forty undergraduate psych majors in a week-long experiment. But that really doesn't tell you much. If the result affirms the idea, you have to move on to the next step and test it on a more diverse set of students, then on non-students, and so on.

Blattman is correct that if there is a fixed budget of "research resources", then including more robust experiments will reduce the time and resources spent on conjectural work. But if the social sciences invest in robust research, the results may prove sufficiently valuable that they warrant further investment in the whole chain of innovation, including more conjectural research.

Right now, Blattman's argument feels like special pleading to keep the life of a social science researcher easy and unaccountable. But in a world of limited resources, accountability is no bad thing. It allows us to focus on what makes a positive difference.
