Friday, March 30, 2018

The answer, to a first order of magnitude, is none.

From Is it fair to say that most social programmes don’t work? by 80,000 Hours.

To save you the anxious anticipation: yes, it is fair to say that most social programs don't work.

The authors come across as fighting against the conclusion, though they do raise many excellent points about the difficulty of conducting randomized controlled trials (RCTs), the gold standard for testing whether an intervention works. To me, this is the most substantive part of the essay:
In 2015, the Arnold Foundation published a survey of the literature on programmes that had been tested with randomised controlled trials (RCTs) as part of a request for funding proposals. It found the following:
Education: Of the 90 interventions evaluated in RCTs commissioned by the Institute of Education Sciences (IES) since 2002, approximately 90% were found to have weak or no positive effects.

Employment/training: In Department of Labor-commissioned RCTs that have reported results since 1992, about 75% of tested interventions were found to have found weak or no positive effects.

Medicine: Reviews have found that 50-80% of positive results in initial (“phase II”) clinical studies are overturned in subsequent, more definitive RCTs (“phase III”).

Business: Of 13,000 RCTs of new products/strategies conducted by Google and Microsoft, 80-90% have reportedly found no significant effects.

The current pace of RCT testing is far too slow to build a meaningful number of proven interventions to address our major social problems. Of the vast diversity of ongoing and newly initiated program activities in federal, state, and local social spending, only a small fraction are ever evaluated in a credible way to see if they work. The federal government, for example, evaluates only 1-2 dozen such efforts each year in RCTs.
The failure rate ranges from 50% to 90%, with most results clustered around 80%. It is important to recall that RCTs are expensive to conduct, so it is likely that only the most promising interventions get tested. Also, cost is not taken into account (i.e., they are not testing whether the cost per outcome is reasonable). All they are checking is whether the intervention has a material positive effect. About 80% do not. If you added cost-benefit to the equation, I am guessing the 80% rises significantly, and if you were to run RCTs on all programs, not just the ones most likely to succeed, the failure rate would probably be higher still.
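To make the compounding concrete, here is a rough back-of-the-envelope sketch. The 80% baseline comes from the figures quoted above. The share of effective programs that are also affordable is purely my assumption for illustration (nobody has measured it here), and the function name is made up for this example.

```python
def failure_after_cost_test(failure_rate, share_cost_effective):
    """Failure rate once a program must both show an effect and be affordable.

    failure_rate: share of tested programs with weak or no positive effect.
    share_cost_effective: of the programs that do show an effect, the share
        whose cost per outcome is reasonable (an assumed, unmeasured number).
    """
    success = (1.0 - failure_rate) * share_cost_effective
    return 1.0 - success


# Baseline failure rate taken from the quoted survey (roughly 80%).
baseline = 0.80

# HYPOTHETICAL: assume half of the programs with a real effect are also affordable.
ASSUMED_SHARE_COST_EFFECTIVE = 0.5

adjusted = failure_after_cost_test(baseline, ASSUMED_SHARE_COST_EFFECTIVE)
print(f"effect only: {baseline:.0%} fail; effect plus cost: {adjusted:.0%} fail")
```

Under that assumed 50% share, the failure rate goes from 80% to 90%. A different assumption gives a different number, but any share below 100% pushes the failure rate up, which is the only point being made.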

Yes, it is hard to define what it means for a social program to have worked, but if you specify that working means achieving the stated goals at an affordable cost, then the answer, to a first order of magnitude, is none.
