Thursday, November 9, 2017

Fewer than 10% of all proposed policy solutions actually have the beneficial outcome anticipated

In the last decade there has been swelling concern about the quality of investigative and experimental science being conducted, a concern sometimes associated with the politicization of science (e.g., anthropogenic global warming) and sometimes with ideology (e.g., implicit association tests). Psychology and sociology have been especially hard hit, with the great majority of popular findings failing to replicate. But the hard sciences have surprisingly high failure rates as well.

Statistical failures are the biggest issue: small-sample studies, non-random population samples, cherry-picked data, p-hacking, etc.

A number of approaches have been undertaken to address the crisis. Statistical rigor is more in demand. Greater attention has been paid to the discipline of conducting replication tests. Efforts have been made to tackle publication bias. Another initiative has been to require preregistration of clinical trials. The general idea is that preregistration requires greater diligence and planning up front, reducing accidental oversights and errors. In addition, the increased transparency of preregistration, and particularly the availability of results, is expected to reduce conscious and unconscious p-hacking and cherry-picking.

Chris Blattman shares interesting information in a humorous fashion in "Preregistration of clinical trials causes medicines to stop working!":
Something must be done to combat this public health hazard. In 2000, the National Heart, Lung, and Blood Institute (NHLBI) began requiring that researchers publicly register their research analysis plan before starting their clinical trials. From a new PLOS paper:
We identified all large NHLBI supported RCTs between 1970 and 2012 evaluating drugs or dietary supplements for the treatment or prevention of cardiovascular disease.

17 of 30 studies (57%) published prior to 2000 showed a significant benefit of intervention on the primary outcome in comparison to only 2 among the 25 (8%) trials published after 2000 (χ² = 12.2, df = 1, p = 0.0005). There has been no change in the proportion of trials that compared treatment to placebo versus active comparator. Industry co-sponsorship was unrelated to the probability of reporting a significant benefit. Pre-registration in ClinicalTrials.gov was strongly associated with the trend toward null findings.
This is illustrated by a figure in the paper showing the shift toward null results after 2000.
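
As a sanity check, the reported statistic can be reproduced from the quoted counts. The sketch below (my own, using scipy; not code from the post or the paper) builds the 2×2 table of significant versus null trials before and after 2000 and recovers χ² = 12.2 once the default Yates continuity correction is applied.

    # Reproducing the quoted chi-square from the 17/30 vs. 2/25 split.
    from scipy.stats import chi2_contingency

    # Rows: pre-2000 trials, post-2000 trials.
    # Columns: significant benefit reported, null result.
    table = [[17, 13],
             [2, 23]]

    # correction=True (the default for 2x2 tables) applies the Yates
    # continuity correction, which is what matches the paper's figure.
    chi2, p, dof, expected = chi2_contingency(table)
    print(f"chi2 = {chi2:.1f}, df = {dof}, p = {p:.4f}")
    # -> chi2 = 12.2, df = 1, p = 0.0005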

Of course, what this actually shows is that with greater clinical trial planning and rigor, and with greater transparency of results, there are fewer instances of unconscious p-hacking and cherry-picking. Before 2000, without those disciplines, many more trials showed what we might now interpret as false positives. Since 2000, scientific rigor has improved, and with that improved rigor we discover that there are fewer real positive results than we might once have believed.
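
To see how cherry-picking alone can manufacture the pre-2000 pattern, here is a toy simulation (my own illustration with made-up parameters, not an analysis from the paper). Every simulated trial has no true effect; reporting a single prespecified outcome yields the nominal ~5% false-positive rate, while reporting the best-looking of ten measured outcomes inflates it to roughly 40%.

    # Toy simulation: false positives under a true null effect.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    n_trials, n_outcomes, n_subjects = 2000, 10, 50

    prereg_hits, hacked_hits = 0, 0
    for _ in range(n_trials):
        # Treatment and control are drawn from the same distribution,
        # so any "significant" result here is a false positive.
        pvals = [ttest_ind(rng.normal(size=n_subjects),
                           rng.normal(size=n_subjects)).pvalue
                 for _ in range(n_outcomes)]
        prereg_hits += pvals[0] < 0.05    # one prespecified outcome
        hacked_hits += min(pvals) < 0.05  # best-looking of ten outcomes

    print(f"preregistered: {prereg_hits / n_trials:.1%} significant")
    print(f"cherry-picked: {hacked_hits / n_trials:.1%} significant")
    # Roughly 5% vs. 40% under these assumptions (1 - 0.95**10 is ~0.40).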

A tentative heuristic from these results could be that only 8% of all proposed policy solutions actually have the beneficial outcome anticipated.

Pick a hundred proposed human-system policies intended to address various human problems - better nutrition for children in school, more play time in school, pre-K education, solar and wind energy generation, higher minimum wages, tighter gun control, lower personal taxes, etc. All of these are well intended. All of them are presented as real solutions to more or less agreed-upon problems. If your list has a hundred policies addressing real problems, and only 8% of them are effective, how do you go about identifying the eight that are most likely to work and the ninety-two that should be discarded?
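
One way to quantify the difficulty is a base-rate calculation (my own sketch with assumed power and error rates, not figures from the post): even if every policy is tested with a well-run trial, an 8% base rate means a positive result is only modestly more likely than not to reflect a real effect.

    # Base-rate sketch: how informative is one positive trial result
    # when only 8 of 100 candidate policies truly work?
    base_rate = 0.08  # share of policies that truly work (the heuristic)
    power = 0.80      # assumed chance a trial detects a working policy
    alpha = 0.05      # assumed false-positive rate for a non-working policy

    true_pos = base_rate * power         # ~6.4 of 100 policies
    false_pos = (1 - base_rate) * alpha  # ~4.6 of 100 policies
    ppv = true_pos / (true_pos + false_pos)
    print(f"chance a positive trial reflects a real effect: {ppv:.0%}")
    # -> about 58% under these assumptions

So under these assumed numbers, even a policy that passes a rigorous trial is real only about six times in ten, which is why a single study rarely settles the question.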

An interesting exercise.
