Wednesday, December 7, 2022

Be more humble in how much confidence we bring to our analyses.

From Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results by R. Silberzahn, E. L. Uhlmann, B. A. Nosek, et al.  From the Abstract.  Emphasis added.

Twenty-nine teams involving 61 analysts used the same data set to address the same research question: whether soccer referees are more likely to give red cards to dark-skin-toned players than to light-skin-toned players. Analytic approaches varied widely across the teams, and the estimated effect sizes ranged from 0.89 to 2.93 (Mdn = 1.31) in odds-ratio units. Twenty teams (69%) found a statistically significant positive effect, and 9 teams (31%) did not observe a significant relationship. Overall, the 29 different analyses used 21 unique combinations of covariates. Neither analysts’ prior beliefs about the effect of interest nor their level of expertise readily explained the variation in the outcomes of the analyses. Peer ratings of the quality of the analyses also did not account for the variability. These findings suggest that significant variation in the results of analyses of complex data may be difficult to avoid, even by experts with honest intentions. Crowdsourcing data analysis, a strategy in which numerous research teams are recruited to simultaneously investigate the same research question, makes transparent how defensible, yet subjective, analytic choices influence research results.

From Choices, oh my, the choices by John Mandrola, commenting on this paper (and one other).  The subheading is: "I used to think *the* data yielded *the* result. This paper challenged that notion."

Here is the point.

Let’s say you read a study that finds a statistically significant association. Let’s then say the odds ratio is 1.20, which means the exposure of something may increase the odds of an effect by 20%.

The 95% confidence intervals are 1.02-1.40, roughly meaning that had the experiment been run an infinite number of times, the true result would fall within a 2% increase or 40% increase.

This is a statistically significant result. The researchers will make positive conclusions. Media may cover it as a positive study, potentially a breakthrough. Public or medical opinion may change.

Yet this result came from the researchers’ chosen analytic method. What if they had chosen a different way to approach the analysis, like the 29 research teams in Nosek’s study?

There might be different results…and different conclusions…and different policies.
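A minimal sketch of that point, in Python with synthetic data. Nothing below comes from the papers themselves; the effect sizes, covariates, and the logistic regressions via statsmodels are all illustrative assumptions. The same simulated dataset, analyzed under three defensible covariate choices, yields different odds ratios and confidence intervals for the same exposure.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Purely synthetic data: a binary exposure, two plausible covariates,
# and an outcome whose true log-odds depend on all three.
n = 5000
exposure = rng.binomial(1, 0.5, n)
cov_a = rng.normal(size=n)
cov_b = 0.6 * exposure + rng.normal(size=n)   # correlated with the exposure
logit = -1.0 + 0.18 * exposure + 0.5 * cov_a + 0.4 * cov_b
outcome = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Three defensible "analytic choices": adjust for nothing, for cov_a, or for both.
specs = {
    "unadjusted": [exposure],
    "adjust for cov_a": [exposure, cov_a],
    "adjust for cov_a and cov_b": [exposure, cov_a, cov_b],
}

for name, cols in specs.items():
    X = sm.add_constant(np.column_stack(cols))
    fit = sm.Logit(outcome, X).fit(disp=0)
    or_est = np.exp(fit.params[1])          # odds ratio for the exposure
    lo, hi = np.exp(fit.conf_int()[1])      # 95% CI, also in odds-ratio units
    print(f"{name:28s} OR = {or_est:.2f}  (95% CI {lo:.2f}-{hi:.2f})")
```

The particular numbers are made up; the point is that each specification is defensible, and each gives a somewhat different answer, which is essentially the pattern the 29 teams produced at scale.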

I focus on complex systems.  Specifically, how can we understand and forecast systems that are loosely coupled, multi-causal, autonomous, non-linear, multivariate, multi-system, dynamic, context-dependent and self-evolving, chaotic, with unclear feedback loops, homeostatic attributes, and noisy data?  This describes virtually all systems related to people (the economy, health, education, etc.) and biological systems.

These are hard to parse and hard to segment into components of greater or lesser understanding.  What these papers point out is that, independent of the challenge of the systems themselves, we have to incorporate a further, healthy degree of uncertainty: what we think we know is highly contingent on the analytical methods brought to bear.

This makes sense and seems obvious in hindsight, but I would have been willing to bet reasonable money that, while there might be degrees of variability across the 29 teams' techniques, they would have arrived at broadly the same finding.

Be more humble in how much confidence we bring to our analyses.


