Saturday, December 26, 2020

Data Talks

House in Tuscany, 1903 by Hans Emmenegger



Friday, December 25, 2020

The range of the data (the difference between the highest and lowest scores) was nearly identical, although the groups were otherwise quite different.

From Science Fictions by Stuart Ritchie.  Page 63.

This kind of reasoning is what caught out social psychologists Lawrence Sanna and Dirk Smeesters in 2011. Sanna published a study in which he claimed to find that people are more prosocial when standing at higher elevations; Smeesters claimed to show that seeing the colours red and blue affects how people think about celebrities.  The results in both papers looked impressive at first glance, easily confirming their proposed theories about human behaviour. But a closer look revealed something distinctly odd. The psychologist Uri Simonsohn showed that in the various groups in Sanna’s experiment, the range of the data (the difference between the highest and lowest scores) was nearly identical, although the groups were otherwise quite different. Simonsohn calculated that the chances of this happening in real data were minuscule. It was the same for Smeesters, except it was the averages of his groups that were too similar; again, these similarities just weren’t consistent with what would happen in real data, where error would have nudged the numbers further apart.  Once these problems, among others, were exposed, the offending papers were retracted, and both researchers resigned in disgrace. 
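Simonsohn's actual analysis isn't reproduced in the excerpt, but the logic is easy to simulate. The sketch below, with all parameters invented for illustration, draws several groups from genuinely different distributions and counts how often their ranges come out nearly identical purely by chance:

```python
# A toy version of the logic described above, not Simonsohn's actual code.
# Group sizes, spreads, and the "nearly identical" tolerance are all
# invented for illustration.
import numpy as np

rng = np.random.default_rng(0)

def ranges_nearly_identical(n_groups=4, n_per_group=15, tol=0.1):
    """Draw groups with different means; report whether every group's
    range (max minus min) lands within `tol` of all the others."""
    means = rng.uniform(0, 10, size=n_groups)   # the groups really differ
    spans = [np.ptp(rng.normal(m, 2.0, size=n_per_group)) for m in means]
    return max(spans) - min(spans) < tol

trials = 100_000
hits = sum(ranges_nearly_identical() for _ in range(trials))
print(f"near-identical ranges in {hits / trials:.4%} of simulated datasets")
```

Under honest noise the ranges almost never line up this closely; a dataset where they do, group after group, is exactly the kind of red flag Simonsohn spotted.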

These kinds of statistical red flags are analogous to what makes your bank freeze your credit card after it’s suddenly used to spend large sums on a tropical cruise: unusual activity that’s out of line with normal expectations, and which might be due to fraud. And there are a host of other features of fraudulent data that might cause readers to become suspicious when they dig into the details. The dataset might look a little too immaculate, for example, with too few missing datapoints, which come about for all sorts of reasons in real datasets: participants dropping out of the study or instruments failing, for example. Perhaps the distribution of numbers might not follow certain expected mathematical rules. Or the effects might be vastly larger than seems plausible in the real world, and thus too good to be true.
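Ritchie doesn't name the "expected mathematical rules" here, but a classic one used in fraud detection is Benford's law: in many naturally occurring datasets the leading digit d appears with frequency log10(1 + 1/d), so 1 leads about 30% of the time and 9 under 5%. A minimal first-digit check might look like this (the datasets below are synthetic, purely to show the shape of the check):

```python
# A sketch of a Benford's-law first-digit check; the "real-ish" and
# "fake-ish" datasets are fabricated for the demo.
import math
import random
from collections import Counter

def first_digit_frequencies(values):
    """Observed frequency of each leading digit 1-9 (zeros skipped)."""
    digits = [int(str(abs(v)).lstrip("0.").replace(".", "")[0])
              for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

random.seed(0)
# Quantities that grow multiplicatively tend to follow Benford's law;
# numbers typed in uniformly at random do not.
growth_like = [100 * 1.07 ** random.uniform(0, 100) for _ in range(5000)]
uniform_ish = [random.uniform(100, 999) for _ in range(5000)]

for name, data in (("growth-like", growth_like), ("uniform", uniform_ish)):
    obs = first_digit_frequencies(data)
    deviation = sum(abs(obs[d] - benford[d]) for d in range(1, 10))
    print(f"{name}: total deviation from Benford = {deviation:.3f}")
```

Real forensic checks use formal statistical tests rather than a raw deviation score, but the suspicious pattern shows up either way.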

 

History

 

An Insight

 

I see wonderful things

 

Offbeat Humor

 

Data Talks

 

Christmas Holidays by Trevor Mitchell



Thursday, December 24, 2020

They’re predictably unpredictable

From Science Fictions by Stuart Ritchie.  Page 62.

Fortunately, just as it’s a monumentally difficult task to forge a compelling Rembrandt or Vermeer (or a compelling western blot), it’s not at all easy to fake a dataset convincingly. Data pulled out of thin air don’t have the properties we’d expect of data collected in the real world. Fundamentally, this is because no science is really an exact science: numbers are noisy. Every time you try to measure anything, you’ll be slightly off from the true value, be it the economic performance of a country, the number of rare orangutans left in the world, the speed of a subatomic particle, or even something as simple as how tall someone is. With height, for instance, the person might be a bit slouched, your tape measure might slip by a fraction of an inch, or you might accidentally write down the wrong number. This is called measurement error, and it’s hard to get around completely, even if there are ways to reduce it.
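To see measurement error, and one standard way of reducing it (averaging repeated measurements), in miniature, here's a toy simulation; the "true" height and the noise level are invented:

```python
# A toy illustration of measurement error: a "true" height of 170.0 cm
# measured with invented noise of sd 0.5 cm, averaged over k readings.
import numpy as np

rng = np.random.default_rng(1)
true_height = 170.0
noise_sd = 0.5

for k in (1, 4, 16, 64):
    # Each trial: measure k times, record the error of the averaged reading.
    errors = [abs(rng.normal(true_height, noise_sd, size=k).mean() - true_height)
              for _ in range(10_000)]
    print(f"{k:3d} measurements: typical error ≈ {np.mean(errors):.3f} cm")
```

Averaging k noisy readings shrinks the typical error by a factor of roughly the square root of k, which is why "measure twice" is good advice.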
 
Measurement error’s equally annoying cousin is sampling error. As scientists we can rarely, if ever, examine every single instance of a phenomenon – no matter whether we’re trying to study a set of cells, or exoplanets, or surgical operations, or financial transactions. Instead, we take samples, and try to generalise from them to the set as a whole (statisticians call the whole set the ‘population’, even if it’s not a set of people). The trouble is, the characteristics of any given sample you take (say, the average height of all the people in your study) are never a precise match to what you really want to know (say, the average height of all the people in the country). Just through the random chance of who was included, every sample will have a marginally different average. And some samples, again just by chance, might be wildly different from the true average in the overall set.
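The same idea is easy to demonstrate for sampling error. The sketch below builds an artificial "population" (the parameters are invented), repeatedly draws small samples from it, and shows how far the sample averages wander from the true one:

```python
# A toy demonstration of sampling error. The population parameters
# (mean 170 cm, sd 10 cm) and the sample size are invented.
import numpy as np

rng = np.random.default_rng(2)
population = rng.normal(170, 10, size=1_000_000)   # the whole 'population'

sample_means = [rng.choice(population, size=50).mean() for _ in range(2_000)]

print(f"population mean: {population.mean():.2f} cm")
print(f"spread of sample means (sd): {np.std(sample_means):.2f} cm")
print(f"most extreme samples: {min(sample_means):.2f} and "
      f"{max(sample_means):.2f} cm")
```

Most sample averages land within a centimetre or two of the truth, but a few, just by chance, sit several centimetres away: exactly the occasional wild sample the excerpt warns about.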
 
Both measurement error and sampling error are unpredictable, but they’re predictably unpredictable. You can always expect data from different samples, measures or groups to have somewhat different characteristics – in terms of the averages, the highest and lowest scores, and practically everything else. So even though they’re normally a nuisance, measurement error and sampling error can be useful as a means of spotting fraudulent data. If a dataset looks too neat, too tidily similar across different groups, something strange might be afoot. As the geneticist J. B. S. Haldane put it, ‘man is an orderly animal’ who ‘finds it very hard to imitate the disorder of nature’, and that goes for fraudsters as much as for the rest of us.
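Putting the two errors to forensic use, as in the Smeesters case above, amounts to asking: given honest noise, how often would several group averages land as close together as the reported ones? Here is a sketch, loosely modelled on the simulation logic Simonsohn described, with all numbers invented:

```python
# A sketch of the "too similar to be true" check: simulate honest datasets
# and see how often group means cluster as tightly as a (made-up) reported
# spread. Group count, sizes, sd, and the reported spread are all invented.
import numpy as np

rng = np.random.default_rng(3)

def spread_of_means(n_groups=6, n=20, sd=1.0):
    """Standard deviation across group means for one simulated dataset."""
    return np.std([rng.normal(0.0, sd, size=n).mean() for _ in range(n_groups)])

reported_spread = 0.02   # suspiciously tight spread (invented for the demo)
sims = np.array([spread_of_means() for _ in range(20_000)])
p = (sims <= reported_spread).mean()
print(f"probability of means this similar by chance: {p:.5f}")
```

With honest sampling error, the spread across group means almost never collapses this far; a published dataset where it does is the statistical equivalent of the frozen credit card.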