Tuesday, October 11, 2016

When data visualization obfuscates more than it illuminates

I am very enthusiastic about the visual display of data to achieve more effective communication. "Link between health spending and life expectancy: US is an outlier" by Max Roser is an example where the visual display actually fosters miscommunication. The problem isn't with the display but with the context and underlying analysis.

The article includes this chart.

The US is indeed an outlier in terms of its expenditures on healthcare. It is worth understanding what is driving those outsized medical expenses. But the chart doesn't really help that process.

The root cause analysis by Roser is superficial or, at least, incomplete. Roser identifies three possible root causes for the cost differences. 1) The US has more violence - however, Roser seems to dismiss this as a real driver. 2) The US has higher administrative health care costs. 3) The US has more unequal healthcare spending.

So, one reading of this chart is that the US is more violent, more incompetent and, because of inequality, more unfair. That certainly comports with the views of the nattering nabobs of the intelligentsia, but is it true?

I don't know what the real drivers of the cost differential are. I just know that the proffered suggestions are either substantially incomplete or are untrue.

For example, Roser seems to conflate inequality of healthcare spending with inequality of income. He writes:
"One of the reasons for the underachievement of the US is the large inequality in health spending. The chart above showed that average per capita spending on health is exceptionally high, but the average does not tell you about how much each individual in the US receives. The US healthcare system is characterized by little access to care for some and very high expenditure on health by others.

"The following graph shows this inequality. The top 5% of spenders accounts for almost half of all health care spending in the US.

"This graph should be read similarly to a Lorenz curve: the fact that the cumulative distribution of spending bends sharply away from the 45% degree line is a measure of high inequality (this is the intuition of the Gini coefficient that we discuss in our income inequality data entry). As it can be seen, the top 5% of spenders account for almost half of spending, and the top 1% account for more than 20%. Some concentration in expenditure is certainly to be expected when looking at the distribution across the entire population – because it is in the nature of healthcare that some individuals, particularly those older and with complicated health conditions, will require large expenditure –, these figures seem remarkably large and suggest important inequality in access."
But that is not what his second chart shows. The chart shows that some individuals receive a great deal of costly treatment. We do not know, however, whether they are also the people who are disproportionately wealthy. Roser has fallen into a circular argument without evidence: if there is inequality in healthcare expenditure, it must be because of income inequality. Inequality in health expenditure is not by itself evidence of income inequality (and vice versa). The compulsion to make data fit assumptions has overtaken the evidence.

Take the following as an example. A poor kid in Chicago is shot, as dozens are each week. The bullet destroys his liver and causes massive bleeding. The ER services in Chicago are excellent and they are accustomed to traumas you would never see anywhere but on a battlefield. All hospitals are obligated to serve regardless of the patient's ability to pay. Access is not the problem that Roser posits. The trauma team swings into action and, after 12 hours of surgery, stabilizes the victim enough to hold until a liver transplant becomes available days or weeks later. The cost? Upwards of $500,000 or a million. This victim is a one percenter on Roser's chart even though his income might be in the bottom 5%.

The point is that income inequality has little or nothing to do with required medical services inequality.
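The arithmetic behind this point can be sketched in a few lines. What follows is a minimal illustration on an invented population of 100 people, not Roser's data: the incomes, the spending amounts, and the helper names `top_spender_indexes` and `top_share` are all hypothetical. Five high-cost patients are scattered across the income range, and the spending distribution still comes out sharply concentrated.

```python
from statistics import mean

# Hypothetical population, invented purely for illustration.
incomes = [1000 * i for i in range(1, 101)]   # 1,000 up to 100,000
spending = [2000] * 100                       # routine care for most people
for i in [2, 24, 49, 74, 96]:                 # high-cost patients spread across incomes
    spending[i] = 36000


def top_spender_indexes(spending, fraction):
    """Indexes of the top `fraction` of people ranked by health spending."""
    k = max(1, round(len(spending) * fraction))
    ranked = sorted(range(len(spending)), key=lambda i: spending[i], reverse=True)
    return ranked[:k]


def top_share(spending, fraction):
    """Share of total spending accounted for by the top `fraction` of spenders."""
    top = top_spender_indexes(spending, fraction)
    return sum(spending[i] for i in top) / sum(spending)


top5 = top_spender_indexes(spending, 0.05)
print(f"Top 5% share of spending: {top_share(spending, 0.05):.1%}")
print(f"Mean income of top 5% spenders: {mean([incomes[i] for i in top5])}")
print(f"Mean income of everyone: {mean(incomes)}")
```

In this made-up population the top 5% of spenders account for just under half of all spending, yet their average income is essentially the population average. A steep spending curve on its own says nothing about who the high spenders are; that requires data linking spending to income.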

What the chart does show is that the American medical system, with all its failures and successes, goes to extraordinary lengths to save lives and remediate bodies. We work harder and spend more to save dramatically premature babies, those suffering from traumatic injuries, the elderly, etc. Our aspiration to save everyone all the time is expensive. That's why we have a steep Lorenz curve.

In the chart above, like is not being compared with like.

All the other comparison countries have at least three distinctions from the US. 1) The US is far more culturally and economically heterogeneous than any of the European countries (with each US group having markedly different attitudes and dispositions towards health, behaviors, etc.), 2) The US has a distributed system and the Europeans have centralized health systems, and 3) The US is richer than other countries (which increases the marginal propensity to spend).

In Europe, with centralized health systems, health expenditures are, simplistically, capped by the government. This tends to improve basic health at the expense of extreme health issues. Many European countries are well on their way to implementing Liverpool Care Pathway-style approaches to evaluating the cost of life maintenance. One could argue that the European approach is more rational but one could also argue that it is more immoral. Both would be at least partially true. Our American proclivity to expend vast amounts to save lives and even lost causes is noble but ruinously expensive. That is what Roser's second chart shows.

Europe has begun to receive more heterogeneous immigration, but compared to the US it still has very low volumes of immigration and low variability. The US healthcare system has to accommodate far greater variety in terms of behaviors, attitudes towards health, language, etc. That comes at a cost. In most fields of socioeconomic measure, when you track culture groups back to their originating countries/cultures/continents, the American groups almost always outperform in the culture-to-culture comparisons but not in the overall average, because we are averaging far more groups that are far more varied. This is the Wisconsin-Texas Paradox, also known as Simpson's Paradox.

Americans of European heritage are richer and score better on international education tests (PISA) than do their European peers. Likewise with Asian-Americans against Asia, Hispanics against Central and South Americans, African Americans against Africa.

I am pretty confident that the same would be true for longevity.
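The Simpson's Paradox arithmetic is easy to check with made-up numbers. The sketch below uses two hypothetical countries and two hypothetical groups; the populations, the outcome values, and the function names `overall_mean` and `standardized_mean` are invented for illustration and are not real longevity or test-score data. Country A does better than country B within every group, yet has the lower overall average because its population mix is different.

```python
# Hypothetical numbers, invented purely to illustrate the arithmetic.
# Each entry maps a group label to (population, mean outcome).
country_a = {"group_1": (50, 82.0), "group_2": (50, 72.0)}  # evenly mixed
country_b = {"group_1": (95, 80.0), "group_2": (5, 70.0)}   # mostly one group


def overall_mean(groups):
    """Population-weighted mean outcome across all groups."""
    total = sum(n for n, _ in groups.values())
    return sum(n * m for n, m in groups.values()) / total


def standardized_mean(groups, reference):
    """Mean outcome of `groups` re-weighted to the population mix of `reference`."""
    total = sum(n for n, _ in reference.values())
    return sum(reference[g][0] * m for g, (_, m) in groups.items()) / total


a_wins_every_group = all(country_a[g][1] > country_b[g][1] for g in country_a)
print(a_wins_every_group)                        # True
print(overall_mean(country_a))                   # 77.0
print(overall_mean(country_b))                   # 79.5
print(standardized_mean(country_a, country_b))   # 81.5
```

On the raw averages, B looks better (79.5 against 77.0). Once A is re-weighted to B's population mix, A comes out ahead (81.5 against 79.5). Which comparison is the right one depends on the question being asked, but the raw average alone hides the group-level picture.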

The point of all this is that Roser's graph appears to make an especially compelling argument that there is a problem. The challenge is that the graph is not displaying like-for-like results and therefore ends up being misleading. You would have to normalize for age, for income, for cultural tradition, educational attainment, etc. Once you do those normal things that should have been done in the first place, you end up with dramatically different root causes as to what is driving the cost differentials. We do have the option to force people to live healthier lifestyles through coercive government action if we wish to do so. We do have the option of saving government money on healthcare by rationing that care. Those are the European choices and perhaps they are worthwhile.

But they are not obviously so.

