Tuesday, March 19, 2013

US Education: Expensive and ineffective? Not so fast.

I came across an interesting cache of data which begins to answer a supposition I have held for a good while: that American schools are better than they are given credit for. In the international educational comparisons that are conducted from time to time, the US usually comes out somewhere in the middle of the pack despite spending inordinately more money per student, both in absolute dollars and as a percentage of GDP. Having seen and experienced education in Europe, Asia and Australasia (and Africa) and hired people who were the product of education systems across the globe, I have long harbored the suspicion that those rankings were not telling the complete story. It is notoriously difficult to do international comparisons of performance in any field: health outcomes, productivity, crime, etc. There tend to be three sources of problems (other than outright gaming of the rankings): 1) definitions (e.g. what constitutes a student?), 2) ensuring like is being compared to like, and 3) reliability and consistency of data collection. Even when these are done well and accurately, there is still a missing back-story and context.

The problems of ensuring comparability are easy to anticipate. What constitutes a 15 year-old (by age, by grade, by birthday at time of test, etc.)? Are all 15 year-olds included? What about the 10% in private or parochial schools? What about home schoolers, those in charter schools, and those who have already dropped out by age 15? Does it include all boys and all girls? How confident are we that the same test in different languages constitutes the same test? The issues will vary enormously by country.

My supposition has been that, in addition to the known issues surrounding data normalization, there is also likely an aspect of the Wisconsin-Texas paradox at work (see post, The Texas-Wisconsin Paradox and intergenerational income mobility). The US is unusual in many regards, particularly in that it is large, highly heterogeneous (by ethnicity, by culture, by religion, etc.), and very productive/innovative and has been so for long stretches of time. There is complexity that comes with these attributes and there are no real counterparts against which to compare it.

The OECD’s Program for International Student Assessment (PISA) has gained traction in the past couple of decades as an effort to compare international performance in education. Three areas of performance are measured: reading literacy, maths and science. The most recent reading assessment was performed in 2009 and the detailed results were released in Highlights From PISA 2009: Performance of U.S. 15-Year-Old Students in Reading, Mathematics, and Science Literacy in an International Context.

While I think the PISA effort is estimable, I believe we are still a few more cycles away from it being robustly reliable. I suspect that there are particular issues with like-to-like comparisons. For example, the most recent results show Shanghai leading the globe with an average reading score of 556 (reading scores are normed at 500). While Shanghai is an up-and-coming and wealthy city, I find this result implausible. It is a city of some 25 million, an inordinate percentage of whose residents are migrants who arrived from the countryside in the past twenty years, i.e. farmers drawn by the allure of manufacturing in the city. Everywhere in the world, there is a lag of a couple of generations in assimilation and educational attainment following migration from country to city. The PISA report would have us believe that Shanghai alone has broken this worldwide pattern. Perhaps, but more likely they have simply sampled only the best schools in the city.

Flawed as it might be, PISA will continue to be refined and the data will become more reliable over time. It has the advantage that it is now being administered to ever larger numbers of countries (sixty in this most recent survey) and is no longer restricted to the OECD.

So with all those caveats, what can the PISA scores tell us about US educational performance? I think, lurking in the data, there is confirmation that the US does a far more effective job of educating its population than virtually anyone else in the world, despite the heterogeneity of its population and despite the very large percentage of its population that is first generation immigrant (approximately 12% now). Efficiency is a different, and no less important, issue. However, I think perhaps one half of the accusation of ineffective and inefficient can be taken off the table.

There is an argument that when you have a heterogeneous population, like has to be matched to like for comparison purposes. Ideally, as the primary vector for difference, one would look at self-identified cultural heritage (or religion as a proxy) as the source of differences in performance. Since that data is not easily or reliably available, an alternative is to use race as a proxy. Compare US non-Hispanic whites to western Europeans (as the primary cultural stock), Hispanic Americans to Latin American countries, Asian Americans to East Asian countries and Black Americans to African countries. Lest there be confusion, this is not an assertion that there is biological determinism at work. I believe that that view is well dispatched in such accessible accounts as Steve Olson’s Mapping Human History: Genes, Race, and Our Common Origins. Rather it is a recognition that an individual’s cultural framework has an inordinate influence on personal choices and actions and that culture, while mutable, can be extraordinarily long lived. While all assimilated Americans will have greater behavioral affiliation with one another than with the people of virtually any other country, each sub-group within the US has strong traceable influences from its ancestral cultural origins.

So how do US students compare against their respective cultural cohorts across the globe? Contrary to most characterizations, pretty well.



The chart packs a lot of information into a small space. East Asian countries are light green. European countries (along with Australia, New Zealand and Canada), dark blue. Islamic countries, dark green. Latin American countries are light blue. Israel is mauve. Trinidad and Tobago is orange. T&T’s population is one third sub-continental Indian and one third of African ancestry with the balance of mixed ancestry. There are no African countries that participated in PISA 2009 and so T&T is used as the closest proxy. The solid red bar is the average American performance, 500, on reading. Right at average. The green bar with a heavy red border is the breakout for Asian Americans (541). The dark blue bar with heavy red border is non-Hispanic white Americans (535). Light blue with heavy red border is Hispanic American (466) and orange with heavy red border is USA Black (441). There are no available comparisons for Native Americans.

Omitting Shanghai for the reasons mentioned above, what this chart shows is that Asian American 15 year-olds read better than all other East Asian students, with the Asian American score of 541 versus the average for all other East Asia of 517. South Korean students are a close-run second with a score of 539.

White American 15 year-olds (535) outscore all European countries except Finland (536). The average of all European countries is 483. The average for just the most developed countries in Europe (basically northwestern Europe, omitting Eastern Europe and the Mediterranean) is 503.

Hispanic American 15 year-olds (466) outscore all Latin American countries. The closest is Chile (449). The average of all Latin American countries is 409.

There are no African countries that participate in PISA. However, there are two countries with a very large percentage of their populations deriving from Africa. Trinidad and Tobago might be used as a proxy with a population that is 35% African heritage and 25% mixed African heritage. With that as a base for comparison, US African American 15 year-olds score 441 to Trinidad and Tobago’s 416. Brazil also has a 50% black or mixed heritage population and its PISA reading score was 412.

With all the caveats of data integrity still in place, what we can say is that American subgroups outperform all their global cultural peers save for the Finns.
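The arithmetic behind these comparisons can be laid out in a few lines of Python. This is only a rough sketch using the scores quoted above; the dictionary layout and the `gaps` helper are illustrative names of my own, and for Black Americans the "peer average" is simply the mean of the two proxy countries, which is an assumption of the sketch rather than a figure from the PISA report.

```python
# PISA 2009 reading scores exactly as quoted in the post.
# Each entry: US subgroup score, peer-group average, best-scoring single peer.
# For Black Americans the post gives only two proxy countries, so the
# "peer_avg" is just the mean of those two (an assumption of this sketch).
comparisons = {
    "Asian American": {"us": 541, "peer_avg": 517, "best_peer": ("South Korea", 539)},
    "White American": {"us": 535, "peer_avg": 483, "best_peer": ("Finland", 536)},
    "Hispanic American": {"us": 466, "peer_avg": 409, "best_peer": ("Chile", 449)},
    "Black American": {
        "us": 441,
        "peer_avg": (416 + 412) / 2,  # Trinidad and Tobago, Brazil
        "best_peer": ("Trinidad and Tobago", 416),
    },
}


def gaps(entry):
    """Return (gap vs peer average, gap vs best single peer)."""
    _, best_score = entry["best_peer"]
    return entry["us"] - entry["peer_avg"], entry["us"] - best_score


results = {group: gaps(entry) for group, entry in comparisons.items()}

for group, (vs_avg, vs_best) in results.items():
    best_name = comparisons[group]["best_peer"][0]
    print(f"{group:18s} {vs_avg:+6.1f} vs peer average, {vs_best:+4d} vs {best_name}")
```

A positive gap means the US subgroup scores higher. The only negative number is the one-point gap between white Americans and Finland.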

Just out of curiosity, I ran a number of other comparisons. If you compare average scores by continent, there is only a 70 point difference between highest and lowest. Likewise with the very closely correlated consideration of race (a 68 point difference between highest and lowest). Interestingly, there is only a 14 point difference between countries if you look at the quintiles by size. One might have supposed that smaller countries would be able to educate more effectively than much larger, presumably more complex countries. In fact, the quintile of largest countries outscored the quintile of smallest countries by 14 points.

If you look at the countries by quintiles of PPP per capita GDP (top quintile being the richest), Quintiles 1, 2 and 3 were virtually identical, with scores of 488, 501, and 489 respectively. The drop-off occurred at Quintiles 4 and 5, with scores of 437 and 393 respectively.
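To make the shape of that drop-off concrete, here is a small Python sketch that uses only the five quintile averages quoted above. The variable names are my own and nothing here comes from the PISA data files themselves.

```python
# Average PISA 2009 reading score by quintile of PPP per capita GDP,
# as quoted in the post (quintile 1 = richest countries).
quintile_scores = {1: 488, 2: 501, 3: 489, 4: 437, 5: 393}

# Spread among the three richest quintiles.
top_three = [quintile_scores[q] for q in (1, 2, 3)]
top_three_spread = max(top_three) - min(top_three)

# Change in score when stepping down one quintile at a time.
steps = {
    (q, q + 1): quintile_scores[q + 1] - quintile_scores[q]
    for q in range(1, 5)
}

print(f"Spread across quintiles 1-3: {top_three_spread} points")
for (richer, poorer), change in steps.items():
    print(f"Q{richer} -> Q{poorer}: {change:+d} points")
```

The top three quintiles sit within 13 points of one another, while the step from the third to the fourth quintile alone is a drop of 52 points.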

The final lens was to look at outcomes by religious tradition (which can be done with the PISA numbers but not the USA subgroups). This had the largest dispersion of results from highest to lowest, supporting the contention that religion is something of a macro proxy for cultural values and behaviors. The results were as follows.

Confucian 525
Protestant 501
Jewish 474
Christian 473
Roman Catholic 456
Eastern Orthodox 441
Buddhist 421
Hybrid 416
Islamic 396
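As a rough check on the dispersion claim, the sketch below compares the highest-to-lowest range under each lens. The religion and GDP ranges are computed from the figures listed in this post; the continent, race and country-size ranges are the point differences already stated above. The names are illustrative only.

```python
# Average PISA 2009 reading score by religious tradition, as listed in the post.
religion_scores = {
    "Confucian": 525,
    "Protestant": 501,
    "Jewish": 474,
    "Christian": 473,
    "Roman Catholic": 456,
    "Eastern Orthodox": 441,
    "Buddhist": 421,
    "Hybrid": 416,
    "Islamic": 396,
}

# Quintile averages by PPP per capita GDP, also from the post.
gdp_quintile_scores = [488, 501, 489, 437, 393]


def score_range(scores):
    """Difference between the highest and lowest score."""
    return max(scores) - min(scores)


ranges = {
    "religious tradition": score_range(religion_scores.values()),
    "GDP quintile": score_range(gdp_quintile_scores),
    "continent": 70,              # stated directly in the post
    "race": 68,                   # stated directly in the post
    "country size quintile": 14,  # stated directly in the post
}

# Rank the lenses from widest to narrowest spread.
ranked = sorted(ranges.items(), key=lambda item: item[1], reverse=True)
for lens, spread in ranked:
    print(f"{lens:22s} {spread:4d} points")
```

Religious tradition comes out widest at 129 points. The GDP quintiles span 108 points overall, but nearly all of that comes from the two poorest quintiles.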

So what does all this data tell us? One observation is that culture is a powerful forecaster of PISA scores. Another is that the wealth of a country has virtually no forecasting value above some minimum level.

I think the biggest take-away is that as variable, localized, distributed and frequently flawed as the US system of education might be, it is, within the context of the varied cultural traditions with which it has to work, producing world-class results.

It is also, arguably, better at achieving more equitable results. For the three groups for which there are sufficient data points (Asia, Europe, Latin America), the corresponding US groups are much closer to the national average than the spread of results within each of the heritage groups. In other words, while there are still significant performance gaps between US cultural groups, the distance of each of those groups from the national average is smaller than the gap between the highest and lowest of its heritage group. For example, white American students score 35 points above the USA national average. However, for Western European countries, there is a range of 66 points between highest and lowest scores.

Summary of observations
• Culture is the most reliable variable for predicting PISA scores.
• The US educates its people within their respective cultural traditions to a level that exceeds all their corresponding cultural peers (except for Finland in the case of white Americans).
• The US achieves a more equitable performance between cultural groups compared to the benchmark heritage groups.
• Above a low level, national wealth as a predictive variable has almost no value for forecasting PISA scores.

This is quite a different picture from the one usually presented in the summary condemnations. Yes, we spend a disproportionate amount of money on educating our people, but we also produce exceptional results. This data allows us to home in on the critical root causes of performance differential: cultural values and attributes. There is much work to be done in order to close inter-group performance differences, there is much efficiency that needs to be wrung out of the system, and the task of educating for comparable results between different cultural traditions is fraught with contention. Yes, yes, and yes. But let’s not lose sight of the fact that there are some significant achievements here as well.

With regard to closing achievement gaps and the challenge represented by educating to achieve comparable outcomes across different cultural traditions, it might warrant examining the experience of Singapore. Yes they are tiny and yes they have the lowest birthrate in the world (which creates distinct economic, sociological and educational issues), but they also have a materially multicultural populace (75% Chinese, 15% Malay and 10% Indian) as well as among the highest reading scores in the world, 526. If their score is indeed representative of their entire 15 year-old population, then what is it that they have done to achieve such high scores across three very different cultural traditions, and what might we learn from that?
