Thing Finder: The ruins of our early hopes block us from new approaches

From Does Pre-K Really Hurt Future Test Scores? Here's what’s behind that new Tennessee study by Emily Oster.

There was indeed a new paper out and, as with so many in the past decade, it did not have good news regarding the value of pre-k education. From the Abstract:

As state-funded pre-kindergarten (pre-K) programs expand, it is critical to investigate their short- and long-term effects. This article presents the results through sixth grade of a longitudinal randomized control study of the effects of a scaled-up, state-supported pre-K program. The analytic sample includes 2,990 children from low-income families who applied to oversubscribed pre-K program sites across the state and were randomly assigned to offers of admission or a wait list control. Data through sixth grade from state education records showed that the children randomly assigned to attend pre-K had lower state achievement test scores in third through sixth grades than control children, with the strongest negative effects in sixth grade. A negative effect was also found for disciplinary infractions, attendance, and receipt of special education services, with null effects on retention. The implications of these findings for pre-K policies and practices are discussed.

Basically, no good results from the intervention. This is consistent with much of the evidence accumulating in the past five decades since the Abecedarian Project (n=57) and HighScope (Perry Project) (n=127) from the late sixties and early seventies. In those two early studies, with high investment, high motivation, low controls, etc., it was found that high-quality pre-k intervention could improve academic results and social outcomes.

With those studies as a catalyst, there followed a massive search for the El Dorado of scalable positive outcomes ranging from Head Start to State programs (Tennessee and Oklahoma most recently, along with Quebec in Canada). It seemed like an early investment in program money might raise IQs, raise academic performance and overall improve life outcomes.

In all later reviews, it has been a common finding that the intervention works for a year, then participants revert to their otherwise expected academic outcomes. As it became clear that early intervention did not raise IQs and did not improve academic performance, the race was on to find ancillary benefits such as improved graduation rates and reduced disciplinary issues.

It is worth noting that this new study is a follow-up on an earlier study of the Tennessee participants at grade level three which found the now to-be-expected reversion of academic outcome and poor non-academic measure outcomes. This study confirms the earlier findings but at grade six.

When this most recent study came out, Freddie deBoer's assessment was Shovel More Dirt on Pre-K.

So I would ordinarily shy away from doing an old-school blog post that simply links to something else, but this feels like a study that calls out for an exception. I’ve just been reading a paper in the journal Developmental Psychology. It’s a pre-K study that has many virtues, including

Large n (2990 kids)

Genuine random assignment

Longitudinal design

Confirms my priors

… and it says kids who were assigned to the pre-K condition actually did worse than kids who were not.

Pre-K advocates tend to fixate on non-academic indicators as a way to justify pre-K programs. But attendance was mildly worse for the pre-K group:

Attendance rates in sixth grade (proportion of instructional days without a recorded absence) were high for both TN-VPK participants and nonparticipants. Nonetheless, the difference between groups was statistically significant with a slightly higher rate for nonparticipants (97.5% vs. 97.1%, p = .013 for the ITT analysis with observed values). Supplemental Table S11 provides model details for each year (see also Supplemental Figure S3). Sixth grade was the first academic year with a significant attendance difference between conditions, although there were marginally significant effects in kindergarten and first grade.

Nor can we find any solace in disciplinary action:

So, yeah. Looks bad! Random assignment to condition for a large and representative n studied longitudinally shows kids who got pre-K do meaningfully worse than those who didn’t.

“Beware the man of one study!” you might say. But, well, we have many more studies than one showing this outcome, now. I just wrote about it not that long ago: the pre-K research record is most optimistically described as mixed and most realistically described as discouraging. I’ve been writing about it for a long time, actually, and the song remains the same. This chart is out of date at this point, but the overall trend is still informative. There are always exceptions and there are examples touted as proof that pre-K works. But the drift from the initial positive studies to more pessimistic later studies seems clear, from where I’m sitting, and the most compelling and parsimonious explanation is that we’ve gotten better at doing this research over time, with better study designs and higher data quality. The results are what they are. But liberals are forever looking for magic bullets in education, and a lot of them got very professionally, politically, and emotionally invested in pre-K, and it’s just really hard to get them to confront all the bad news.

Which is pretty much how I see it as well. Good study, negative outcomes, consistent with other good studies.

So I was quite surprised to see Does Pre-K Really Hurt Future Test Scores? Here's what’s behind that new Tennessee study by Emily Oster.

It is a generally accepted view that access to pre-K programs is good for children, which is why when a study came out a couple of weeks ago showing worse outcomes for children enrolled in a particular pre-K program in Tennessee, people kind of panicked. I can usually tell from the number of panicked emails and Instagram DMs how urgent it is to discuss a particular result. This didn’t quite reach the “toxic baby metals” level, but it was getting there.

Before getting into the study, it is important to set the stage. This is just one piece of a broad academic literature on this topic. There is a tendency sometimes to cover research like this as if it is either the first information we have about the question or as if it is inherently better because it’s new. Neither is necessarily true, so it’s always useful to step back and ask what the context is.

Overall, the academic literature tends to find positive impacts with pre-K programs. This goes back to well-known early evaluations of the Perry Preschool and Abecedarian Project that showed gains to children from early childhood education. The literature has evolved from those early studies (see a long review here) to support the value of Head Start and Head Start–like programs. Studies — like this one, of universal pre-K in Boston — have shown long-term impacts on high school graduation, college-going, and SAT test-taking.

This general consensus isn’t to say that every paper finds the same thing or that every result is positive. The most consistent finding is that pre-K raises kindergarten readiness. Effects on later test scores tend to be lower or zero. And yet in some cases, like the Boston data, there seem to be these positive impacts on educational outcomes even without corresponding impacts on test scores. In summary: the landscape of the literature on pre-K programs is broadly supportive, but it’s not always a straight-line analysis. (By the way, if you are looking for a wonkier run through the literature, Noah Smith also wrote on the topic this week.)

I cannot make much sense of this given the progression of studies over the past two decades, each finding that there is no lasting academic impact and highly disputed non-academic beneficial outcomes. Oster spends some time on a couple of methodological critiques but ultimately concedes that the results seem to be as represented. She falls back on the fact that there are other studies which show some positive outcomes.

On balance though, and across all the studies which have been produced (thousands), this seems really weak tea. I also set store by the fact that the larger and the more methodologically rigorous the study, the less effect (usually null) is found.

We want pre-k to be a silver bullet and yet it simply doesn't seem to achieve the outcomes everyone has so long hoped for.

I have no ready explanation for Oster's rosy assessment. It seems her position is

Pre-k makes the transition to Kindergarten easier (not a justifying goal)

Pre-k effects on later test scores tend to be low or zero (most studies in past decade or two concur)

Pre-k might have long-term impacts on high school graduation, college-going, and SAT test-taking (studies in past two decades are mixed)

Most of these programs were undertaken with the expectation that academic performance would improve, and it seems everyone concedes that does not happen. If you move the goal posts to justify these programs on non-academic measures (which may be desirable), you are now justifying an expense on different grounds. And while some studies have found non-academic improvements, those weren't the focus of investigation and the research is still quite mixed.

Oster references a piece by Noah Smith on the same topic, Pre-K is day care.

Smith concedes it is a good study with bad findings. And then resorts to a desperate Hail Mary.

The study above is a very good one. It’s a very large-sample study. It uses random assignment (because the students it follows got into the pre-K programs via lottery). It focuses on low-income students, who are exactly the people who need the most help from programs like this. And it follows the kids for a very long time, in order to see long-term effects.

But at the end of the day, even a very good study is just one study. Programs differ, regions and populations differ, and the set of additional policies being enacted to help kids also differs from place to place. All these things can cause the results of various studies to be very different. So you really have to look at a whole bunch of studies, of a whole bunch of programs, in a whole bunch of places, in order to get a holistic idea of what the evidence really says.

Indeed, that is why we do these studies. And the larger the study, and the more rigorous, the more likely it is that there is zero academic benefit and negligible/debatable non-academic benefit.

Smith does go through a lengthy review of the literature.

The upshot is that really high-quality pre-K programs do provide some educational benefit. But for the kind of mass-market pre-K programs that Biden’s plan would involve, the educational benefits are probably close to nil, and for many kids are probably negative.

Smith's is actually a pretty good literature review, though I think over-optimistic. He settles on a couple of things I think most followers of the literature agree on. The larger and more rigorous the study, the lower or null the findings. The issue is less to do with pre-k programs per se than it is the capacity to scale (i.e. high investment, high motivation interventions may show real results but they are too expensive and too selective to scale to the population as a whole).

He comes to the conclusion that pre-k as implemented is effectively day care. I do not disagree. If we were to treat these as day care programs, we could strip away the education industry protocols and the costs they add. We don't need those costs if they cannot deliver the academic results.

Smith ends up in the position of promoting well-designed pre-k day care programs as valuable to both children (stability) and to parents (in terms of employability).

I don't disagree, but we are now in a different research field. Does quality day care have a material and lasting positive life outcome impact? I don't know. I have never done an in-depth review. Part of the challenge is that pre-k educational programs are so often blurred with pre-k day care. My impression is that the answer could be yes, there might be some positive impact, but that we would probably see the same issue of scalability. I just don't know.

The largest day care program of which I am aware is that in Quebec. It was undertaken explicitly for economic reasons, to make it easier for women to confidently be able to return to the workforce relying on cheap and adequate day care. It is my understanding, from a couple of assessments done on the Quebec program, that it was successful from an economic perspective with a material increase in labor force participation rates among women and a decline in childhood poverty.

On the other hand, the studies I have seen also seem to find measurable declines in non-academic measures for the children. Measuring the Long-Term Effects of Early, Extensive Day Care by Jenet Erickson reports that (emphasis added)

But, as research on the Quebec program found, the highest level of quality can be hard to secure. In 2005, 60% of the universal day care program sites in Quebec were judged to be of “minimal quality.” Just one-quarter of the sites provided care that met the standards necessary to qualify as good, very good, or excellent. Such findings are comparable to many other developed countries, confirming just how challenging it is for children to have access to the quality of care necessary for persistent positive gains over the long run. And these small gains have to be weighed against the risks of spending extensive hours in day care. The comprehensive evaluation of day care quality done in the NICHD-SECC found that extensive hours in day care early in life predicted negative behavioral outcomes throughout childhood and in to adolescence, even after controlling for day care quality, socioeconomic background, and parenting quality.

But that is one finding among many.

I think we just don't know yet.

I think Smith's focus on the day care aspect warrants attention, but Arnold Kling makes a very salient point: nothing is free. Kling:

I think he [Noah Smith] is making a really fundamental error here. His argument seems to confuse “free” with “zero cost.”

Suppose that day care is worth $20,000 a year per child, whether I provide it or I hire a child care worker. If I can earn $25,000 in the market, and I don’t get any additional utility or disutility from staying home, then I can hire a day care provider and go to work. There is no need to subsidize day care for me.

If I can only earn $15,000 in the market, then I should stay home and provide day care instead. But Noah wants me to enjoy “free” day care, in which case I will go to work. That seems to confuse “free” with “no cost.” My “free” day care will cost taxpayers $20,000, which is less than the value that society puts on my labor.

In reality, there are tax distortions at work. If I provide day care myself, my labor is not taxed. That skews my decision toward staying home. I can see trying to get rid of the tax distortion. But providing “free” day care is not the way to do it.

This goes to the issue of misallocation of resources. If we are spending $20,000 (in day care) for someone to return to the labor force where they earn $15,000, is that a good societal investment? Might it be better spent on education or health or income subsidies or any of a number of other programs?
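As a rough sketch of the arithmetic in Kling's example: the dollar figures below are the ones from his passage, while the function name is my own and purely illustrative. As in his simplified case, taxes and any utility from staying home are ignored.

```python
# Illustrative sketch of the trade-off in Kling's example.
# The dollar figures come from the quoted passage; the function name is
# hypothetical. Taxes and any utility from staying home are ignored.

DAY_CARE_COST = 20_000  # annual cost of day care per child, per the quote


def net_gain_from_working(market_wage: int, care_cost: int = DAY_CARE_COST) -> int:
    """Market output gained minus the resources spent on replacement care."""
    return market_wage - care_cost


for wage in (25_000, 15_000):
    gain = net_gain_from_working(wage)
    choice = "work and buy care" if gain > 0 else "stay home and provide care"
    print(f"wage {wage}: net gain {gain} -> {choice}")
```

At a $15,000 wage, "free" day care spends $20,000 of care to gain $15,000 of market output, a net loss of $5,000 under these assumptions. At $25,000 the parent comes out $5,000 ahead without any subsidy.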

There are four responses here to one new study about pre-k education programs: one Marxist (deBoer), two left liberals, I assume (Oster and Smith), and one libertarian (Kling). All four are intelligent and reasonably knowledgeable in this area, and all four come to notably different conclusions.

I am reasonably confident that pre-k for educational achievement purposes is a bust. Absent some major rigorous study or surprising new approach, it seems like we have scraped that barrel and there is nothing there.

I suspect the same might be true for universal day care, though I am materially less confident in that answer. What would be interesting to see is a study that addressed quality of care, to ensure that we are measuring results from programs we would expect to work, followed by a longitudinal study of such a program (preferably at some scale).

My suspicion is that, as with pre-k education, there will be null or negative results for universal day care if we don't control for scaled quality. I also suspect we may find that, even if there are positive results, the scaled quality may simply be unaffordable.

Which still leaves Kling's excellent question in play. Compared to what? Instead of providing government day care and pre-k education programs, perhaps the focus ought to be on exploring and developing programs which might stabilize childhood environments for the most vulnerable.

It seems like children under age 6, and especially boys, are particularly at risk of negative educational and behavioral outcomes when there is high instability due to economic factors and due to family dysfunction. If those issues could be addressed or mitigated, that might actually drive better educational and non-academic life outcomes.

Right now we are fifty years into hoping that educational pre-k works, and we still do not have a strong affirmative answer and seem unlikely to get one. The same holds for day care, though that has been markedly less intensively studied.

It is probably time to explore alternatives but we can't get there if we are still shackled to those unfulfilled hopes.

Friday, February 4, 2022
