Wednesday, August 5, 2015

Motivated research finds the gender results it was seeking, despite all the contrary evidence

I really like the hard work that Scott Alexander does. There was a fairly absurd research paper a few months ago, Expectations of Brilliance Underlie Gender Distributions Across Academic Disciplines by S. Leslie, et al. Essentially, the claim was that women were underrepresented in fields where it was assumed that innate capability was the biggest predictor of success. The implication was that societal assumptions were precluding women from participating equally in all fields. I filed this under cognitive pollution, primarily because the study used a fairly removed proxy for a variable that could be directly measured. It doesn't matter what a person believes it takes to be successful; what matters is what it actually takes to be successful.

Alexander puts it this way.
Okay. Imagine a study with the following methodology. You survey a bunch of people to get their perceptions of who is a smoker (“97% of his close friends agree Bob smokes”). Then you correlate those numbers with who gets lung cancer. Your statistics program lights up like a Christmas tree with a bunch of super-strong correlations. You conclude “Perception of being a smoker causes lung cancer”, and make up a theory about how negative stereotypes of smokers cause stress which depresses the immune system. The media reports that as “Smoking Doesn’t Cause Cancer, Stereotypes Do”.

This is the basic principle behind Leslie et al (2015).

The obvious counterargument is that people’s perceptions may be accurate, so your perception measure might be a proxy for a real thing. In the smoking study, we expect that people’s perception of smoking only correlates with lung cancer because it correlates with actual smoking which itself correlates with lung cancer. You would expect to find that perceived smoking correlates with lung cancer less than actual smoking, because the perceived smoking correlation is just the actual smoking correlation plus some noise resulting from misperceptions.

So I expected the paper to investigate whether or not perceived required ability correlated more, the same as, or less than actual required ability. Instead, they simply write:
Are women and African-Americans less likely to have the natural brilliance that some fields believe is required for top-level success? Although some have argued that this is so, our assessment of the literature is that the case has not been made that either group is less likely to possess innate intellectual talent.
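
Alexander's noise argument in the quoted passage can be written out under a simple model. This is only a sketch: the symbols A (actual trait), P (perceived trait), Y (outcome) and the error term are my own labels, and the assumption that misperception is independent noise is mine, not anything stated in Leslie et al or in Alexander's post.

```latex
% Assumed model: perception = actual trait + independent misperception noise
P = A + \varepsilon, \qquad
\operatorname{Cov}(\varepsilon, A) = \operatorname{Cov}(\varepsilon, Y) = 0

% The noise adds variance to P but no covariance with the outcome Y
\operatorname{Cov}(P, Y) = \operatorname{Cov}(A, Y), \qquad
\operatorname{Var}(P) = \operatorname{Var}(A) + \operatorname{Var}(\varepsilon)

% So the perceived-trait correlation is the actual-trait correlation, shrunk
\rho(P, Y) = \rho(A, Y)\,
  \sqrt{\frac{\operatorname{Var}(A)}{\operatorname{Var}(A) + \operatorname{Var}(\varepsilon)}}
\quad\Longrightarrow\quad
|\rho(P, Y)| \le |\rho(A, Y)|
```

Under that assumption, a perception measure can never correlate with the outcome more strongly than the thing it is a perception of, which is why comparing the two correlations is the natural test.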
That last quoted paragraph was what set my cognitive pollution antenna tingling. "Are women and African-Americans less likely to have the natural brilliance that some fields believe is required for top-level success?" That is the central question they are trying to answer, but they are doing it via proxies of perception. They airily dismiss the central question with a blithe wave of the "appeal to authority" hand. But it isn't any actual authority. It is simply a trust-me, nothing-to-see-here kind of statement: "our assessment of the literature is that the case has not been made." Well, no. The case isn't made unless you actually make the effort to make it. The good scientist tests all the plausible hypotheses, not just the ones he or she prefers to be true.

And that is what Alexander does. He goes in search of the actual empirical data to answer the central question and is able to do so in the affirmative. His article provides the details but the upshot is that
There is a correlation of r = -0.82 (p = 0.0003) between average GRE Quantitative score and percent women in a discipline. This is among the strongest correlations I have ever seen in social science data. It is much larger than Leslie et al’s correlation with perceived innate ability.
In the vernacular, the more mathematical capability is required in a field, the fewer women are involved. And it has nothing to do with women being screened out. It has to do with fewer women scoring at the mathematical aptitude levels for those fields.
Mathematics unsurprisingly has the highest required GRE Quantitative score. Suppose that the GRE score of the average Mathematics student – 162.0 – represents the average level that Mathematics departments are aiming for – ie you must be this smart to enter.

The average man gets 154.3 ± 8.6 on GRE Quantitative. The average woman gets 149.4 ± 8.1. So the threshold for Mathematics admission is 7.7 points ahead of the average male test-taker, or 0.9 male standard deviation units. This same threshold is 12.6 points ahead of the average female test-taker, or 1.55 female standard deviation units.

GRE scores are designed to follow a normal distribution, so we can plug all of this into our handy-dandy normal distribution calculator and find that 19% of men and 6% of women taking the GRE meet the score threshold to get into graduate level Mathematics. 191,394 men and 244,712 women took the GRE last year, so there will be about 36,400 men and 14,700 women who pass the score bar and qualify for graduate level mathematics. That means the pool of people who can do graduate Mathematics is 29% female. And when we look at the actual gender balance in graduate Mathematics, it’s also 29% female.
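
The arithmetic in the passage above is easy to reproduce. The following is a minimal sketch, not Alexander's own calculation: it takes the means, standard deviations and test-taker counts from the quote, assumes (as the quote does) that scores are normally distributed, and treats 162.0 as a hard cutoff. The function and variable names are mine.

```python
import math

def fraction_above(threshold, mean, sd):
    """Share of a normal(mean, sd) population scoring above threshold."""
    z = (threshold - mean) / sd
    # Upper-tail probability of the standard normal, via the
    # complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))

# Figures taken from the quoted passage.
THRESHOLD = 162.0  # average GRE Quantitative score of Mathematics students
men = {"mean": 154.3, "sd": 8.6, "takers": 191_394}
women = {"mean": 149.4, "sd": 8.1, "takers": 244_712}

men_share = fraction_above(THRESHOLD, men["mean"], men["sd"])
women_share = fraction_above(THRESHOLD, women["mean"], women["sd"])

# Expected number of test-takers clearing the bar in each group.
men_pool = men_share * men["takers"]
women_pool = women_share * women["takers"]

female_fraction = women_pool / (men_pool + women_pool)

print(f"men above threshold:   {men_share:.1%}")
print(f"women above threshold: {women_share:.1%}")
print(f"female share of pool:  {female_fraction:.1%}")
```

Run as written, the shares come out at roughly 19% of men and 6% of women, and the qualifying pool at roughly 29% female, in line with the figures in the quote. The pool sizes differ slightly from the quoted 36,400 and 14,700 because the quote multiplies by the rounded percentages.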
That's why I like much of Alexander's work. He knows he is treading in delicate areas and he is usually quite conscientious about it. But he is first and foremost interested in what the logic and the evidence tell him (and us).

This is, in my view, both the right mindset and overwhelmingly critical. If we do not understand root causes, we cannot make informed trade-off decisions. Leslie, et al would have us chasing off trying to figure out how to change people's perceptions. Perceptions aren't the problem though. Abilities are the problem. That is a dramatically different kettle of fish.
Vast rivers of ink have been spilled upon the question of why so few women are in graduate Mathematics programs. Are interviewers misogynist? Are graduate students denied work-life balance? Do stereotypes cause professors to “punish” women who don’t live up to their sexist expectations? Is there a culture of sexual harassment among mathematicians?

But if you assume that Mathematics departments are selecting applicants based on the thing they double-dog swear they are selecting applicants based on, there is literally nothing left to be explained.

[snip]

But on the whole, the prediction is very good. That it is not perfect means there is still some room to talk about differences in stereotypes and work-life balance and so on creating moderate deviations from the predicted ratio in a few areas like computer science. But this is arguing over the scraps of variance left over, after differences in mathematical ability have devoured their share.
Alexander is frustrated with the cognitive pollution so heedlessly launched into our epistemological environment.
A reader of an early draft of this post pointed out the imposingly-named Nonlinear Psychometric Thresholds In Physics And Mathematics. This paper uses SAT Math scores and GPA to create a model in which innate ability and hard work combine to predict the probability that a student will be successful in a certain discipline. It finds that in disciplines “such as Sociology, History, English, and Biology” these are fungible – greater work ethic can compensate for lesser innate ability and vice versa. But in disciplines such as Physics and Mathematics, this doesn’t happen. People below a certain threshold mathematical ability will be very unlikely to succeed in undergraduate Physics and Mathematics coursework no matter how hard-working they are.

And that brought into relief part of why this study bothers me. It ignores the pre-existing literature on the importance of innate ability versus hard work. It ignores the rigorous mathematical techniques developed to separate innate ability from hard work. Not only that, but it ignores pre-existing literature on predicting gender balance in different fields, and the pre-existing literature on GRE results and what they mean and how to use them, and all the techniques developed by people in those areas.

Having committed itself to flying blind, it takes the thing we already know how to use to predict gender balance, shoves it aside in favor of a weird proxy for that thing, and finds a result mediated by that thing being a proxy for the thing they are inexplicably ignoring. Even though it just used a proxy for aptitude to predict gender balance, everyone congratulates it for having proven that aptitude does not affect gender balance.

Science journalism declares that the myth that ability matters has been vanquished forever. The media take the opportunity to remind us that scientists are sexist self-appointed geniuses who use stereotypes to punish women. And our view of an important issue becomes just a little muddier.
Contrary to Leslie's paper and to the media reports, there is no evidence here of biases or discrimination arising from stereotypes. What the evidence does show is that there are gender-based variations in a variety of abilities, and that our current academic processes are effectively matching people to fields based on both ability and commitment (effort and perseverance). That is exactly the opposite of what was reported.
