Thing Finder: The only bias discovered in this study is confirmation bias

From "A Study Used Sensors to Show That Men and Women Are Treated Differently at Work" by Stephen Turban, Laura Freeman, and Ben Waber. It would be valid to say that the only bias discovered in this study is confirmation bias.

For all that their conclusion is unsupported by their data, it is nonetheless an interesting effort.

We know from many replicated studies that there is no gender wage gap for equal work. Men and women are paid the same for the same work here in the US and across most of the OECD, a finding of some thirty years' standing that has been frequently validated and replicated. The factors which determine compensation have to do with educational attainment, degree field, economic sector, work flexibility, work commitment, work duration, work structure (full-time, part-time, etc.), and so on. When comparing like with like, the outcomes are identical. Differences only occur when you compare apples to oranges (for example, comparing earnings for full-time employees with those of part-time employees).

But Turban et al. are not investigating compensation claims; they are trying to explain why women are under-represented at senior levels of enterprises. It is regrettable that they appear to be aware neither of the wage gap research nor of how the structure of labor force participation differs between countries, as either might have provided many of the answers to their question.

They start, kudos to them, with trying to generate data that might answer their question.

Gender equality remains frustratingly elusive. Women are underrepresented in the C-suite, receive lower salaries, and are less likely to receive a critical first promotion to manager than men. Numerous causes have been suggested, but one argument that persists points to differences in men and women’s behavior.

Which raises the question: Do women and men act all that differently? We realized that there’s little to no concrete data on women’s behavior in the office. Previous work has relied on surveys and self-reported assessments — methods of data collecting that are prone to bias. Fortunately, the proliferation of digital communication data and the advancement of sensor technology have enabled us to more precisely measure workplace behavior.

We decided to investigate whether gender differences in behavior drive gender differences in outcomes at one of our client organizations, a large multinational business strategy firm, where women were underrepresented in upper management. In this company, women made up roughly 35%–40% of the entry-level workforce but a smaller percentage at each subsequent level. Women made up only 20% of people at the fourth level (the second highest at this organization).

We collected email communication and meeting schedule data for hundreds of employees in one office, across all five levels of seniority, over the course of four months. We then gave 100 of these individuals sociometric badges, which allowed us to track in-person behavior. These badges, which look like large ID badges and are worn by all employees, record communication patterns using sensors that measure movement, proximity to other badges, and speech (volume and tone of voice but not content). They can tell us who talks with whom, where people communicate, and who dominates conversations.

We collected this data, anonymized it, and analyzed it. Although we were not able to see the identity of individuals, we still had data on gender, position, and tenure at the office, so we could control for these factors. To retain privacy, we did not collect the content of any communications, only the metadata (that is, who communicated with whom, at what time, and for how long).

Given the question they are trying to answer, there appear to be some gaps in their study design.

Four months is too short a time frame. They are working with a consulting firm in this study. There are surges and lulls in consulting firm work volumes. Did this study occur during a surge, during a lull or was it representative of the full year?

If their study was executed during a lull, then they are not capturing the real determinants of what makes a difference in terms of promotion. They find that male and female employees have the same mentorship, leadership exposure, and networking rates, but only for this four-month period. It is quite possible that these rates would differ significantly if the study were conducted in an intense period rather than a slack period.

We don't know if that is the case but we can't address the potential weakness in the study design because those details are not provided.

The researchers' description of the design also leaves unaddressed one of the critical differentiators in performance: volume and flexibility of response to work conditions. We know from the wage gap research that this is a key differentiator. People working nine to five versus those working 60-80 hours a week and, as critically, doing it on short notice and with the flexibility to do it evenings and weekends, end up with substantially different performance outcomes.

If someone working nine-to-five has the same exposure rates, affiliation rates, mentorship rates, etc. as someone working crazy and responsive hours, you would expect the second to prosper over the first. Perhaps they took into account overtime, evening and weekend hours, but we don't know that from the study description.

The authors had several hypotheses to explain the difference in career achievement.

We went in with a few hypotheses about why fewer women ended up in senior positions than men: Perhaps women had fewer mentors, less face time with managers, or weren’t as proactive as men in talking to senior leadership.

Interesting that they did not have the hypotheses which explain differences in wage gaps: work duration, work flexibility, degree attainment and degree field, career field, etc.

Their findings are interesting.

But as we analyzed our data, we found almost no perceptible differences in the behavior of men and women. Women had the same number of contacts as men, they spent as much time with senior leadership, and they allocated their time similarly to men in the same role. We couldn’t see the types of projects they were working on, but we found that men and women had indistinguishable work patterns in the amount of time they spent online, in concentrated work, and in face-to-face conversation. And in performance evaluations men and women received statistically identical scores. This held true for women at each level of seniority. Yet women weren’t advancing and men were.

The hypothesis that women lacked access to seniority, in particular, had little support. In email, meeting, and face-to-face data, we found that both men and women were roughly two steps, or social connections, away from senior management (so if John knows Kate and Kate knows a manager, John is two steps from a manager).

Some have argued that women lack access to important, informal networks because they don’t reach out to or spend time with “the boys club.” But this didn’t hold up in our data. We found that the amount of direct interaction with management was identical between genders and that women were just as central as men in the workplace’s social network. The metric we used for this is called weighted centrality. Centrality can be thought of, at a simple level, as how close someone is to decisions being made, other employees, and the other “power connectors,” or individuals with a high number of contacts. Weighted centrality takes into account how much time employees spent talking to different people, which we used as a proxy for how strong the relationship is.
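The two network measures described in the passages above can be made concrete with a small sketch. This is only an illustration under my own assumptions, not the authors' method: the article does not say which centrality measure they computed, so I use the simplest weighted variant (total interaction time summed over a person's contacts), and the names and minutes are made up.

```python
from collections import defaultdict, deque


def steps_to_nearest(graph, start, targets):
    """Fewest social connections from `start` to anyone in `targets`.

    `graph` maps each person to the set of people they know.
    Returns None if no one in `targets` is reachable.
    """
    if start in targets:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for contact in graph.get(person, ()):
            if contact in seen:
                continue
            if contact in targets:
                return dist + 1
            seen.add(contact)
            queue.append((contact, dist + 1))
    return None


def weighted_degree(interactions):
    """Total minutes each person spent talking with others.

    ASSUMPTION: weighted degree stands in for the article's
    "weighted centrality"; the authors do not specify their measure.
    `interactions` is an iterable of (person_a, person_b, minutes).
    """
    totals = defaultdict(float)
    for a, b, minutes in interactions:
        totals[a] += minutes
        totals[b] += minutes
    return dict(totals)


# Hypothetical data mirroring the article's John/Kate example.
contacts = {
    "John": {"Kate"},
    "Kate": {"John", "Manager"},
    "Manager": {"Kate"},
}
john_steps = steps_to_nearest(contacts, "John", {"Manager"})

# Hypothetical conversation log: (person, person, minutes).
talk_log = [("Ann", "Bob", 30.0), ("Ann", "Cy", 10.0), ("Bob", "Cy", 5.0)]
strength = weighted_degree(talk_log)

print(john_steps)
print(strength)
```

Note what both functions consume: who talked with whom, and for how long. Nothing in such data records what was said or how well the work was done.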

I think these findings (caveated by the design criticisms) are quite worthwhile. They suggest that companies are in fact delivering equitable work environments to women and men. They are ensuring that men and women have equal access to leadership, to mentorship, to work environments.

Especially interesting to me, as I have been highly critical of this explanation for many years, is the refutation of the "old boys club" hypothesis. Obviously those exist in some companies in some places at some times, though I would characterize them more as insider clubs, rather than as boys clubs.

But in a thirty-odd-year career nationally and internationally, with roles including CEO of global businesses and providing services to Fortune 500 companies, I have never been a member of an old boys club nor seen such a beast in the wild. Turban et al.'s study seems to confirm that the old boys club is indeed a myth.

Turban makes the claim:

Our analysis suggests that the difference in promotion rates between men and women in this company was due not to their behavior but to how they were treated.

But this is a conclusion not supported by the evidence. What they have shown is that men and women have equal networking opportunities, exposure and connections. They don't show that men and women exhibit the same behaviors, nor, given the design of the study, could they.

They have concluded that because men and women demonstrate the same social patterns within an enterprise, they must be demonstrating the same behaviors. That is not a conclusion that can be supported by the evidence. Confidence, flexibility, adaptability, resilience, etc. are all important behaviors. Are they equally demonstrated between men and women in enterprises? From this study, we don't know.

The paragraph from which the above quote is drawn is the pivotal paragraph demonstrating the researchers' departure from evidence and logic into confirmation bias. The full paragraph is:

Our analysis suggests that the difference in promotion rates between men and women in this company was due not to their behavior but to how they were treated. This indicates that arguments about changing women’s behavior — to “lean-in,” for example — might miss the bigger picture: Gender inequality is due to bias, not differences in behavior.

That is a huge crevice they are leaping. Work patterns are not the same as behaviors. More than that, they have already indicated that "in performance evaluations men and women received statistically identical scores," which would seem to indicate that, at least at some level, women are being treated the same as men.

Between design flaws, confusion about work patterns versus work behaviors, and contradicting evidence within the study, there is no support for the conclusion that gender inequality in promotion rates is due to bias. The conclusion is further undermined by the evidence from gender wage gap research which finds that such gaps exist due to women's decisions (parenthood, degree major, work duration/commitment, etc.).

The study is badly designed to answer the question they asked. They ignore the study results that counter their assumptions, they confuse patterns of work with patterns of behavior, and they fail to take into account research from an adjacent field (economics) which demonstrates that alternative explanations are operative.

From that perspective, the study is terrible. On the other hand, they do demonstrate that, at least for this company, HR has been effective in ensuring that men and women have equal opportunities for leadership exposure, network creation and mentorships. They also demonstrate that, at least in this company, there is no old boys club standing in the way of women.

Interestingly, at the pivotal paragraph above, the language becomes much weaker. They are firm in their conclusion but they are weak in the evidence for it. I have bolded the weasel words necessary to support their preconceived explanation which they are trying to push.

Bias, as we define it, occurs when two groups of people act identically but are treated differently. Our data implies that gender differences may lie not in how women act but in how people perceive their actions.

There is no evidence that this is happening; it is a conclusion they are drawing without support.

This passage is fascinating for its lack of self-awareness.

At this company, women tend to leave the workforce between the third and fourth level of seniority, after having been at the company for four to 10 years. This timing presents another possible hypothesis: Perhaps women decide to leave the workplace for other reasons, such as wanting to raise a family. Our data can’t determine whether this is true or not, but we don’t think this changes the argument for reducing bias.

In wage gap research, family commitments, and their effects on work volume and work flexibility, have a huge impact on wage outcomes. Decisions about when to start a family and how to structure familial roles and work commitments are highly significant in determining outcomes. There is no reason to expect that it would be any different in terms of promotion rates. The timing of women's departures suggests that family commitments are at play here as well. But the Turban team never engages with this Occam's Razor.

They ignore the ready explanation (family commitments) and acknowledge that their research can't answer the question as to whether family commitments might explain female attrition. They then pull a sleight of hand by changing the argument to whether it is worthwhile to reduce bias.

Which is odd since their own research in terms of how women are connected and perform seems to indicate that at this firm, there is no bias.

The rest of the article is intellectually downhill. Having manufactured an explanation of bias where there is no bias in evidence, the Turban team resorts to the failed policies of the past. They want to see affirmative action in gender promotions to take the place of individual performance and choices.

This means trying bias-reduction programs, but also developing policies that explicitly level the playing field. One way to do so is to make promotions and hiring more equal.

The fastest way to create or reinforce biases is to sustain variances in performance through affirmative action.

At this point, the Turban team falls back on the hypothesis that wage gap research suggests is the rub: personal familial choices.

Another potential problem lies in workload. In this company, we measured higher workloads as individuals advanced to higher levels of seniority. This isn’t intrinsically gendered, but many social pressures push women around this age to simultaneously balance work, family, and a disproportionate amount of housework. Companies may consider how to modify expectations and better support working parents so that they don’t force women to make a “family or work” decision.

This seems to be the big reveal. Gender wage gap research indicates that women exit full-time and highly accommodating work commitments (i.e., long sustained hours on short notice) as they begin to have children. Apparently that is what is driving attrition at this company as well. Nothing in the Turban team's research suggests there is bias, and they are apparently willing to acknowledge that work-family conflicts and workload are what drive attrition and the gender imbalance at senior levels.

But that doesn't stand in the way of concluding that bias is the answer. Bias, apparently, was always going to be the answer from this study. The only bias discovered in this study is confirmation bias.

Thursday, October 26, 2017
