Gender differences and bias in open source: Pull request acceptance of women versus men

Josh Terrell¹, Andrew Kofink², Justin Middleton², Clarissa Rainear², Emerson Murphy-Hill², Chris Parnin², Jon Stallings³

July 26, 2016

A peer-reviewed version of this preprint also exists.


Author and article information

Abstract

Biases against women in the workplace have been documented in a variety of studies. This paper presents the largest study to date on gender bias, where we compare acceptance rates of contributions from men versus women in an open source software community. Surprisingly, our results show that women's contributions tend to be accepted more often than men's. However, women's acceptance rates are higher only when they are not identifiable as women. Our results suggest that although women on GitHub may be more competent overall, bias against them exists nonetheless.

Cite this as

Terrell J, Kofink A, Middleton J, Rainear C, Murphy-Hill E, Parnin C, Stallings J. 2016. Gender differences and bias in open source: Pull request acceptance of women versus men. PeerJ Preprints 4:e1733v2 https://doi.org/10.7287/peerj.preprints.1733v2

Note: This preprint is not peer-reviewed. You may wish to reference the subsequent peer-reviewed version of this article.

Author comment

This revision addresses community feedback, most substantially by:

(1) controlling covariates using propensity score matching,

(2) providing an interpretation of whether the differences are meaningful,

(3) including raw data used in figures as part of the appendix,

(4) characterizing the authors' own biases,

(5) adding a section examining "Are women focusing their efforts on fewer projects?",

(6) comparing GitHub developers that are on Google+ to those who are not,

(7) adding an analysis restricted to projects that are licensed as open source,

(8) adding statistical tests and corrections for false discovery, as appropriate,

(9) replacing the bar chart representation of pull request acceptance rate,

(10) characterizing missing data, and

(11) adding threats to validity from uncaptured covariates and developer aliases.

The paper has a slightly revised title (adding "differences and") and we have added an author, Jon Stallings, who has contributed substantially to the revision.

Additionally, we have substantially revised our data analysis pipeline to use R scripts that extract data from our database and produce LaTeX macros that define numerical results. We believe this improves the reliability of our analysis. In doing so, we found and fixed the following errors in the prior version:

* Y-axis in Figure 2 was previously truncated and means and medians in caption were incorrect,

* Rounding errors and transposition of "files changed" and "commits" in "Are women making smaller changes?",

* Incorrect summation of "without reference" pull requests, and consequently the accompanying percentages, in "Are women making pull requests that are more needed?", and

* One programming language difference (.m) was previously incorrectly reported as statistically significant.
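
The macro-generation step of the pipeline described above can be illustrated with a minimal sketch. The authors state that they used R scripts; the Python below is not their code, and the function name `to_latex_macros`, the macro names, and the numbers are all hypothetical placeholders rather than results from the study. The idea is that each computed value is written once as a LaTeX macro, so the manuscript never contains hand-copied numbers.

```python
import re


def to_latex_macros(results):
    """Render a mapping of macro names to values as LaTeX newcommand lines."""
    lines = []
    for name, value in sorted(results.items()):
        # LaTeX control words may contain letters only, so reject anything else.
        if not re.fullmatch(r"[A-Za-z]+", name):
            raise ValueError(f"invalid LaTeX macro name: {name!r}")
        lines.append(f"\\newcommand{{\\{name}}}{{{value}}}")
    return "\n".join(lines)


# Placeholder values for illustration only; these are NOT results from the study.
example = {"placeholderRate": 12.5, "placeholderCount": 100}
print(to_latex_macros(example))
```

Writing the output to a file that the manuscript loads with `\input` lets the text refer to a value by its macro name, which avoids the kind of transposition and rounding errors listed above.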


Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Josh Terrell conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.

Andrew Kofink conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.

Justin Middleton conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, reviewed drafts of the paper.

Clarissa Rainear analyzed the data, wrote the paper, reviewed drafts of the paper.

Emerson Murphy-Hill conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, performed the computation work, reviewed drafts of the paper.

Chris Parnin conceived and designed the experiments, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Jon Stallings conceived and designed the experiments, analyzed the data, wrote the paper, reviewed drafts of the paper.

Ethics

The following information was supplied relating to ethical approvals (i.e., approving body and any reference numbers):

NCSU IRB approved under #6708.

Data Deposition

The following information was supplied regarding data availability:

Data sets from GHTorrent and Google+ are publicly available.

Funding

This material is based in part upon work supported by the National Science Foundation under grant number 1252995. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Feedback on other revisions

10 comments on version 1 - published February 2016
