Insights into the PeerJ Preprint: “Gender bias in open source”

Mar 8, 2016 | Interviews, Preprints, Press

Last month, PeerJ Preprints published “Gender bias in open source: Pull request acceptance of women versus men”. That preprint was quickly picked up, and went on to receive a huge amount of feedback in venues as diverse as Twitter, Reddit, Slashdot, Facebook, news media, comments on the article itself, and so on (see a selection of the venues, as noted on the article). In fact, at the time of writing, the Altmetric service records this preprint as having the 5th highest ‘altmetric score’ of all preprints, over all time.

We thought it would be interesting to get more of the ‘backstory’ behind this preprint, and so we approached Professor Emerson Murphy-Hill, of North Carolina State University to get his responses to some questions.

PJ: Can you explain how the idea for this research came about?

Emerson writes: “Peter Rigby (Assistant Professor of Software Engineering at Concordia University) and I (Emerson Murphy-Hill, Associate Professor of Computer Science at NCSU) were chatting about gender bias at a Dagstuhl seminar, and we came up with the idea of measuring pull request acceptance as a function of gender. We were inspired by the recent paper “Gender and tenure diversity in GitHub teams” and by past literature that asks hiring managers to rate identical job applicants with male or female names. While Peter eventually discontinued his involvement, we drew in Chris Parnin (an Assistant Professor at NCSU), who was already doing gender studies in software engineering.”

PJ: And how did you go about conducting the research?

The bulk of the work was done by Josh Terrell while he was working as an undergraduate research assistant last summer. Since then, Andrew Kofink and Clarissa Rainear have also contributed substantially as undergraduate research assistants on this project. Justin Middleton and Denae Ford are two grad students who have volunteered their time with the project as well.

PJ: Some of your methodology is fascinating (for example identifying gender by matching users to Google+ profiles), can you talk a bit more about that?

The idea to link GitHub data to Google+ data was something we came up with part way through the project. It’s one of our favorite parts of the project in that it’s a tremendously enabling technique, but also one of the things that makes us the most uneasy. While we’re using the data for what we think is a good cause (enhancing scientific understanding of gender issues), it could conceivably be used for questionable purposes as well.
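(Ed. Note: for readers curious about the mechanics, the linking idea can be sketched roughly as below. This is an illustrative, hypothetical Python snippet, not the authors’ actual pipeline: the toy profile table and the lookup helper stand in for a real query against Google+ profiles, typically keyed by an email address a user has made public on GitHub; the Google+ API involved has since been retired.)

```python
from typing import Optional

# Toy stand-in for a profile lookup; in a real pipeline this would query a
# profile service using the email address a GitHub user exposes on commits.
_FAKE_PROFILES = {
    "dev1@example.com": {"gender": "female"},
    "dev2@example.com": {"gender": "male"},
}

def lookup_profile(email: str) -> Optional[dict]:
    """Hypothetical helper: return the public profile linked to this email,
    or None if no profile (or no public gender field) can be found."""
    return _FAKE_PROFILES.get(email)

def infer_gender(commit_email: str) -> Optional[str]:
    """Infer a user's gender from a linked public profile.
    Users with no link would simply be excluded from the analysis."""
    profile = lookup_profile(commit_email)
    if profile and "gender" in profile:
        return profile["gender"]
    return None

if __name__ == "__main__":
    for email in ("dev1@example.com", "unknown@example.com"):
        print(email, "->", infer_gender(email))
```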

PJ: You chose to preprint this article before finalizing a version for peer review. Can you explain why you did that?

There were several reasons, but a big one is that we wanted to collect feedback from the software developer community to improve the work. Often, once work is published, the authors have moved on and are not as open to outside criticism.

PJ: And how was your experience with PeerJ Preprints?

Good! It gets a lot of things right. It makes it quite clear, in big red letters on the PeerJ page and the document itself, that the work is not yet peer reviewed. It appears some in the media didn’t get that message, but I don’t know how to make that any more obvious.

One thing that didn’t seem to work as well was the “suggestions” section on PeerJ for the paper. Many of the suggestions were so long or so troll-ish that we couldn’t do anything directly actionable with them. In contrast, the “question and answer” section actually had more actionable suggestions. The difference may have something to do with how scientific discussions are traditionally done; at a scientific conference, when you see work that you’re dubious of, you don’t say “BLARG ITS WRONG YOU SHOULDA DONE IT THIS WAY!!!!”, you instead ask, “why did you decide to do it this way, rather than that way?” In other words, good suggestions usually start as good questions.

As a design suggestion, you might do away with the suggestions section and instead allow authors to award “good suggestion” badges to helpful questions. (Ed. Note: although not as nuanced as Stack Overflow’s ‘badges’, our system does allow authors to ‘Accept’ answers, as well as allowing any “subject experts” (e.g. published authors) to vote them “Up” or “Down”).

PJ: The preprint got a lot of attention, almost immediately – what is your take on why / how that happened?

While we expected raucous discussion on Slashdot and Hacker News, we were actually surprised by the media attention. Before release, Emerson did talk to NCSU’s public relations specialist, Matt Shipman, and his prediction was that the media wouldn’t cover it because it wasn’t peer reviewed. In fact, while we usually do press releases for work of ours that we think is of broad interest, we didn’t in this case because it wasn’t peer reviewed. We did promote the paper through Facebook and Twitter posts, but I don’t think that explains how it got on the BBC or the Guardian.

PJ: Over the span of multiple news stories, discussion forums, blog posts, tweets and comments on the article site, you have probably received thousands of comments / items of feedback. How can you possibly digest and integrate this amount of feedback?

It’s not easy, especially since comments are spread across a variety of platforms. Sites like Reddit and Slashdot have made it a bit easier since we can focus on upvoted content.

What we have done is this: each member of our research group took on part of the task of looking at comments from different sources. We then discussed the comments, distilled down the ones we thought had merit, and made “issues” out of each one for us to address in our next draft.
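(Ed. Note: the triage workflow described here — pooling comments from several platforms, keeping the most upvoted ones, and turning the survivors into draft issues — can be sketched as below. The snippet is illustrative only; the sources, score threshold, and sample comments are hypothetical and do not represent the authors’ actual tooling.)

```python
from dataclasses import dataclass

@dataclass
class Comment:
    source: str   # e.g. "reddit", "slashdot", "peerj-qa"
    score: int    # upvotes, where the platform provides them
    text: str

def triage(comments: list[Comment], min_score: int = 10) -> list[str]:
    """Keep highly upvoted comments and format each as a draft issue title."""
    kept = sorted(
        (c for c in comments if c.score >= min_score),
        key=lambda c: c.score,
        reverse=True,
    )
    return [f"[{c.source}] {c.text[:80]}" for c in kept]

if __name__ == "__main__":
    # Purely hypothetical sample feedback.
    sample = [
        Comment("reddit", 142, "Consider controlling for project size"),
        Comment("slashdot", 3, "This whole study is wrong!!!"),
        Comment("peerj-qa", 25, "Why were some pull requests excluded?"),
    ]
    for issue in triage(sample):
        print(issue)
```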

PJ: What has been your take on the feedback? How much of it has been useful for you?

There’s a lot of noise, but from it, we got some really good suggestions. For example, in one exchange, we got some great advice on how to improve our figures. This kind of back-and-forth was probably not possible in traditional peer review.

PJ: This article touches on sensitive issues of gender and bias in the workplace. Did you consider adding authors from other ‘spheres’ (e.g. sociology) to help analyze the data or design your experiment?

We’ve had some consultation with outside experts (for example, we ran the paper by a sociologist before posting it), but not to the level of authorship. Actually, I think the attention will help us convince others that the work is important and worth collaborating on. At the moment, we’re working with a statistician to help tease apart confounding factors.

PJ: Some commentators have observed that the media have uncritically taken snippets of your research and propagated a ‘headline’ which a) may not be accurate or b) may yet change with future revisions and peer review. They argue that as a result, a certain ‘conclusion’ is now in the public mindset and is unlikely to be ‘overwritten’ by any subsequent iterations. What are your thoughts on this?

That’s a danger. On the issue of reporting accuracy, there’s an interesting conundrum here. We wrote the paper in such a way as to make it as accessible to the public as we could, including pushing a lot of the methodological details to the appendix. We’ve taken some flak for that, insofar as the media didn’t take into account the limitations of the methodology, perhaps because they didn’t read the appendix.

PJ: And what are the most important changes you might be making to the paper in light of the feedback?

There are two big ones. The first is accounting for all confounding factors simultaneously; we’re looking at using regression modeling and propensity score matching. The second — and this might be subsumed by the first — is providing some follow up analyses that can explain some unexpected differences in pull acceptance rates of men and women for gendered and ungendered profiles.
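(Ed. Note: for readers unfamiliar with these techniques, the sketch below shows, on synthetic data, what “accounting for confounding factors simultaneously” can look like in practice: a logistic regression that includes the group indicator alongside the confounders, plus a simple nearest-neighbour propensity-score match. The covariates, data, and model choices here are illustrative assumptions, not the authors’ analysis.)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Hypothetical per-pull-request covariates (potential confounders).
prior_prs = rng.poisson(20, n)           # contributor's prior pull requests
project_age = rng.uniform(0, 5, n)       # project age in years
is_woman = rng.integers(0, 2, n)         # group indicator (1 = woman)
confounders = np.column_stack([prior_prs, project_age])

# Synthetic outcome: whether the pull request was merged (illustrative only).
merged = (rng.random(n) < 0.7).astype(int)

# 1) Regression adjustment: model the outcome on the group indicator and the
#    confounders jointly, so the group coefficient is adjusted for them.
X = np.column_stack([is_woman, confounders])
outcome_model = LogisticRegression(max_iter=1000).fit(X, merged)
print("adjusted log-odds coefficient for the group indicator:",
      round(outcome_model.coef_[0][0], 3))

# 2) Propensity scores: probability of group membership given the confounders.
ps_model = LogisticRegression(max_iter=1000).fit(confounders, is_woman)
propensity = ps_model.predict_proba(confounders)[:, 1]

# Simple 1:1 nearest-neighbour matching on the propensity score (with
# replacement), then compare acceptance rates between matched groups.
treated = np.where(is_woman == 1)[0]
control = np.where(is_woman == 0)[0]
distances = np.abs(propensity[treated][:, None] - propensity[control][None, :])
matched_controls = control[distances.argmin(axis=1)]
effect = merged[treated].mean() - merged[matched_controls].mean()
print("matched difference in acceptance rate:", round(effect, 3))
```

A real analysis would of course use the study’s actual covariates and check covariate balance after matching; the snippet only illustrates the general shape of the two approaches mentioned.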

PJ: Overall, how has your experience been when preprinting this article? Would you do it again?

It’s been quite a ride so far, but ask us again in six months. It will be interesting to see how the peer review process goes, considering the importance of the topic and the media coverage.

PJ: And do you intend to issue any updated revisions of this preprint?

Yep, you bet!

PJ: When will the world get to see the final paper?

Hopefully this summer!

PJ: We thank Emerson and his co-authors for their insights into this preprint, and we encourage our readers to check out other preprints at our site and to consider submitting preprints of their own!
