This is a very interesting study, and I really like the idea of using GitHub data. I have a question about the effect of gendered vs. gender-neutral avatars. One of your conclusions is that female outsiders experience gender bias. However, this conclusion rests on the assumption that female coders with gendered avatars are as competent as female coders with gender-neutral avatars. Did you test this assumption in any way? This is important because outsiders with gendered avatars might simply be less competent, in which case the lower acceptance rates would merely reflect that objective difference.
I think your data give us reason to believe that this alternative explanation is correct. You write (page 16): “Women have lower acceptance rates as outsiders when they are identifiable as women.” However, Fig. 5 shows that the same is true for men: outsider men with gendered avatars had lower acceptance rates than outsider men with gender-neutral avatars. So gendered avatars reduced acceptance regardless of sex. You could argue that the avatar effect was larger for women than for men and that this gap reflects gender bias, but given your confidence intervals I doubt that the difference is significant.
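To make the significance question concrete: the claim of gender bias hinges not on the avatar effect within women, but on the *difference* between the avatar effects for women and for men (an interaction, or difference-in-differences). Below is a minimal sketch of a simple Wald-style z-test for that interaction, assuming four independent binomial samples. All counts in the example are hypothetical placeholders for illustration, not numbers from the paper:

```python
from math import sqrt
from statistics import NormalDist

def rate_and_se(accepted, total):
    """Acceptance rate and its binomial standard error for one group."""
    p = accepted / total
    return p, sqrt(p * (1 - p) / total)

def interaction_z(women_gendered, women_neutral, men_gendered, men_neutral):
    """z-test for whether the avatar effect (neutral minus gendered)
    differs between women and men (difference-in-differences).
    Each argument is a hypothetical (accepted, total) pair."""
    stats = [rate_and_se(a, n) for a, n in
             (women_gendered, women_neutral, men_gendered, men_neutral)]
    (pwg, sewg), (pwn, sewn), (pmg, semg), (pmn, semn) = stats
    diff = (pwn - pwg) - (pmn - pmg)   # extra avatar penalty for women
    se = sqrt(sewg**2 + sewn**2 + semg**2 + semn**2)
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_value

# Hypothetical counts (accepted, total) -- NOT the paper's data.
diff, z, p = interaction_z((580, 1000), (640, 1000),
                           (600, 1000), (630, 1000))
print(f"extra avatar penalty for women: {diff:.3f}, z = {z:.2f}, p = {p:.3f}")
```

In this made-up example, women lose 6 percentage points from a gendered avatar and men lose 3, yet the 3-point interaction is far from significant at these sample sizes, which is exactly the kind of check I would want to see for Fig. 5.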
I think there is also another problem with your logic. Among insiders, the avatar had no effect for women but did have an effect for men. Applying your logic, we would have to conclude that there is gender bias against insider men. And this evidence is actually stronger than the evidence you provide for anti-female bias, because insider women showed no avatar effect at all; only men did.
I would be interested to hear your thoughts on that.