All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you very much for your revisions and responses. I am delighted to accept your manuscript.
However, please note that "peerj-98155-Table_2_revised.docx" still includes percentages for "prefer not to answer" responses (1% for both discipline and experience, and 12% for gender). Given your response and the revised legend for this table, I suspect that the revised table wasn't the one uploaded (despite the name). This should be simple to correct.
**PeerJ Staff Note:** Although the Academic Editor is happy to accept your article as being scientifically sound, a final check of the manuscript shows that it would benefit from further editing. Therefore, please identify necessary edits and address these while in proof stage.
Thank you for your revised manuscript and rebuttal. As you can see, our remaining reviewer is now happy with the manuscript, as am I in terms of the research. However, I have found a number of generally minor points (more than could be left for copy editing) that need to be addressed before I accept the manuscript. I anticipate being able to do so once these are resolved. Where these are points of style, please feel free to accept or reject those suggestions as you prefer.
Line 27: In the abstract, it would be useful to make it absolutely clear to the reader that this is a 95% CI, e.g., “the majority (90%, 95% CI [87%, 92%])” for the first use, as you do on Line 258 in the body of the manuscript. I wouldn’t repeat this additional information on Line 36.
Line 92: A spurious “for” in “and commonly rely FOR on journal-based proxies to do this.”
Lines 105, 108, 111, 180, 181, 185, 186, 188, 190, 191: This is presumably a missing font on my computer, but the bullets were missing when I looked at the PDF. The same applied to the possessive apostrophe on Line 317. If these are all clear in your copy of the PDF, this might be a point to leave for copy editing.
Lines 161–162: The three semi-colons separating list items could be replaced with commas as none of the list items themselves included embedded commas. I would suggest an “and” before the final item as you do on Line 257, for example.
Line 183: The comma after the date seems unnecessary.
Line 196: The “stratified” part of this seems unnecessary. A random sample would, on average, be representative, without any stratification, although stratification would improve this aspect.
Line 197: A spurious “that” in “and THAT conclusions from findings”.
Line 212: There are some spurious spaces in “the response” (these are separate spaces and not a single long character in my PDF).
Line 235: You could note in the Discussion or even here that many of these 177k potential recipients will likely not have met the inclusion criteria, particularly having served on a relevant committee in the past two years, as well as including undelivered emails, emails caught in spam filters, emails sent to unmonitored email addresses, etc. Your actual response rate from eligible recipients will likely be well above 0.21%, which is more of a lower bound.
Line 242: There is a spurious comma in “credibility,” here.
Line 251: There are spurious spaces in “13.9 %” (again, these appear to be multiple characters in my PDF).
Lines 264–268: I wonder if more specific qualifiers might be useful around here. For example, 56% and 51% could perhaps be described as “slightly more than half”. This is just a suggestion and one that you could apply more widely, even in a results section, if you wanted to.
Line 273: The “of” in “to the majority (over 50% of) participants.” should come after the closing parenthesis.
Line 279: I think the comma in “We also asked about credibility-related goals, to understand” is unnecessary.
Line 313: You normally include Oxford commas, but not in “such as shared data, code and peer reviews.” Also Lines 345, 387, 413.
Line 314: Spurious spaces in “extrinsic to the research output including” (multiple characters are evident in my PDF).
Line 318: There is a large gap for me within “extrinsic proxies”, which seems to be a single character.
Lines 320–321: The possessive apostrophe after “The majority of participant” is missing and has led to the possessive ‘s’ appearing on the following line.
Line 326: I wonder if the “do” at the end of this line could be removed to make the text consistent with the following percentage. The same applies on Line 460 (cf. Line 461).
Line 359: I wonder about the phrase “research assessment committees” as this could be read to also include ethics committees, for example. Also Lines 364, 376, 431–432, and 554 (and possibly elsewhere). The description on, for example, Line 23 seems clearer/more specific, as does the parenthetical clarification on Lines 530–531. In other instances, e.g. Line 554, the broader term seems highly appropriate to me.
Line 434: Spurious space in the references in “[6,24 ,25,28]”.
Lines 453–454: Single quotes were used here, but double quotes have generally been used elsewhere; please check consistency. Also Lines 535–536.
Line 467: Could start line with “but” or “however”?
Line 572: I wonder if the text under “Supplementary Tables and Figures” could be made easier to read. The same might also apply to the shorter “Data availability” section just above.
Table 2: I suggest taking “prefer not to answer” responses out of the denominator (so that they don’t share in the percentage) as these are not really interpretable in my view. Given the numbers, this won’t meaningfully change the actual responses in terms of percentages except for gender, where the man, woman, and other/non-binary responses would be easier to interpret for me as 69%, 31%, and <1% respectively, with no percentage next to PNA.
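A minimal R sketch of the recalculation I have in mind is below; the counts are hypothetical and chosen only to roughly reproduce the pattern described above, not your actual data.

```r
# Hypothetical gender counts (illustrative only, not the authors' data)
counts <- c(Man = 170, Woman = 76, `Other/non-binary` = 1, `Prefer not to answer` = 31)

# Percentages with "prefer not to answer" kept in the denominator
round(100 * counts / sum(counts), 1)

# Percentages with "prefer not to answer" excluded from the denominator
answered <- counts[names(counts) != "Prefer not to answer"]
round(100 * answered / sum(answered), 1)
```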
Table 3: The last column’s heading should just be “P” or “P-value”.
I applaud the authors' efforts to edit the manuscript and have no further comment.
No comment.
No comment.
No comment.
We have received comments from our two reviewers. The second reviewer is satisfied with your revisions, but the first reviewer has raised some important points that warrant consideration and response.
The 50% point is worth attention as, while “We used the definition of “most researchers” being >50% simply to signify a majority” is tautologically true, you could also likely justify other values here. Is there a meaningful difference between 51% (‘most’), 75% (‘a clear majority’), and 99% (‘almost all’), say? (I would argue strongly in favour of ‘yes’ here.) If you just mean ‘most’ as a majority, it doesn’t seem to warrant definition. If I had a concern about the way 50% is used here, it would be that sampling variability will regularly produce sample values above this threshold when the population percentage is close to, but a little below, this level (and you appear to be interested in making inferences about the population rather than only describing your sample percentages in your hypotheses on Lines 102–110). To address this, you could consider adding 95% CIs to Table 4 and Table 5 (note that Table 5 is miscaptioned as Table 4) and the supplementary tables, adding 95% CI bars to Figures 1–4, and reporting these CIs in the text where you refer to these percentages. You could still define ‘most’ (or another label) in terms of the lower CI limit, which would likely leave some outcomes with percentages close to the threshold uncertain in terms of interpretation (e.g., a sample percentage of 51% doesn't provide convincing evidence that the population percentage is >50%). You already report CIs for the correlation coefficients in Table 3. Note that I’m not convinced that readers will find the degrees of freedom in Table 3 useful, and I think you could improve the layout of Table 3 by putting the CIs into a single column.
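To make the sampling-variability point concrete, here is a minimal R sketch using an illustrative (hypothetical) sample size rather than your data; the CIs in the manuscript would of course be computed from the actual counts.

```r
# Illustrative only: a sample percentage of 51% from a hypothetical n = 300
n <- 300
x <- round(0.51 * n)              # 153 respondents in the category of interest

ci <- binom.test(x, n)$conf.int   # exact 95% CI, roughly 0.45 to 0.57
ci

# Defining 'most' via the lower CI limit rather than the point estimate
ci[1] > 0.5                       # FALSE: >50% in the population is not established
```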
For the PCA, while I appreciate Supplementary Figure 1, I wonder if a table of loadings for the components (instead of, or as well as, the figure) would be more familiar, and so more readable, for some readers?
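A minimal sketch of the kind of loadings table I have in mind, assuming (hypothetically) that the importance ratings sit in a numeric data frame called `ratings` with one column per rated item:

```r
# Principal components of the (hypothetical) ratings data frame
pca <- prcomp(ratings, scale. = TRUE)

# Proportion of variance explained by each component
summary(pca)

# Loadings (the rotation matrix), rounded for readability; this could be
# reported as a supplementary table alongside or instead of the figure
round(pca$rotation, 2)
```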
I agree with Reviewer One that you could look at further analyses using your interesting data, but at the same time, your manuscript is already not short and making full use of your data would likely produce an overly long manuscript. Please consider Reviewer One’s comments alongside mine above to see if you can make a little more use of these data without making the manuscript 'too long' in your view.
As one more comment, on Line 203 you mention ‘likert’. This makes an additional, generally untested, assumption of equal spacing on top of ordinality. In my view, ‘ordinal’ avoids having to justify this assumption.
While I appreciate the authors’ efforts to revise the paper, I am not sure the efforts are enough to address the comments raised by both reviewers. I hope these comments will help the authors consider the next steps for the draft and the project.
First, about using 50% as a threshold: I don’t believe these are meaningful assumptions at all. Normally, we want to build assumptions on empirical findings from previous research. In this case, why do the authors believe that more than 50% of researchers should be evaluating credibility at all? Where does this belief come from? Why not use a different threshold? A bigger question is that the majority of results in this manuscript are very descriptive, and the authors didn’t do any testing of these assumptions later, so there is not much point in having these assumptions at all. This information is therefore confusing on multiple levels.
Second, for the results, the authors argued that PCA supported their findings, which is presented in a new Figure 1. But Figure 1 is still a bar chart that has nothing to do with the PCA results at all. And honestly, I find the authors’ interpretation of the PCA results to be very insufficient. While it’s fine to identify which factor is the most important in explaining the outcome or variance, I think the more interesting part of using PCA is to understand how the factors (presented in various figures) are connected or disconnected from each other. This is important simply because the results presented in the manuscript seem too simple and one-dimensional. I was hoping that, by using PCA or other similar methods, the authors could add more depth to the results, but the decision made by the authors is still not sufficient in my personal opinion.
Again, I have no issue with the design of the survey, but looking at this new draft, I have the feeling that the analysis of the survey data is probably too simple and not meaningful enough for a journal publication.
Findings are valid but too simple.
No comment.
No comment.
No comment.
I would like to thank the authors for the careful revision of their article. I recommend accepting the article for publication in PeerJ.
As you can see, two reviewers have commented on your interesting work. Reviewer #1 has made several useful suggestions that I think you should incorporate or rebut. All of their comments are important. I wondered why you hadn’t gone further than bivariate correlations for the importance ratings when factor analysis (or PCA) could have been used. CFA would help if you wish to make statements like “confirms that most of our participants see these as separate concepts, i.e. credibility is not conferred by perceived impact.” (Lines 232–233). I'm not sure that I would interpret r=0.15 in quite that way. You should only perform statistical analyses where these assist in answering your research questions, but there did seem to be many additional opportunities for making the most of this data set, which you may well have planned for a future manuscript? Reviewer #2 has raised some interesting questions about impact factors and peer review as proxies for quality. These are reasonable questions for any reader to ask, so I suggest either making changes here or pre-empting those questions. I strongly agree with their advice to present CIs alongside point estimates for all tests/models (I would suggest CIs alongside p-values rather than instead of them).
For your data analysis methods (starting Line 207), please ensure that all statistical methods used are described here. Could you also explain here how you chose Pearson’s correlations (what model checks were performed)? When saying you used a Bonferroni correction, please make sure that the reader will understand the denominator (so that they could themselves easily and validly calculate the adjusted level of significance in any case). Was this done at the table level, for example? Did this also apply to the correlations (I don’t think it did)?
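For example, a minimal R sketch (with illustrative p-values, not yours) of the two equivalent ways the Bonferroni denominator can be made explicit:

```r
# Illustrative raw p-values from, say, four comparisons within one table
p_raw <- c(0.004, 0.012, 0.030, 0.200)

# Option 1: adjust the p-values themselves (denominator = 4)
p.adjust(p_raw, method = "bonferroni")

# Option 2: keep the raw p-values and lower the significance level instead
alpha_adj <- 0.05 / length(p_raw)
p_raw < alpha_adj
```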
I look forward to seeing, in due course, a manuscript with your changes marked up along with a point-by-point response to each of the reviewers’ and my points, indicating either how the manuscript has been changed or why no changes were made.
[# PeerJ Staff Note: It is PeerJ policy that additional references suggested during the peer-review process should *only* be included if the authors are in agreement that they are relevant and useful #]
General Impression:
This manuscript presents intriguing and significant research on how biological researchers evaluate research outputs in grant review and hiring committees. While I find the study valuable, there are several areas where improvement is needed for the next review stage.
1. Overly Detailed Background Information:
The manuscript extensively references previous research, particularly in the hypotheses and confirmation of prior findings. Specifically, the section between lines 93-104 includes details that seem tangential to the study’s core objectives. Streamlining this content, especially in the Methods section, would enhance clarity and relevance. Additionally, the rationale behind using a 50% threshold in the hypotheses appears arbitrary and lacks justification.
2. Research Design Limitations:
The research design primarily focuses on descriptive analysis based on a single survey question, which might limit the depth of the findings. The manuscript could benefit from exploring interrelations between different survey questions to address more complex research questions and enhance the robustness of the conclusions.
3. Analysis of Credibility Definition:
The initial results section on the definition of credibility could be strengthened by deeper data analysis. Techniques such as Principal Component Analysis (PCA) could elucidate similarities among various credibility criteria. Moreover, the results presentation needs improvement; currently, it only displays two ambiguous numbers. More comprehensive data presentation could significantly enhance the interpretative value of the results.
4. Redundancy and Citation in Result Reporting:
The repetitive mention of the statistical analysis tool ("All statistical analysis was conducted in R using the expss package") in every subsection is unnecessary. Since this information is introduced earlier in the manuscript, subsequent mentions could be omitted to avoid redundancy. Additionally, to adhere to academic standards, please ensure that both R and any packages used are properly cited using the citation() function within R. This will enhance the visibility and credibility of the software used in your research.
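For reference, both citations can be retrieved directly in R:

```r
citation()          # how to cite R itself
citation("expss")   # how to cite the expss package (once installed)
```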
While the design of the questionnaire makes sense, there are some issues in experimental design that are discussed in my general comment.
Most of the findings are valid, even though the paper would benefit from a deeper analysis of the results.
See my online review report: https://ludowaltman.pubpub.org/pub/review-credibility-assessment/release/1.
See my online review report: https://ludowaltman.pubpub.org/pub/review-credibility-assessment/release/1.
See my online review report: https://ludowaltman.pubpub.org/pub/review-credibility-assessment/release/1.
This paper presents the results of a survey studying how biology researchers assess credibility when they serve on grant and hiring committees. The paper is well written and the research is well done. I enjoyed reading the paper. I have a few minor comments.
Table 1 distinguishes between appropriate and inappropriate proxies for assessing the credibility of research. I am not convinced by the way the authors make this distinction. I think there is a need for more nuance. (Unfortunately, the research assessment reform movement sometimes fails to be sufficiently nuanced in its criticism on traditional assessment practices.)
While overreliance on journal reputation causes lots of problems, this doesn’t mean journal reputation is always an inappropriate proxy for assessing credibility. If a journal is known to systematically perform rigorous peer review, the use of this information to assess the credibility of an article in the journal makes sense. I don’t consider this to be bad practice. Likewise, while overreliance on journal impact factors is a big problem in some assessment systems, it is not clear whether the use of journal impact factors should always be rejected (e.g., see the argument I presented in https://doi.org/10.12688/f1000research.23418.2).
Conversely, the authors consider ‘Confirm output is peer reviewed’ to be an appropriate signal, but I would argue this signal is actually less informative than the reputation of a journal, because reputation takes into account not only whether a journal performs peer review, but also the level of rigor of the peer review process. So if reputation is considered to be an inappropriate signal, then ‘Confirm output is peer reviewed’ should also be considered inappropriate.
In the data analysis section, the authors explain that “comparisons of segment response proportions were conducted using the expss package in R using a z-test with Bonferroni correction, with a significance level set at p<0.05”. It is not clear to me which comparisons the authors are referring to. I cannot find these comparisons in the results presented in the paper. In general, my advice would be to use confidence intervals instead of significance tests.
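As an illustration of what I mean (with made-up counts, not the authors' data), a comparison of two segment proportions in R can be reported with a confidence interval for the difference rather than only a p-value:

```r
# Hypothetical counts: 60 of 120 respondents in segment A, 45 of 110 in segment B
prop.test(x = c(60, 45), n = c(120, 110))
# The output includes a 95% CI for the difference in proportions,
# which is more informative than the significance test alone
```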
In the section ‘How researchers define credibility’, I would be interested to see a table showing all correlation coefficients. Also, I find the p-values reported by the authors unhelpful, since they test the hypothesis that the importance ratings are uncorrelated, and this hypothesis is rather unrealistic. If the authors wish to present inferential statistics, I would recommend reporting confidence intervals.
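For instance (a minimal sketch with hypothetical variable names `x` and `y` standing in for two importance-rating items), the CI is readily available from cor.test:

```r
ct <- cor.test(x, y, method = "pearson")
ct$estimate    # Pearson's r
ct$conf.int    # 95% CI for r (more informative than the p-value against r = 0)
```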
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.