All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Thank you very much for making the required changes.
[# PeerJ Staff Note - this decision was reviewed and approved by Elena Papaleo, a PeerJ Section Editor covering this Section #]
Thank you very much for making the requested revisions. There are only a few items left to address. Your manuscript needs one more round of careful proofreading.
[# PeerJ Staff Note: If you want, PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title) #]
The authors have adequately addressed all my comments.
No comment
No comment
The authors have adequately addressed all my comments.
This is the only thing left to fix, I believe.
While the authors corrected the previously flagged grammar, new grammatical errors have crept into the new sections:
Line 303: "Care should though be exerted since the intersection..."
Line 318: "Also as for top8000 we find virtually all the links of positive writhe still as top-hits of the unrestricted search"
The phrases "steal the picture" and "steals the picture" are used. The more common phrase is "steals the show", but it is generally not used in scientific publications.
The lack of recent datasets was my main issue, and this has now been resolved.
I think the paper is much better now with these more comprehensive results, especially Figure 4!
I believe your paper is even better now than it was before. I am especially fond of Figure 4!
There are just some minor corrections to the grammar that are required, but other than that I believe it is ready to publish!
The reviewers found the paper to be interesting; however, several major issues were identified. First of all, the literature review was found to be insufficient. Second, the databases that were used are decade(s) old. To make the paper more interesting to the reader, it is recommended to add more examples and more explanations aimed at a less technical audience. The details can be moved to a supplement.
Please carefully read the helpful comments from the 3 reviewers and provide a point-by-point response to each raised concern.
[# PeerJ Staff Note: It is PeerJ policy that additional references suggested during the peer-review process should only be included if the authors are in agreement that they are relevant and useful #]
Some important references are missing (see general comments for the authors).
no comment
Please see general comments for the author
The authors use Gauss integrals to characterize the extent of pairwise entanglement between various subchains in protein chains and to find pairs of subchains with extreme levels of entanglement. The analysis is interesting and the presented cases of extreme pairwise entanglements are interesting and educative.
However, several previous studies of proteins using similar methods are neither discussed nor even cited. The authors do not mention that a very similar approach to measuring pairwise entanglement of protein subchains was used in recent studies by other authors. The papers that should be cited and discussed in this respect include: 1. Exploring the correlation between the folding rates of proteins and the entanglement of their native states. J. Phys. A: Math. Theor. 50, 504001 (2017). 2. Linking in domain-swapped protein dimers. Sci. Rep. 6, 33872 (2016). 3. Sequence and structural patterns detected in entangled proteins reveal the importance of co-translational folding. Sci. Rep. 9, 8426 (2019).
The authors mention several times that their method permits local topological characterization of all subchains in protein structures. In that context, it would be good to mention the method of topological analysis of all subchains in proteins using the concept of knotoids (Sci. Rep. 7, 6309 (2017)). Since knotoids are open knots, that method does not require closure of the analysed subchains, just as is the case for the calculation of Gauss integrals used by the authors.
Throughout the MS, the authors discuss and show entanglements with extreme values of positive or negative mutual writhe. However, I did not see an explanation of how the sign of crossings, i.e. the orientation of the analysed polygonal chains, is determined. This explanation is needed especially for Figures 2 and 3, where the N and C termini of the shown subchains are not indicated.
In the results and discussion it is mentioned that pairs of chains with the highest positive mutual writhe are found within protein knots. It would be interesting for readers to know whether these regions correspond to borders of knots or to internal crossings. The authors also mention that pairs of chains with the highest negative writhe (and with the highest absolute writhe) are not from knotted proteins but from proteins forming double poke structures. Please explain this asymmetry. In principle, double poke structures could have both signs. I presume that this asymmetry is caused by the handedness of alpha helices, but this is only my guess. I would appreciate a more thorough explanation.
In many places the authors write background in two words “back ground”. Is this intentional? If yes, it needs to be explained.
This paper has a rather short list of references for a topic that has a decent amount of publications.
I believe that in the background section alone, there are some papers that should be cited:
https://www.ncbi.nlm.nih.gov/pubmed/17967433
https://www.ncbi.nlm.nih.gov/pubmed/17764691
as they add to the important implications that these topologies have on protein stability and folding.
I believe there should be a comma on line 55:
"global fold descriptors, an efficient method for computing them locally has"
My only concern with the experimental design is that most of the results are based off a dataset from 2011:
http://kinemage.biochem.duke.edu/databases/top8000.php
I would recommend using a data set like the one from the PISCES server, which is updated weekly: http://dunbrack.fccc.edu/PISCES.php
The top100 data set used is even older, from 1999: http://kinemage.biochem.duke.edu/databases/top100.php
I would also link to these above pages directly in the references, as they were not trivial to find based on the citation in the text of: http://kinemage.biochem.duke.edu
My only comment would be to generate findings for these methods using data sets that are more recent than the 8- and 20-year-old kinemage data sets.
I think these are very exciting results that should interest many people in the field.
I was going to suggest stereoscopic view for your figures, because it's very difficult for outsiders to view these tangled topologies, but providing those Python or Pymol scripts would suffice (as long as they are available for each figure).
I found Fig. S9 very hard to read due to the tiny font on the figure (such as trying to see the poke length for each). Also, I think the Fig. S9 caption should start on the left (instead of on the right side).
I realize this work is more of a proof of concept, and therefore it doesn't matter if you are running it on 50-year-old PDB files or ones from the latest build of the PDB, because you just want to show that your method works and the interesting results that come from it.
However, before Taylor's 2000 Nature paper (https://www.nature.com/articles/35022623), deeply-knotted proteins simply did not exist in our world... so just for this reason alone, because you DO want to show that your method works, I believe it is worthwhile to at least provide a small run on a more recent data set. Who knows... maybe some even more interesting results will come of it!
1. This paper describes important work and a valuable new tool that computes Gauss Integrals of all subchains of a protein configuration. However, the exposition does not do it justice, making it difficult (at least for this reader) to interpret the results presented in the paper.
2. English was clear, but would benefit from some copyediting. For example, the sentence in lines 18–2 on page 1 (in the abstract) is a long, run-on sentence. Nonetheless, I can still understand the main idea of it. Other similar examples occur throughout the text.
3. Literature references are far from sufficient. The paper would benefit from the addition of many more references throughout. For example, besides the last author and their co-authors, other groups have used the Gauss integral to study proteins. In particular, see Panagiotou’s recent preprint https://arxiv.org/pdf/1812.08721.pdf and the references [19-22] and [3-4], [25] therein.
A second example is the reference to the KnotProt server [5], which only includes its web address. However, on the KnotProt website’s “how to cite” page, the instructions are as follows:
"When you publish results using the database, please cite the following papers:
• Dabrowski-Tumanski P, Rubach P, Goundaroulis D, Dorier J, Sułkowski P, Millett KC, Rawdon EJ, Stasiak A, Sulkowska JI, (2018) KnotProt 2.0: a database of proteins with knots and other entangled structures, Nucleic Acids Research, gky1140, https://doi.org/10.1093/nar/gky1140
• Jamroz M, Niemyska W, Rawdon EJ, Stasiak A, Millett KC, Sułkowski P, Sulkowska JI (2014) KnotProt: a database of proteins with knots and slipknots, Nucleic Acids Research, 43, D306-D314, DOI: 10.1093/nar/gku1059
• Sulkowska JI, Rawdon EJ, Millett KC, Onuchic JN and Stasiak A (2012) Conservation of complex knotting and slipknotting patterns in proteins, Proc. Natl. Acad. Sci. U.S.A. 109, E1715–E1723, DOI: 10.1073/pnas.1205918109"
None of the above papers appear in the list of references, and should be included.
4. Background provided was insufficient. Although the paper is generally well-written, it is difficult for a non-expert to understand, without reading 6 other papers first. Some particular examples include:
⁃ the acronym RMSD, used in line 67 of page 2, is mentioned without a reference. What is it short for?
⁃ a “poke” and “co-poke” are first mentioned in line 78, but are defined only in lines 123-124. The reference [2] is given, but a figure would have been helpful for readers unfamiliar with them. In addition, it would have been helpful to include a short discussion of how a poke differs from, for example, a slipknot, as given in [2].
⁃ Even though the Gauss integral and writhe are generally well known in the community, the paper would be improved by including a definition of them. I was surprised that a definition did not appear in either the main paper or the supplement. More concerning was the lack of a definition of how the Gauss integral of a subchain is determined. I found it in the reference Rogen and Bohr ([12] in the main paper and [6] in the supplemental), which proved to be very instructive. Although the authors don’t need to repeat the entire exposition from Rogen and Bohr, some minimum amount of background should be provided. It doesn't matter whether it's in the main paper or in the supplement, but it should appear somewhere.
⁃ The paper discusses and compares results with KnotProt, e.g. in pages 9-10, without ever explaining first what it is that KnotProt produces. What does KnotProt measure? How does this compare/contrast with that produced by GISA? Again, it doesn't matter whether it's in the main paper or in the supplement, but it should appear somewhere.
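To illustrate the kind of minimal background meant in the Gauss integral item above, here is a rough sketch of the standard Gauss double integral, (1/4π) ∬ (r_a − r_b)·(dr_a × dr_b) / |r_a − r_b|³, estimated for two polygonal chains. This is not the GISA implementation and not the subchain scheme of Rogen and Bohr. The function names (`circle`, `gauss_integral`), the midpoint-rule approximation, the sign convention, and the two test rings are all assumptions made for illustration only.

```python
import math

def circle(center, u, v, radius=1.0, n=200):
    """Hypothetical helper: closed polygon approximating a circle
    spanned by the orthonormal vectors u and v around center."""
    pts = []
    for k in range(n):
        t = 2.0 * math.pi * k / n
        pts.append(tuple(
            center[i] + radius * (math.cos(t) * u[i] + math.sin(t) * v[i])
            for i in range(3)
        ))
    return pts

def gauss_integral(chain_a, chain_b, closed=True):
    """Midpoint-rule estimate of the Gauss double integral between two
    polygonal chains (assumed not to touch each other).
    With closed=False the chains are treated as open (no closing segment)."""
    def segments(chain):
        last = len(chain) if closed else len(chain) - 1
        segs = []
        for i in range(last):
            p, q = chain[i], chain[(i + 1) % len(chain)]
            mid = tuple((p[k] + q[k]) / 2.0 for k in range(3))
            step = tuple(q[k] - p[k] for k in range(3))
            segs.append((mid, step))
        return segs

    total = 0.0
    segs_b = segments(chain_b)
    for mid_a, da in segments(chain_a):
        for mid_b, db in segs_b:
            r = tuple(mid_a[k] - mid_b[k] for k in range(3))
            # cross product da x db
            cx = da[1] * db[2] - da[2] * db[1]
            cy = da[2] * db[0] - da[0] * db[2]
            cz = da[0] * db[1] - da[1] * db[0]
            dist = math.sqrt(r[0] ** 2 + r[1] ** 2 + r[2] ** 2)
            total += (r[0] * cx + r[1] * cy + r[2] * cz) / dist ** 3
    return total / (4.0 * math.pi)

# Two unit circles threaded through each other (a Hopf link), plus a far-away copy.
ring_a = circle((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
ring_b = circle((1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
ring_far = circle((5.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))

lk_linked = gauss_integral(ring_a, ring_b)
lk_unlinked = gauss_integral(ring_a, ring_far)
# Reversing the orientation of one chain flips the sign of the integral.
lk_reversed = gauss_integral(ring_a, list(reversed(ring_b)))
```

For closed, linked rings the value is close to an integer (the linking number) in absolute value, and for well-separated rings it is close to zero. For open subchains, as in proteins, the value is in general not an integer. The sign flip under reversal of one chain is why an explicit statement of chain orientation matters when positive and negative values are reported.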
5. I found that the paper spent too much real estate talking about the programming, implementation, and algorithmic complexity of the computer program. Some examples include the long discussion of restricted vs unrestricted searches on pages 6–7, and much of the Discussion section pages 11-12. I felt that the information could have been conveyed in a more succinct fashion.
Instead, I would have preferred that the technical discussion be replaced with more explanation, accompanied by figures, about what it is that the computer program actually computes. The main issue is that when the authors explain the results from GISA, I found it very hard to understand what they are trying to say. What is the difference between a mutual writhe of 1.25 and 1.07? Why should I care? How could this information be used? Perhaps it is the fault of this reader, for not reading carefully enough, and the answer did appear in the paper. But at the same time, I feel that the authors could also have been more pedagogic in their writing. Moreover, the situation is much compounded when they compare with other descriptions of protein configurations (e.g. KnotProt fingerprints). Although I do understand KnotProt fingerprints well, I still found that I was unsure to what aspect of the fingerprints I should compare the GISA output. This could easily have been remedied with a short paragraph describing KnotProt fingerprints, and how they compare/contrast with GISA data.
Research question was well defined, relevant and meaningful. The information about the protein configurations should be useful to other researchers.
No comment. I'm afraid that I did not try to replicate their results, and did not download a copy of their program to use. I base my comments on what was reported in their paper.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.