Review History


Summary

  • The initial submission of this article was received on January 31st, 2025 and was peer-reviewed by 3 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on July 14th, 2025.
  • The first revision was submitted on September 25th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • A further revision was submitted on October 21st, 2025 and was reviewed by 1 reviewer and the Academic Editor.
  • The article was Accepted by the Academic Editor on November 5th, 2025.

Version 0.3 (accepted)

Academic Editor

Accept

Dear authors, we are pleased to confirm that you have addressed the reviewer's valuable feedback and improved your research.

Thank you for considering PeerJ Computer Science and submitting your work.

Kind regards
PCoelho

[# PeerJ Staff Note - this decision was reviewed and approved by Massimiliano Fasi, a PeerJ Section Editor covering this Section #]

Reviewer 3

Basic reporting

The article meets the PeerJ criteria and should be accepted as is.

Experimental design

The article meets the PeerJ criteria and should be accepted as is.

Validity of the findings

The article meets the PeerJ criteria and should be accepted as is.

Additional comments

The article meets the PeerJ criteria and should be accepted as is.

Version 0.2

Academic Editor

Minor Revisions

Dear author,

Thanks a lot for your efforts to improve the manuscript.

Nevertheless, some concerns remain that need to be addressed, mainly language inconsistencies. I recommend a careful check of the English grammar and syntax during proofreading.

As before, please respond critically to the remaining comments point by point when preparing the new version of the manuscript and the rebuttal letter.

Kind regards,
PCoelho

Reviewer 2

Basic reporting

The text was clear and professional throughout.

Line 306/307: The quotation marks are inconsistent
Line 313: "hierarchical navigable small-world" to "HNSW"
Line 315: "on the" to "in"

Experimental design

The experimental design was original and solved the problem of identifying useful CRISPR-avoidant phages for phage therapies.

Validity of the findings

The impact and novelty were well discussed throughout.

Additional comments

The manuscript has been massively improved compared to the original submission, credit to the author.

Reviewer 3

Basic reporting

I would like to thank the authors for addressing my comments and suggestions. Overall, they improved the manuscript a lot.

Experimental design

The experimental design is much clearer.

Validity of the findings

The article meets journal standards: all underlying data and code have been provided; conclusions are well stated, linked to the original research question, and limited to supporting results.

Additional comments

Congrats to the authors for having developed this tool and for having improved the quality of the manuscript.

Version 0.1 (original submission)

Academic Editor

Major Revisions

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

**Language Note:** The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff

Reviewer 1

Basic reporting

Satisfactory. It includes all definitions, terms, and details for understanding the mathematical approach used in the research process. The reference list is updated.

Experimental design

-

Validity of the findings

The validity of findings depends on a proof-of-concept study that involves rigorous testing and adherence to ethical guidelines for human subjects.

Additional comments

Phage therapy depends on the CRISPR-Cas systems, which recognize and cut the DNA of invading phages based on genetic information (protospacers) incorporated into the host genome. The report by Bogdan Kirillov presents a bioinformatics approach (the VDPhage algorithm) that allows fast selection of the phages that are hardest for the CRISPR system to suppress (i.e., resistant to it). This method is based on a vector database built from protospacers found in phage genomes using tools like CRISPRCasTyper, CRISPRloci, or CRISPRCasFinder. Several studies have explored the relationship between the CRISPR-Cas system and antibiotic resistance, which can be spread by bacteriophages and plasmids carrying antibiotic resistance genes (ARGs). The designed algorithm could be used clinically to search for a phage cocktail capable of bypassing CRISPR defense mechanisms, thus increasing the efficacy of phage therapy and potentially reducing antibiotic resistance. Such an approach could also be useful for studying the phage-bacteria-specific recognition mechanism and could facilitate tailored phage therapy from in vivo experimentation to clinical trials.

Reviewer 2

Basic reporting

The research article is well constructed with clear and concise English and a professional tone. Relevant results and hypotheses were logical and reported clearly. To improve the article’s clarity and context within the wider field of phage therapy research, I provide the following suggestions:

1. Lines 44-48: This section is slightly misrepresentative and lacking in context. It is not just bacterial resistance; there are other factors, including the individuality of bacterial infection. Additionally, there is a co-evolutionary relationship between bacteria and phage that does not exist with antibiotics. For example, bacteria could evolve resistance to phage by mutating a receptor, or to antibiotics with an efflux pump. However, phage can overcome the bacteria’s evolved phage resistance, as mentioned in this article, for instance with the use of anti-CRISPR proteins. Finally, the author should also mention here the wider context of bacterial immunity. The author addresses one form, CRISPR-Cas; however, many bacterial defence systems have been recently described, and algorithm development for selecting resistant phage therapies will also be required.

2. I think it would generally help readers from diverse backgrounds if the article provided, for each program used, a sentence about what it does and how it achieves its function. For example, CRISPRCasFinder queries a bacterial genome searching for known PAM sites (i.e., trinucleotide sequences) and then identifies protospacer sequences.

It is also implicit in the text with reference to the Figure when a program is being used, but it would improve clarity if it were explicitly stated. For instance, Lines 115-117, I presume from Figure 1 that CRISPRCasFinder is being used, but it should be stated.

3. The author highlights that in vivo experiments are needed to fully evaluate the efficacy and safety of phage therapy. However, the experimental designs listed seem to go beyond the scope of this paper, and I would suggest combining lines 356-384 into a single, concise paragraph to streamline this section’s contents for improved clarity and flow.

4. Lines 362-366, the biofilm experimental framework does not seem supported by the research presented, and it is not clear how CRISPR evasion relates to biofilms. I suggest removal.

5. Lines 49-51: the author states, “This mechanism is crucial for bacterial survival and proliferation, yet currently the assessment of CRISPR immunity of the bacterial infection before attempting to design and use phage therapy is not a part of common practice.” It would benefit the reader to know here why this is the case (i.e., technical/knowledge gaps), to highlight the importance of this research article.

6. Lines 64-65, the author highlights how CRISPR can be chromosomally or plasmid encoded. For a broad reader base, especially in biologically related computer sciences, it would be helpful to explain that chromosomally encoded DNA is passed vertically only to daughter cells within the same species, whereas plasmid-encoded DNA may be transferred horizontally within a microbiome to bacteria outside the parent species. This is important because it slightly modifies the context of this paper, which only currently queries bacterial genomes and not necessarily species-associated plasmids that can carry CRISPR elements. This should also be addressed in the “Discussion” section of the article as a potential future direction.

7. Lines 68-71: Anti-CRISPR proteins are also naturally occurring and do not necessarily need to be engineered in. It could also be mentioned in the “Discussion” section that future iterations of the algorithm could search the worst possible phages for the prevalence of anti-defence systems and prioritise diversity of these systems in the final phages selected for a cocktail.

8. Lines 27-33: It may be beneficial to mention that, with the invention of penicillin and its broad host-range effects, it was preferred over phage during World War II in treating injury due to its scalability relative to phage. Renewed interest in phage comes in the face of rising antimicrobial resistance and a lack of effective treatments left to combat this issue.

9. Lines 23-37 could likely be summarised more concisely, freeing space for information, such as introducing current algorithms and methods in the field to better highlight the knowledge gap, or a brief introduction to bacterial immune systems to better situate this article within the research space.

10. Lines 39-41 can also mention shifting attitudes in the UK with the inquiry of “The antimicrobial potential of bacteriophage,” released January 2024, and the policy paper “The antimicrobial potential of bacteriophages: government response,” March 2024.

11. Line 126, the author could also mention the phage genome database, INPHARED (https://pmc.ncbi.nlm.nih.gov/articles/PMC9041510/), updated monthly.

12. Lines 340-341, the author suggests that the clinical use of this algorithm could potentially reduce antibiotic resistance. I think it would more correctly be stated along the lines, “This could increase the efficacy of phage therapy and reduce our reliance on antibiotics, thereby contributing to reductions in antibiotic resistance”.

13. Lines 303-305, the improvements in algorithm speed are impressive, though I think the comparison would be clearer in terms of equal extracted records (i.e., about 260 seconds for baseline versus 11 seconds for VDPhage).

14. Line 59: it should be highlighted that this is a form of adaptive immunity that improves and changes with exposure, as compared to restriction modification systems, which are innate. For example, “CRISPR systems are a form of adaptive immunity consisting of…”.

15. Lines 130-132: the author should elaborate on how the description of the protospacer-PAM combination is left for the end user. For example, do parameters include distance, size, nucleotide content, etc.?

16. On line 145, the author states they decided to use CRISPRCasFinder among multiple potential computational solutions. It should be stated why it was chosen (e.g., speed, performance, reproducibility, etc.).

17. Lines 319-320: the sentence is grammatically awkward; consider modifying “The hierarchical navigable small-world structure of the vector database allows to quickly identify and retrieve phage sequences that shared high similarity with the query” to “The hierarchical navigable small-world structure of the vector database allows rapid identification and retrieval of phage genomes…”.

18. Lines 263-264: for clarity, modify “It supports both vector database method and local BLAST method and outputs” to “It supports both the vector database and local BLAST methods, outputting a sorted list of phages and accompanying…”.

19. Line 167, the author should introduce VDPhage here to increase the clarity of the upcoming paragraph. For example, “small-world structure, VDPhage”.

20. Line 188, the acronym HNSW for Hierarchical Navigable Small-World needs to be introduced somewhere (Line 167 is where it is first mentioned).

21. Line 104: it would be good to mention that the technology outperforms BLAST with respect to runtime and selection of relevant phages.

22. Lines 126-127: state which string search tool was used.

23. Line 130 modify “target” to “target DNA sequence”.

24. Line 42 (NIH) is repeated as (nih).

25. Line 52: spelling: coctails to cocktail.

26. Line 58 “is an essential [first step]…”.

27. Line 136 “gatheriing” to “gathering”.

28. Line 206 grammatical correction, “the vector database”.

29. Line 307 spelling correction “Alienvare” to “Alienware”.

30. Table 1 and Lines 134, 235, 236, 238, and 311 grammar correction “C.difficile” to “C. difficile” ensure space.

31. Line 316 grammar correction: “In case” to “In the case” and terminology “phage sequences” to “phage genomes”.

32. Line 321 consistency “figure 3” to “Figure 3”.

33. Line 331 grammar correction: “In case” to “In the case”.

34. Lines 357, 367, and 374 formatting correction “in vivo” to be italicized.

Experimental design

The work presented falls within the “Aims and Scope” of the journal. The research question is well defined, relevant, and meaningful. Researchers lack the means to rapidly identify phage therapy candidates that can evade bacterial defences. This research presents an effective algorithm, which begins to fill that gap.

Validity of the findings

Findings and their implications are described throughout. However, I would suggest three improvements, the first of which is the most critical.

35. I strongly advise the author to include the perspective of identifying CRISPR defences in pathogenic bacteria, in addition to this technology’s intended use for selecting potentially therapeutic phage. With drastic improvements in genetic engineering and bottom-up phage genome construction methods, this perspective will ensure its relevance and impact with a much larger audience. By listing the location of the spacer within the bacterial genome, researchers could target these regions to disarm CRISPR-defences or construct entirely synthetic genomes lacking any CRISPR target sites. Importantly, it seems that your technology already accommodates this, but is not described as such.

36. Data used for the analysis of Figure 3 could be provided.

37. A conclusion statement about findings and implications should be included at the end of the “Discussion” section.

Additional comments

The article is of interest to researchers in computer science, bioinformatics, biosciences, and medicine, specifically those working with phage, genomes, or antimicrobial resistance. I believe researchers will find the work of interest and useful in streamlining their work. The author should continue to improve the algorithm by validating its use in vivo and adding features, such as selecting phage genomes that include anti-defence systems or identifying defence systems in bacterial genomes.

Reviewer 3

Basic reporting

The article "A Vector Database Solution for Rational Design of CRISPR Defense Avoidant Phage Therapy" presents an interesting approach to addressing the limitations of phage therapy due to CRISPR-Cas systems, and offers a solution based on vector databases, implemented as a workflow called VDPhage. I read it with great interest, and the idea sounds interesting and innovative. However, a major revision of the article is necessary to address the following concerns:

1. Introduction section. The current structure and presentation of the content require significant improvement to provide a clear and logical workflow for the reader. As it stands, the text is challenging to follow, and a reorganisation of the material is necessary to enhance clarity and readability:
1.1 acr proteins (line 56) are explained later in the text (lines 66-71).
1.2 Lines 66-71 refer to the biological mechanism, which was already introduced in lines 46-53.
1.3 Line 72 "Another option" is "another" with respect to what?
1.4 The description of the naive BLAST-based solution should be reduced, and some parts should be moved to the Discussion section. Indeed, its points of weakness are points of strength of the VDPhage workflow.
1.5 Lines 101-107 are concluding statements rather than part of an Introduction section.

2. Methodology section. The methodology section needs to be reorganised:
2.1 I recommend reordering the presentation to introduce VDPhage as the primary method, followed by the BLAST-based solution as a secondary or comparative approach. This revised structure would allow the manuscript to focus on the main subject, VDPhage, and provide a clearer and more logical workflow for the reader.
2.2 It lacks important references, which should be added to ensure reproducibility (FAISS, vicinity, Levenshtein distance, Kolmogorov-Smirnov statistics).
2.3 It contains typos that should be carefully checked (line 136, "gathering"; lines 144-145, "CRISPRCasFinder" is repeated).
2.4 The phrase "In other words" (line 161) is not usually found in technical or formal writing, such as in this section.
2.5 Lines 173-177 should be moved to the Introduction.
2.6 Line 187: is "invented" the right term to use? Maybe "developed"?
2.7 I strongly suggest adding a new sub-section that provides a detailed explanation of the VDPhage workflow code, which is currently only available as Supplementary Material. This code is not mentioned or referenced in any part of the manuscript, making it unclear how it relates to the rest of the work. By including a dedicated section to describe the code and its functionality, the authors can provide a more comprehensive and transparent overview of their methodology, allowing readers to better understand and replicate the results.

3. Results section. It needs to be reorganised.
3.1 I recommend first presenting the results, and then the VDPhage commands and options.
3.2 Lines 289-290: "Please refer to the accompanying documentation for more detailed information and advanced usage options" is not a result.
3.3 The phrase "As you can see" (line 322) is not usually found in scientific writing.

4. Discussion section. It needs to be reorganised to integrate those paragraphs of the introduction and Methodology sections that I recommend moving (see above).
4.1 Line 342: The phrase "a gold standard for CRISPR-avoidant phage therapy" is quite strong for a proof-of-concept study (defined as such in line 396).
4.2 You discussed in vivo studies and animal models. What about alternatives to animal studies? These encompass various methods, including in vitro testing (using cells and tissues outside the body) and in silico modelling (using computer simulations). These approaches aim to replace or reduce animal testing, offering potential advantages in terms of accuracy, relevance to humans, and ethical considerations (which are discussed in the text, so it makes sense to comment on them).
4.3 I recommend adding to this section a paragraph about the limitations of the presented solution. One thing is for sure: the need for benchmarking.
4.4 Check that terms like "in vivo" and "in silico" are in italics.

Experimental design

See my comments on the methodology section. In addition, I strongly recommend presenting at least one more case in addition to the C. difficile one. For example, a case on an AMR strain (Staphylococcus aureus?) would add robustness and, overall, interest to the manuscript.

Validity of the findings

-

Additional comments

1. A revision of the English is recommended.
2. The formatting of references seems not to be in line with the journal guidelines.
3. The VDPhage workflow should be made publicly available on repositories such as Zenodo, GitLab, or GitHub.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.