All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Dear authors, we are pleased to confirm that you have satisfactorily addressed the reviewers' valuable feedback and improved your research.
Thank you for considering PeerJ Computer Science and submitting your work.
Kind regards,
PCoelho
[# PeerJ Staff Note - this decision was reviewed and approved by Jyotismita Chaki, a PeerJ Section Editor covering this Section #]
The manuscript addresses multi-label classification of tampering types in document images using a Swin-T based segmentation network and several optimization strategies.
The proposed model, which combines the Swin-T backbone with a spatial domain sensing module, shows thoughtful design choices targeted specifically at the challenges of tamper detection. The methods section provides sufficient technical detail and parameter specifications to enable replication.
The findings are well supported by comprehensive experimental evidence. The ablation studies systematically demonstrate the contribution of each proposed component, with clear quantitative improvements documented for the spatial perception module, multi-resolution fusion, auxiliary detection head, and optimization strategies.
The manuscript has improved considerably.
Dear authors,
Please respond critically to all comments, point by point, in your rebuttal letter, and incorporate the reviewers' suggestions into the revised version of the manuscript.
Kind regards,
PCoelho
The manuscript addresses multi-label classification of tampering types in document images using a Swin-T based segmentation network and several optimization strategies. While the idea is interesting, the paper contains several unclear points regarding the experimental design, the logic behind certain architectural choices, and the lack of certain analyses. Further clarity and additional experiments are needed to strengthen the methodology and support the conclusions.
1. The paper states that multiple tampering types can be accurately detected, but it is not clearly explained how the ground-truth masks were annotated for each forgery type or how the classification boundaries between these types were defined.
2. The description of the self-supervised augmentation process is insufficiently detailed, making it hard to understand the exact nature of newly generated mixed-type tampering samples and to confirm their realism.
3. The logic behind mixing multiple tampering operations in a single image is not well justified, and there is no experiment comparing performance on single-type vs. multi-type tampering scenarios to confirm that this mixing is beneficial.
4. The “hard example mining” step is introduced but not quantitatively analyzed; no experiments show how performance changes if hard example mining is omitted, making it unclear whether this component is truly necessary.
5. The selection criteria and potential biases for creating the MixTamper dataset remain unclear, and no external datasets are tested to confirm that the model’s improvements are not overly tailored to these particular training sets.
6. The paper reports strong results on two datasets, but the evaluation metrics for multi-class segmentation are not explained in detail, and it is not stated whether IoU and F1 are computed per-class and then averaged, or if some weighting strategy is applied.
7. The paper mentions that the model’s performance decreases significantly under compression attacks but does not provide any reasoning, ablation, or mitigation strategies, leaving the robustness claim partially unsupported.
8. The auxiliary detection head is claimed to improve convergence, but no ablation study is provided to confirm this claim, making the role of this component and its necessity in the final architecture unclear.
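Regarding point 6, one common (but not the only) convention for multi-class segmentation metrics is to compute IoU and F1 per class and then macro-average, skipping classes absent from both prediction and ground truth. The sketch below illustrates that assumed convention; the paper may use a different weighting, which is precisely why the reviewer asks for it to be stated explicitly.

```python
import numpy as np

def per_class_iou_f1(pred, gt, num_classes):
    """Macro-averaged IoU and F1 over classes.

    pred, gt: integer label maps of identical shape.
    Classes absent from both maps are skipped rather than counted as 1.0.
    """
    ious, f1s = [], []
    for c in range(num_classes):
        p = (pred == c)
        g = (gt == c)
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        if union == 0:           # class absent everywhere: skip
            continue
        ious.append(inter / union)
        f1s.append(2 * inter / (p.sum() + g.sum()))
    return float(np.mean(ious)), float(np.mean(f1s))

# Toy example: 2x2 label maps with three classes
pred = np.array([[0, 1], [1, 2]])
gt = np.array([[0, 1], [2, 2]])
miou, mf1 = per_class_iou_f1(pred, gt, num_classes=3)
```

Whether macro averaging, frequency weighting, or micro averaging is used can change the reported numbers substantially when tampering classes are imbalanced, so the choice should be documented.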
1. Highlight the unique aspects of your method in the abstract.
2. Include a more detailed comparison with other Swin-T-based models to underscore the novelty (Introduction or related works).
3. Perform detailed analyses on varying parameter settings for hard sample mining and self-supervised augmentation techniques.
4. Extend robustness testing by simulating other real-world distortions, such as Gaussian noise, blurring, or varying resolutions.
5. Expand on the rationale behind using the Lovász-Softmax loss in conjunction with cross-entropy loss. Show how the combined loss impacts convergence and performance.
6. Ensure that the language is professional, clear, and concise throughout. Address any grammatical errors or typos.
7. Use consistent formatting for equations, figures, and tables for improved readability.
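On point 5, the usual motivation for pairing the Lovász-Softmax loss with cross-entropy is that cross-entropy optimizes per-pixel likelihood while the Lovász term is a direct surrogate for the IoU metric. A minimal NumPy sketch of the combined objective is given below (flat per-pixel form, with an assumed weighting factor `lam`; the paper's actual formulation and weighting may differ):

```python
import numpy as np

def lovasz_grad(gt_sorted):
    # Gradient of the Lovász extension of the Jaccard loss,
    # for ground-truth labels sorted by descending error.
    gts = gt_sorted.sum()
    intersection = gts - np.cumsum(gt_sorted)
    union = gts + np.cumsum(1.0 - gt_sorted)
    jaccard = 1.0 - intersection / union
    jaccard[1:] = jaccard[1:] - jaccard[:-1]
    return jaccard

def lovasz_softmax(probs, labels):
    # probs: (N, C) softmax outputs; labels: (N,) integer classes.
    losses = []
    for c in range(probs.shape[1]):
        fg = (labels == c).astype(float)
        if fg.sum() == 0:        # class not present: skip
            continue
        errors = np.abs(fg - probs[:, c])
        perm = np.argsort(-errors)   # sort errors descending
        losses.append(np.dot(errors[perm], lovasz_grad(fg[perm])))
    return float(np.mean(losses))

def cross_entropy(probs, labels, eps=1e-12):
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + eps)))

def combined_loss(probs, labels, lam=0.5):
    # Cross-entropy drives per-pixel accuracy; the Lovász term
    # directly penalizes IoU degradation on each present class.
    return cross_entropy(probs, labels) + lam * lovasz_softmax(probs, labels)
```

An ablation comparing cross-entropy alone against this combined objective, as the reviewer suggests, would make the contribution of each term to convergence and final IoU explicit.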
Please incorporate all of these improvements into your paper.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.