Review History


All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.


Summary

  • The initial submission of this article was received on May 5th, 2025 and was peer-reviewed by 3 reviewers and the Academic Editor.
  • The Academic Editor made their initial decision on June 17th, 2025.
  • The first revision was submitted on July 25th, 2025 and was reviewed by 3 reviewers and the Academic Editor.
  • A further revision was submitted on September 12th, 2025 and was reviewed by 2 reviewers and the Academic Editor.
  • The article was Accepted by the Academic Editor on October 8th, 2025.

Version 0.3 (accepted)

· Oct 8, 2025 · Academic Editor

Accept

The paper may be accepted.

[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]

·

Basic reporting

The authors have responded well and addressed my concerns; the paper is recommended for publication as it is.

Experimental design

It seems fine now.

Validity of the findings

The findings seem valid.

Reviewer 2 ·

Basic reporting

The manuscript has been thoroughly revised in accordance with the provided recommendations. It is now comprehensive, refined, and deemed appropriate for publication.

Experimental design

-

Validity of the findings

-

Version 0.2

· Aug 27, 2025 · Academic Editor

Minor Revisions

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

·

Basic reporting

The paper seems to be OK now. It can be published with minor changes.
Urdu is a low-resource language. The authors are requested to cite the following papers, which are related to Urdu text classification.

1. Deep sentiment analysis using CNN-LSTM architecture of English and Roman Urdu text shared in social media

2. Multi-class sentiment analysis of Urdu text using multilingual BERT

3. Urdu sentiment analysis with deep learning methods

4. Multi-label emotion classification of Urdu tweets

5. Empowering Urdu sentiment analysis: an attention-based stacked CNN-Bi-LSTM DNN with multilingual BERT

Experimental design

The research methodology seems fine now.

Validity of the findings

The results are fine now.

Reviewer 2 ·

Basic reporting

-

Experimental design

-

Validity of the findings

-

Additional comments

The author has made substantial improvements to the work in accordance with the previous suggestions, demonstrating careful attention to both content and structure. These revisions have notably enhanced the clarity and overall quality of the manuscript. However, to further strengthen the academic rigor of the work, I would like to recommend that the Discussion section be expanded to provide a more comprehensive analysis of the obtained results.

Specifically, the author is encouraged to elaborate on the following aspects:
- Strengths of the proposed approach or system, highlighting its innovations and practical benefits.
- Limitations or challenges encountered, including any performance bottlenecks, generalizability issues, or constraints observed during experimentation.
- Alternative interpretations or perspectives that may arise from the findings, including possible implications for future research or applications in related domains.

In addition, it would be valuable for the author to provide a more detailed analysis of the developed model. This could include:
- A discussion of the model's architectural design choices and their impact on performance.
- A comparison with baseline or state-of-the-art models, supported by quantitative metrics (e.g., accuracy, F1-score, computational efficiency).
- An exploration of why and how the model achieves the observed results, potentially using interpretability tools such as confusion matrices, feature importance, or visualization techniques like t-SNE.
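For concreteness, a minimal sketch of the kind of interpretability analysis suggested in the last point, assuming a hypothetical trained Keras classifier named `model` and held-out arrays `X_test`/`y_test` (these names are assumptions, not the authors' code):

```python
# Minimal sketch: t-SNE projection of the penultimate-layer features of a
# hypothetical trained Keras classifier ("model") on held-out data.
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE
from tensorflow import keras

# Assumptions: "model" is the trained network, X_test holds the padded token
# sequences and y_test the integer class labels used for evaluation.
feature_extractor = keras.Model(inputs=model.input,
                                outputs=model.layers[-2].output)
features = feature_extractor.predict(X_test)

# Project the learned representations to 2-D for visual inspection.
embedded = TSNE(n_components=2, random_state=42).fit_transform(features)

plt.scatter(embedded[:, 0], embedded[:, 1], c=y_test, s=5, cmap="coolwarm")
plt.title("t-SNE of learned features by class")
plt.show()
```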

Such in-depth discussion and critical analysis will not only demonstrate the author’s comprehensive understanding of the subject matter but also significantly enhance the scientific value and credibility of the work.

Reviewer 3 ·

Basic reporting

The responses provided are acceptable, and the manuscript has shown improvement compared to its initial version.

Experimental design

-

Validity of the findings

-

Additional comments

The responses provided are acceptable.

Version 0.1 (original submission)

· Jun 17, 2025 · Academic Editor

Major Revisions

**PeerJ Staff Note:** Please ensure that all review and editorial comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

**Language Note:** The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff

·

Basic reporting

1. The authors should improve the abstract, as it currently lacks essential components. It does not clearly present the problem statement, motivation, or the key contributions of the study. Additionally, the abstract should briefly summarize the achieved results to provide a complete overview of the paper.

2. The introduction section also requires significant enhancement. It does not adequately define a specific problem statement, nor does it justify the selection of the convolutional Bi-LSTM model. The authors should explain why this particular architecture was chosen and what makes it effective for the given task.

3. The contribution of the paper appears to be limited or is not well-articulated. The authors should clearly highlight their novel contributions and elaborate on how their work advances the existing body of research.

4. Relevant articles should be reviewed and cited to enhance the related work section.

Experimental design

The authors should provide detailed information about the hyperparameters used in the experiments. This is essential for ensuring reproducibility and for allowing other researchers to validate and build upon the results.

The explanation and discussion of the results need further elaboration. A more in-depth analysis is necessary to highlight the significance and implications of the findings.

The authors are encouraged to include a confusion matrix in the results section. This will offer a clearer view of model performance across different classes and help identify potential areas of misclassification.
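As an illustration of this request, a minimal scikit-learn sketch for producing such a confusion matrix, assuming hypothetical arrays `y_test` (true labels) and `y_prob` (predicted probabilities); this is not the authors' code:

```python
# Minimal sketch: confusion matrix for a binary hate-speech classifier.
import matplotlib.pyplot as plt
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay

# Assumptions: y_test holds the true labels and y_prob the model's predicted
# probabilities for the positive ("hate") class.
y_pred = (y_prob > 0.5).astype(int)

cm = confusion_matrix(y_test, y_pred)
ConfusionMatrixDisplay(cm, display_labels=["non-hate", "hate"]).plot(cmap="Blues")
plt.show()
```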

The authors should also discuss the results presented in Table 2 in more detail, particularly explaining why traditional machine learning models achieved lower accuracy compared to deep learning approaches. An analysis of these performance differences would strengthen the paper’s contribution and help readers understand the effectiveness of deep learning methods in this context.

Validity of the findings

The authors are requested to provide examples of Roman Urdu hate speech along with their English translations. This will help readers clearly understand the linguistic characteristics of Roman Urdu and how it differs from standard English. Additionally, the authors are encouraged to evaluate their proposed model on an additional dataset to further validate the model’s robustness and generalizability.

Reviewer 2 ·

Basic reporting

The study introduces a novel approach for detecting hate speech in Roman Urdu by employing a deep hybrid neural network architecture based on convolutional and bidirectional long short-term memory (BiLSTM) layers. The proposed model is designed to capture both global and local textual features by combining convolutional layers with a BiLSTM network, thereby enhancing the accuracy of hate speech detection. For comparative analysis, a traditional machine learning model was also implemented, against which the deep learning model exhibited superior performance in terms of both accuracy and F1-score. The researchers trained the model on a Roman Urdu text dataset, and the results demonstrated the model's effectiveness. Overall, the findings suggest that the performance metrics of the proposed method are comparable to those of more complex and recently developed architectures. The research is interesting. Here are some suggestions that could further enhance and complete the study.
1. While the introduction mentions the increasing prevalence of hate speech and the limitations of manual moderation, it does not clearly articulate a specific research gap that the current study addresses. The transition from general hate speech detection challenges to the particular problem of Roman Urdu processing is abrupt and underdeveloped. The novelty of detecting hate speech in Roman Urdu is implied, but the uniqueness and challenges of this task are not emphasized early or thoroughly enough to highlight the study's contribution to the existing literature.
2. The contribution list (lines 85–93) is not clearly separated from the rest of the introduction, and some points are vague or poorly structured; for example, "The primary involvement of this paper..." should be revised to "The main contributions of this study are..."

Experimental design

1. The study does not explain why CNN-BiLSTM is the optimal choice over other hybrid or standalone models (e.g., GRU, Transformer-based models like BERT). There is no ablation study or comparative justification to support the selection of two separate convolution layers with different filter sizes or the specific BiLSTM configuration (128 and 64 nodes).
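For reference, a minimal Keras sketch of the kind of configuration referred to here (two convolution layers with different kernel sizes, here stacked sequentially, followed by BiLSTM layers with 128 and 64 units). All hyperparameter values are assumptions, and this is not the paper's implementation:

```python
# Illustrative sketch of a CNN-BiLSTM text classifier of the kind described:
# two convolution layers with different kernel sizes followed by stacked
# BiLSTM layers with 128 and 64 units. All values here are assumptions.
from tensorflow.keras import layers, models

VOCAB_SIZE = 20000   # assumed vocabulary size
MAX_LEN = 100        # assumed padded sequence length

inputs = layers.Input(shape=(MAX_LEN,))
x = layers.Embedding(VOCAB_SIZE, 128)(inputs)

# Two convolutions with different kernel sizes capture local n-gram patterns.
x = layers.Conv1D(64, kernel_size=3, activation="relu", padding="same")(x)
x = layers.Conv1D(64, kernel_size=5, activation="relu", padding="same")(x)

# Stacked BiLSTMs (128 and 64 units) model longer-range context in both directions.
x = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(x)
x = layers.Bidirectional(layers.LSTM(64))(x)

outputs = layers.Dense(1, activation="sigmoid")(x)
model = models.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```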

Validity of the findings

1. The results presented in Table 2 provide clear evidence that the proposed CNN-BiLSTM model outperforms all baseline models—including classical machine learning algorithms (e.g., SVM, Logistic Regression, Random Forest) and deep learning architectures (e.g., CNN, BiLSTM, BiGRU, and BERT-CNN-gram)—across all key evaluation metrics: accuracy, precision, recall, and F1-score. The model achieves the highest F1-score of 81.47%, affirming its robustness in the context of hate speech detection in Roman Urdu. Given the dataset's imbalanced nature, the use of F1-score as the primary evaluation metric further substantiates this conclusion. The comprehensive model comparison and consistent tuning across all models strengthen the reliability of the findings. While the results are compelling within the current experimental setup, further evaluations on diverse datasets and in real-world settings would be beneficial to confirm the model's generalizability.
2. Although CNN-BiLSTM performs better, inference speed, training time, or resource efficiency are not reported—essential aspects for real-time deployment.

Additional comments

1. In the Conclusion, the authors' claims about multiple datasets lack evidence; this should be adjusted to clarify the focus on a single dataset.
2. Some informal or awkward phrasing should be rewritten to match a formal academic tone.

Reviewer 3 ·

Basic reporting

Several citations are outdated (2014-2018). Updating references to include 2023-2025 publications would demonstrate awareness of current trends in detecting hate speech in Roman Urdu.

Please improve the quality of all figures.

The contribution of the proposed approach is not clearly defined, as it appears to rely on an existing model and a publicly available dataset. This raises questions about the originality and significance of the work, both in terms of the data used and the model applied. Please justify.

Fix grammar, spelling, and formatting to improve readability.

Experimental design

The paper focuses on performance (accuracy, F1-score) but does not include any explainability analysis (e.g., feature importance).

A thorough analysis of the proposed model’s computational aspects, including time complexity, space complexity, and runtime performance, is essential. This analysis is crucial for evaluating the method’s efficiency and practical applicability.

I recommend improving the CNN-BiLSTM model through parameter optimization, i.e., hyperparameter tuning of the convolutional neural network.
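As a purely illustrative example of such tuning, a minimal grid search over a few assumed hyperparameters, using a hypothetical `build_model` constructor and training splits `X_train`/`y_train` and `X_val`/`y_val` rather than the authors' code:

```python
# Illustrative sketch: small grid search over assumed hyperparameters for a
# hypothetical build_model(filters, kernel_size, lstm_units) constructor.
import itertools

param_grid = {
    "filters": [32, 64],
    "kernel_size": [3, 5],
    "lstm_units": [64, 128],
}

best_score, best_params = 0.0, None
for filters, kernel_size, lstm_units in itertools.product(*param_grid.values()):
    model = build_model(filters=filters, kernel_size=kernel_size, lstm_units=lstm_units)
    model.fit(X_train, y_train, validation_data=(X_val, y_val),
              epochs=5, batch_size=64, verbose=0)
    _, val_acc = model.evaluate(X_val, y_val, verbose=0)  # assumes accuracy metric
    if val_acc > best_score:
        best_score, best_params = val_acc, (filters, kernel_size, lstm_units)

print("Best validation accuracy:", best_score, "with", best_params)
```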

Validity of the findings

Please include a table listing all input features for the proposed model. Additionally, provide statistical information for each feature to give readers a better understanding of the data used in the study.

Explain how the method performs on a mixture of two or more languages.

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.