Javascript is disabled in your browser. Please enable Javascript to view PeerJ.

Review History
Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media

All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.

Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.

View examples of open peer review.

Summary

The initial submission of this article was received on January 27th, 2025 and was peer-reviewed by 2 reviewers and the Academic Editor.
The Academic Editor made their initial decision on March 28th, 2025.
The first revision was submitted on May 2nd, 2025 and was reviewed by 3 reviewers and the Academic Editor.
A further revision was submitted on August 2nd, 2025 and was reviewed by 1 reviewer and the Academic Editor.
The article was Accepted by the Academic Editor on August 22nd, 2025.

Version 0.3 (accepted)

Luigi Di Biasi · Aug 22, 2025 · Academic Editor

Dear authors, thank you for addressing all reviewers' concerns.

[# PeerJ Staff Note - this decision was reviewed and approved by Mehmet Cunkas, a PeerJ Section Editor covering this Section #]

Reviewer 4 · Aug 19, 2025

Basic reporting

The introduction offers sufficient context and a coherent explanation, supported by current and pertinent references. In response to reviewer feedback, the authors have augmented the literature study to encompass newer transformer models, including RoBERTa, ALBERT, and ELECTRA. Minor grammatical issues have also been corrected and the figures have been improved for clarity.

Experimental design

The experimental design is consistent with the articulated aims. The authors furnish comprehensive accounts of:

- datasets (Davidson, ToxicTweets, HateSpeechDetection),
- preprocessing procedures (emoji elimination, stopword elimination, lemmatization, etc.),
- train/test division (80/20),
- assessment metrics (accuracy, precision, recall, F1 score),
- the suggested architecture (RoBERTa-Large for feature extraction combined with XGBoost for classification).

The detail provided is adequate for reproducibility.

Validity of the findings

Although there is still room for expansion (multilingualism, fairness, ablation study), the article in its current form satisfies the criteria of scientific validity.

Additional comments

The article is satisfactory and can be accepted; however, I suggest a minor editorial change, particularly a linguistic and stylistic evaluation and, if possible, a more explicit explanation of the limits of fairness.

A further correction on my part would not be necessary.

Cite this review as

Anonymous Reviewer (2025) Peer Review #4 of "Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media (v0.3)". PeerJ Computer Science

Download Version 0.3 (PDF) Download author's response letter - submitted Aug 2, 2025

Version 0.2

Luigi Di Biasi · Jul 20, 2025 · Academic Editor

Major Revisions

Dear Authors,

After thoroughly reviewing the reviewers' reports, I am confident that your work would benefit from substantial revision.

In particular, the experimental evaluation should be enhanced by including more recent and competitive models, along with other state-of-the-art approaches, to substantiate claims of superior performance. Reviewers emphasize the importance of replicating standardized evaluation protocols commonly used in the current literature to ensure comparability in dataset splits, metrics, and experimental setups. Also, reviewers strongly encouraged the inclusion of model interpretability tools, such as SHAP, LIME, or attention visualizations, given the sensitivity of the task.

Additionally, reviewers suggest that the claim of novelty in applying the model to the TT and HSD datasets should be supported through a brief literature review or relevant citations: a more explicit discussion of ethical considerations, including dataset bias and annotation subjectivity, would also strengthen the work.

It would be helpful to clarify the hyperparameter tuning methodology used, as well as provide more detail on strategies adopted to prevent overfitting, especially with the number of training epochs and any early stopping criteria.

Finally, although not mandatory, reviewers suggest that extending the evaluation to multilingual or cross-domain data would increase the generalizability of the proposed approach.

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

Reviewer 1 · May 15, 2025

Basic reporting

The paper is written clearly. The context is well-established, and the motivation for the study is clear. The author has reviewed and incorporated the necessary corrections.

Experimental design

The experimental design is well-structured, with a logical flow that effectively addresses the research objectives. The methodology is clearly described. Overall, the experimental framework demonstrates careful planning and scientific rigor, which strengthens the validity of the results.

Validity of the findings

The paper explicitly assesses the broader impact and novelty of the proposed approach, providing a thorough discussion of how this work advances the field beyond existing methods. The author has reviewed and incorporated the necessary corrections.

Cite this review as

Anonymous Reviewer (2025) Peer Review #1 of "Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media (v0.2)". PeerJ Computer Science

Reviewer 3 · Jul 5, 2025

Basic reporting

The manuscript is generally well written and articulate, with clear and professional language. The introduction provides adequate context and motivates the work convincingly, highlighting the social urgency associated with the spread of hate speech online. However, the absence of some key elements for full transparency and reproducibility is noted. In particular, no link to a source code repository is provided nor are the scripts for replicating the experiments made available, limiting the practical usefulness of the contribution for the scientific community.

In addition, although the methodology section is detailed, no qualitative examples of correct or incorrect classifications are presented, nor is there any analysis of the model's linguistic limitations (such as sarcasm, irony, or use of slang), which are known to be critical in hate-speech detection tasks. Adding such qualitative analyses would have made the work more complete. The absence of model interpretability tools (such as SHAP, LIME, or attention maps) is also a limitation, considering the sensitive and sensitive nature of the subject matter.

Experimental design

The proposed experimental design is consistent with the stated objectives and is fully within the intended application scope of the journal. The use of a hybrid model based on RoBERTa-Large for feature extraction and XGBoost for final classification represents a combination already explored in the literature, but here it is refined with good attention to hyperparameter optimization.

However, some structural weaknesses are observed. The model is evaluated exclusively on English-language tweets, without any cross-linguistic or cross-domain tests (e.g., Reddit, Facebook, YouTube). This limits the generalizability of the proposed method. In addition, the management of class imbalance, although mentioned as an issue in the Davidson dataset, is not systematically addressed through dedicated techniques such as SMOTE, focal loss or training set balancing. Finally, the choice of comparison models-while including classic ML approaches and some transformers-does not cover newer techniques (e.g., DeBERTa, T5, or zero-shot models), reducing the comparative relevance of the analysis. Missing ablation study.

Validity of the findings

One of the main methodological limitations of the present work concerns the comparison with the state of the art. The authors choose to compare their hybrid approach with some classical machine learning models (such as logistic regression, random forest, SVM), with some transformer models (such as DistilBERT and AngryBERT) and with relatively simple deep neural networks (CNN). However, the selected literature is partly dated and largely unrepresentative of the best solutions currently available for the hate speech detection task. In particular, state-of-the-art transformer models such as DeBERTa, T5, ELECTRA, RoBERTa fine-tuned on domain-specific corpora, or recent autoregressive models that have shown superior performance in text classification benchmark tasks are not considered. Moreover, references to zero-shot/few-shot approaches (e.g., GPT-3/4, TARS), automatic data augmentation techniques, and cross-domain strategies that represent the advanced state of research today are missing. Consequently, the proposed experimental comparison does not allow a rigorous determination of whether the presented model is indeed state-of-the-art, since it is evaluated against systems that no longer represent the best baselines. The metrics obtained, while promising, do not assume real comparative value without consistent replication of experimental protocols from more advanced and recent articles. It is imperative that authors replicate experimental protocols published in the most competitive research-using the same datasets, splits, metrics and configurations-to ensure objective and verifiable evaluation.

Cite this review as

Anonymous Reviewer (2025) Peer Review #3 of "Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media (v0.2)". PeerJ Computer Science

Reviewer 4 · Jul 10, 2025

Basic reporting

The paper is generally well-written and presented in clear, professional English. The article follows an organized structure with appropriate headings and includes all the essential components: abstract, introduction, literature review, methodology, experiments & results, and discussion & conclusion.
The English used is mostly clear and unambiguous, with a few minor issues in phrasing (e.g., redundancy or awkward constructions), but they do not hinder understanding. A light grammatical review could help improve fluency (e.g., "breed" in line 34 seems misplaced).
The literature review is extensive, with appropriate references to prior work in machine learning (ML), deep learning (DL), and transformer-based models. The paper also highlights limitations in existing approaches and builds motivation for the proposed method.
The definitions, tables, and figures are presented professionally. Some figures (e.g., confusion matrices) are useful but could benefit from clearer labeling or larger font sizes for readability (e.g., fig. 3; 5; 8; 9; 10).

suggestions:
- Minor grammatical cleanup is recommended throughout (e.g., plural/singular consistency, sentence length).
- Add a clearer articulation of how this work differs from past hybrid methods beyond tuning and dataset novelty.

Experimental design

The experimental design is methodologically sound and falls within the aims and scope of PeerJ Computer Science.
The authors use a transformer-based model (RoBERTa-large) for contextual feature extraction, combined with XGBoost for classification. The ensemble is well-motivated and is described in detail.
Three datasets are used — Davidson, HateSpeechDetection (HSD), and ToxicTweets (TT). TT and HSD are claimed as novel applications of the model, though Davidson is widely used.
Training configurations (e.g., learning rate, dropout, batch size, sequence length) and hardware used are clearly listed.

suggestions:
- Discuss ethical considerations more explicitly (e.g., bias in datasets, annotation subjectivity).
- Provide details on hyperparameter tuning methodology (e.g., grid search, cross-validation strategy).

Validity of the findings

The findings of the study are generally valid and well-supported by experimental evidence. The authors report comprehensive performance metrics—including accuracy, precision, recall, and F1-score—across three datasets, with comparisons against several established baseline models such as SVM, Bagging, LightGBM, CNN, and AngryBERT. The proposed hybrid model consistently outperforms these baselines, particularly on the Davidson dataset, and its effectiveness is further substantiated through visualizations like confusion matrices and bar charts. The manuscript also introduces the application of the model to the ToxicTweets (TT) and HateSpeechDetection (HSD) datasets as a novel contribution; however, this claim should be explicitly supported by a systematic literature review or citation to confirm its originality. Finally, the paper includes a thoughtful discussion of limitations, acknowledging challenges such as class imbalance, the absence of multilingual evaluation, and the exclusive use of English-language data, which may affect the generalizability of the findings.

To further strengthen the study’s validity and generalizability, several considerations should be addressed. First, the risk of overfitting needs to be explicitly discussed—particularly regarding the decision to train for only 5 epochs. It is unclear whether this choice was based on prior experimentation or guided by early stopping criteria; clarification is essential to assess the model's stability. Second, external validation on additional or more diverse datasets, such as newer collections or content from other social platforms, would enhance the robustness and applicability of the proposed approach beyond the current experimental setup. Lastly, the manuscript would benefit from a deeper discussion of model fairness. Specifically, the authors should consider potential performance disparities across demographic subgroups, as well as the model’s ability to detect more nuanced forms of hate speech, including sarcasm, coded language, or context-dependent abuse—factors that are critical in real-world deployment scenarios.

Additional comments

The paper presents several noteworthy strengths. The proposed architecture effectively combines the contextual understanding capabilities of RoBERTa with the classification robustness of XGBoost, resulting in a solid and well-integrated hybrid model. The paper is also grounded in a strong contextual motivation, highlighting the social importance of hate speech detection, and benefits from clear and informative visualizations such as confusion matrices and performance charts.
However, there are areas that could be enhanced. Light editorial refinement (particularly in grammar, phrasing, and transitions) would improve the overall readability and presentation. Lastly, future work could be enriched by exploring multilingual datasets or applying the model to content from different platforms, enhancing the model's generalizability across linguistic and social contexts.

Cite this review as

Anonymous Reviewer (2025) Peer Review #4 of "Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media (v0.2)". PeerJ Computer Science

Download Version 0.2 (PDF) Download author's response letter - submitted May 2, 2025

Version 0.1 (original submission)

PeerJ Staff · Mar 28, 2025 · Academic Editor

Major Revisions

Please submit your revised manuscript at your earliest convenience.

**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.

Reviewer 1 · Mar 7, 2025

Basic reporting

The paper is written in clear and professional English. The language is concise and appropriate for an academic audience. The context is well-established, and the motivation for the study is clear.
The literature review is comprehensive and covers a wide range of relevant studies in hate speech detection.

Experimental design

The study demonstrates a high level of technical rigor. The authors employ state-of-the-art models (RoBERTa-Large and XGBoost) and compare their approach against multiple baseline methods.
The evaluation methods are clearly described, with metrics such as accuracy, precision, recall, and F1-score used to assess model performance.

Validity of the findings

The paper does not explicitly assess the broader impact or novelty of the proposed approach. While the authors highlight the superior performance of their model, they do not discuss how this work advances the field beyond existing methods. A discussion on the potential societal impact of deploying such a model could strengthen the paper. The conclusions are well-stated and directly supported by the experimental results. The authors summarize the key findings, including the superior performance of their hybrid model (RoBERTa-Large + XGBoost) over baseline methods.

Additional comments

The paper is strong but would benefit from addressing the following points:
Expand the discussion on the broader impact and novelty of the proposed approach.
Provide more details on hyperparameter tuning and computational infrastructure.
Strengthen the discussion on ethical implications and societal impact.
Expand the limitations and future work section to provide more depth.

Cite this review as

Anonymous Reviewer (2025) Peer Review #1 of "Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media (v0.1)". PeerJ Computer Science

Reviewer 2 · Mar 18, 2025

Basic reporting

Introduction gives sufficient amount of background details, issues, and motivation to come up with their proposed work. It contains all the necessary information. Still there are some loop holes.
* Literature looks good. Still, you can prepare subsections or different paragraphs according to the models implemented for the task, it will improve the readability (For eg. ML, DL, and TL).

* In line 89 you have used accuracy and again in 128. It would be great if you maintained uniform representation.

* In the abstract, you have mentioned that your proposed method has achieved 97% accuracy and in the contribution (line no. 89) you have mentioned that the proposed model obtained 92.42% accuracy.

* The literature should suggest the research gaps in this task. If already enough methods were already implemented, then the question arises why have you attempted this work?

Experimental design

Exhastive experiments are not conducted and detailed study could increase the quality of the paper. Here are few points:
* The figure cited in the methodology (in line no. 199) is not matching with the label.

* In figure 4, you mentioned ensemble classifier, and at the same time you mention the name of only one classifier, i.e., XGBoost which is contradictory. Because if you use the term ensemble, it indicates that you are using more than one classifier.

* Emojis also convey some information. Instead of removing emojis, you can convert them into text.

* Did you check the results without removing stopwords and lemmatizer? As you have used transformer-based model, it captures the context if the sentence structure is there.

* You can avoid giving existing information, rather you can add how you have extracted features from RoBERTa-Large model in an algorithmic way.

* You can avoid explaining ML algorithms. Otherwise, you can just brief it and give more focus on the proposed approach.
* In Line no. 394, “While we experimented with a total of nine ML models.....” There is no information about which feature extraction method is used to train these ML models.

* Did conduct experiments with training all the other ML classifiers with RoBERTa-Large features?

* As the dataset is imbalanced, you could add some experiments related to balance the data and then conduct experiments to check the impact of this.

Validity of the findings

In the abstract, you claim that “....and compared against six state-of-the-art methods, including deep architectures (BERT, AngryBERT, CNN), transformers (Distil-BERT), and logistic regression.” But there are no results in the experiments section related to this.

Additional comments

A careful proofreading is required. There are few sentences (For eg. Line no. 204) which require proper punctuation to increase the readability of the paper.

Cite this review as

Anonymous Reviewer (2025) Peer Review #2 of "Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media (v0.1)". PeerJ Computer Science

Download Original Submission (PDF) - submitted Jan 27, 2025

All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Review History Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media

Summary

Version 0.3 (accepted)

Luigi Di Biasi · Aug 22, 2025 · Academic Editor

Reviewer 4 · Aug 19, 2025

Basic reporting

Experimental design

Validity of the findings

Additional comments

Version 0.2

Luigi Di Biasi · Jul 20, 2025 · Academic Editor

Reviewer 1 · May 15, 2025

Basic reporting

Experimental design

Validity of the findings

Reviewer 3 · Jul 5, 2025

Basic reporting

Experimental design

Validity of the findings

Reviewer 4 · Jul 10, 2025

Basic reporting

Experimental design

Validity of the findings

Additional comments

Version 0.1 (original submission)

PeerJ Staff · Mar 28, 2025 · Academic Editor

Reviewer 1 · Mar 7, 2025

Basic reporting

Experimental design

Validity of the findings

Additional comments

Reviewer 2 · Mar 18, 2025

Basic reporting

Experimental design

Validity of the findings

Additional comments

Review History
Beyond words: a hybrid transformer-ensemble approach for detecting hate speech and offensive language on social media