All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
The authors have addressed all of the reviewers' comments.
[# PeerJ Staff Note - this decision was reviewed and approved by Xiangjie Kong, a PeerJ Section Editor covering this Section #]
Please revise and extend the experiment sections, taking into account the reviewer's suggestions.
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
The authors have validated the effectiveness of the proposed method through comparative experiments; however, the experiment section lacks a detailed analysis of the results, which may weaken the persuasiveness of the method.
no comment
no comment
no comment
Dear authors,
Thank you for your resubmission of the manuscript “Adaptive multitask emotion recognition and sentiment analysis using resource-constrained MobileBERT and DistilBERT: an efficient approach for edge devices.”
After reviewing the revised manuscript and considering the updated evaluation from Reviewer 3, we acknowledge that significant improvements have been made. The manuscript now includes detailed ablation studies, comparative evaluation of imbalance-handling techniques, and enhanced methodological clarity. These revisions address many of the concerns raised in the initial review round.
However, based on the remaining limitations and in line with Reviewer 3's updated comments, we request a further round of major revisions. Below is a summary of the main outstanding issues:
1. Comparative Benchmarking
The manuscript still lacks direct empirical comparison against established baseline models such as BERT, RoBERTa, ERNIE, or other multitask emotion/sentiment classifiers. Stating performance gains without comparative results makes it difficult to assess the practical value of the proposed approach. A benchmark table summarizing key metrics across competing methods on MELD and IEMOCAP is expected.
2. Inference Efficiency and Deployment Metrics
Although model sizes and parameter counts are reported, there is no measurement of inference time, runtime latency, or memory footprint. Given that a major claimed contribution is suitability for edge deployment, these metrics must be quantified and reported (e.g., inference time per sample, RAM/VRAM usage); an illustrative measurement sketch follows this list.
3. Novelty Framing
The stated contribution—the combination of prototypical networks and focal loss for multitask learning on MELD/IEMOCAP—remains mostly architectural. Strengthening the positioning against recent multitask models and explicitly arguing how this method advances the field would help reinforce the contribution.
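For illustration only, the measurements requested in point 2 can be gathered with a short PyTorch script; the checkpoint name, sample input, and run counts below are placeholder assumptions, not the authors' configuration:

```python
# Hedged sketch: per-sample CPU latency and a rough weight footprint for a
# Hugging Face checkpoint. All names and numbers here are illustrative.
import time
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "distilbert-base-uncased"  # placeholder, not the authors' model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

sample = tokenizer("I can't believe this happened!", return_tensors="pt")

with torch.no_grad():
    for _ in range(10):          # warm-up runs
        model(**sample)
    n_runs = 100
    start = time.perf_counter()
    for _ in range(n_runs):
        model(**sample)
    latency_ms = (time.perf_counter() - start) / n_runs * 1000

params = sum(p.numel() for p in model.parameters())
weight_mb = params * 4 / 2**20   # fp32 weights: 4 bytes per parameter
print(f"{latency_ms:.1f} ms/sample, {params/1e6:.1f}M params, ~{weight_mb:.0f} MB weights")
```

Reporting numbers of this shape for each model variant, and for the heavier baselines, would directly substantiate the edge-deployment claim.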
We therefore encourage a focused and final revision that includes:
• A direct performance comparison with baseline models.
• Experimental measurements of deployment efficiency (inference latency, memory usage).
• A concise positioning of the model’s contribution in relation to state-of-the-art methods, preferably in a dedicated subsection.
Please ensure that all changes are reflected both in the manuscript and in a detailed response letter.
We appreciate your efforts thus far and look forward to your revised submission.
Although the authors stated in their response that they are the first to utilise multi-task learning for text-based emotion recognition and sentiment analysis on the MELD and IEMOCAP datasets, the paper still lacks comparative experiments with other state-of-the-art models. It is recommended to include additional experiments to enhance the persuasiveness of the manuscript.
**PeerJ Staff Note:** Please ensure that all review and editorial comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
**Language Note:** The review process has identified that the English language must be improved. PeerJ can provide language editing services - please contact us at [email protected] for pricing (be sure to provide your manuscript number and title). Alternatively, you should make your own arrangements to improve the language quality and provide details in your response letter. – PeerJ Staff
The authors need to make a number of improvements. The problem statement is too broad, leading to vague experimental results. The authors need to clarify which dataset was used for evaluation before claiming that their results are good. At line 226, general pseudocode is given; a clear flowchart of the methods involved would present the experimental design in a more structured way. The primary concern is the evaluation of the model: which dataset was used?
Suggestions for improvement:
1) A detailed explanation is needed for each identified finding, especially at line 409 for Fig. 6 to Fig. 17.
2) Explain how the imbalanced datasets were addressed before conducting the experiments.
In this paper, the authors proposed a multitask learning framework for emotion recognition and sentiment analysis, leveraging MobileBERT and DistilBERT pre-trained models fine-tuned on two emotion-rich datasets: MELD and IEMOCAP. The work integrates prototypical networks to learn task-specific embeddings and employs a focal weighted loss to address class imbalance. Data augmentation was applied using random word deletion to enhance model generalization.
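For concreteness, a class-weighted focal loss of the general kind the paper describes can be expressed in a few lines; the gamma value and per-class alpha weights below are illustrative assumptions, not values taken from the manuscript:

```python
# Hedged sketch of a focal weighted loss; gamma and alpha are made-up values.
import torch
import torch.nn.functional as F

def focal_weighted_loss(logits, targets, alpha, gamma=2.0):
    """logits: (batch, num_classes); targets: (batch,) class indices;
    alpha: (num_classes,) per-class weights for imbalance."""
    log_probs = F.log_softmax(logits, dim=-1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    # (1 - pt)^gamma down-weights easy examples; alpha re-weights rare classes.
    return (-alpha[targets] * (1.0 - pt) ** gamma * log_pt).mean()

# Toy example: 7 emotion classes with invented inverse-frequency weights.
logits = torch.randn(4, 7)
targets = torch.tensor([0, 2, 5, 6])
alpha = torch.tensor([0.5, 1.0, 1.2, 2.0, 1.5, 3.0, 2.5])
print(focal_weighted_loss(logits, targets, alpha))
```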
Overall, I found the paper presents an interesting and timely approach, combining representation learning with multitask objectives. The use of prototypical networks in emotion and sentiment classification, and the application of focal loss to multitask learning in this context, are notable. However, the novelty is somewhat limited to the application and combination of existing methods. In addition, the writing could benefit from restructuring and clarity improvements, particularly in sections describing training procedures and analysis of results.
Despite its potential, this paper needs thorough revision before it can be accepted for publication. The paper has several weaknesses that could be improved as follows:
1. The writing style is often unclear, making it difficult to follow the arguments being presented and to understand the information easily.
2. The paper presents an interesting research topic, but the contributions can be strengthened. For example, one of the key contributions claimed by the authors is that "The proposed lightweight and efficient approach employs MobileBERT and DistilBERT to achieve state-of-the-art performance while significantly reducing computational overhead, making it suitable for deployment on mobile devices and edge platforms." However, the paper does not present any empirical comparison of inference speed, memory usage, or runtime performance against existing state-of-the-art or heavier models such as BERT, RoBERTa, or even other distilled models.
The experimental setup is generally solid, but I identified certain areas related to the experimental design that could be improved:
1. The paper would benefit from a clearer presentation of the datasets’ label distribution and how the random deletion augmentation was validated.
2. The use of focal loss is appropriate for class imbalance, but no comparison is shown against other techniques like class weighting or oversampling. Including this would strengthen the justification for its use.
3. Another area that would improve the paper is a deeper analysis of the data augmentation strategy. The authors mention applying random word deletion to improve generalization, but no quantitative results are presented to demonstrate its impact. For example, report model performance before and after applying augmentation to assess whether it improves robustness or performance on minority classes; a brief sketch of this augmentation step follows below.
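For context, random word deletion is typically a one-line token filter, so the before/after ablation suggested above is inexpensive to run. This is a minimal sketch with an assumed deletion probability, not the authors' implementation:

```python
# Hedged sketch of random word deletion; p is an assumed hyperparameter.
import random

def random_word_deletion(text, p=0.1, seed=None):
    """Drop each whitespace token independently with probability p,
    keeping at least one token so the example never becomes empty."""
    rng = random.Random(seed)
    words = text.split()
    kept = [w for w in words if rng.random() > p]
    return " ".join(kept) if kept else rng.choice(words)

print(random_word_deletion("I really did not expect that ending at all", p=0.2, seed=0))
```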
The results shown in the article seem valid, and sufficient information and data have been provided for meaningful replication. My only suggestion would be to better highlight the contributions in the conclusion.
This paper proposes a multi-task learning approach designed for deployment on resource-constrained devices, aiming to jointly perform Emotion Recognition and Sentiment Analysis. The authors utilize lightweight pre-trained language models (MobileBERT and DistilBERT) and integrate a prototypical network with a focal weighted loss function. The goal is to enhance model performance under conditions of class imbalance and limited computational resources. Experiments are conducted on two public datasets: MELD and IEMOCAP.
1. Lack of key experimental baselines: The paper does not provide comparative results against existing state-of-the-art methods, making it difficult to evaluate the effectiveness of the proposed approach.
2. Missing evidence in the ablation study: Although the ablation analysis is described in the main text, there is no supporting data or figures. The paper lacks comprehensive comparison tables showing performance without individual modules, such as the prototypical network, focal loss, or multitask setting, making it hard to quantify the contribution of each component.
3. Although the paper emphasizes the suitability of the proposed model for low-resource environments by leveraging lightweight models such as MobileBERT and DistilBERT, the experimental section lacks direct comparisons that illustrate the model’s advantages in such scenarios. The paper does not report essential metrics such as model size, inference latency, or memory usage. Without such evaluations, the claim that the model is lightweight and deployable remains unsubstantiated.
4. Writing and presentation need improvement: There are several spelling and language issues, such as the use of "conversion" instead of "conversation" in the abstract. In addition, the figures are somewhat redundant and do not effectively highlight the key contributions of the paper. Further refinement and optimization of the visual content are recommended.
5. Please ensure that all equations are punctuated properly for consistency and readability.
6. Some of the figures (e.g., confusion matrices and classification metrics) appear blurry or low in resolution, which affects readability. Please consider improving the visual quality of these images to enhance the presentation of results.
Paper Summary:
This paper proposes a multi-task learning approach designed for deployment on resource-constrained devices, aiming to jointly perform Emotion Recognition and Sentiment Analysis. The authors utilize lightweight pre-trained language models (MobileBERT and DistilBERT) and integrate a prototypical network with a focal weighted loss function. The goal is to enhance model performance under conditions of class imbalance and limited computational resources. Experiments are conducted on two public datasets: MELD and IEMOCAP.
Strengths:
1. Practical relevance of the topic: The paper targets the problem of multi-task affective understanding on edge devices. The objective is clear and aligns well with current demands for intelligent terminal deployment.
2. Methodological novelty: The integration of prototypical networks and focal weighted loss for multi-task learning is a novel attempt to improve performance under class imbalance and computational constraints.
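To make the prototype mechanism concrete, here is a minimal sketch of prototype-based classification (class prototypes as mean embeddings, logits as negative squared distances to each prototype); the dimensions and class counts are illustrative assumptions, not the authors' code:

```python
# Hedged sketch of a prototypical classification head over sentence embeddings.
import torch

def class_prototypes(embeddings, labels, num_classes):
    """Mean embedding per class, shape (num_classes, dim).
    Assumes every class appears at least once in the batch."""
    return torch.stack([embeddings[labels == c].mean(dim=0)
                        for c in range(num_classes)])

def proto_logits(queries, protos):
    """Negative squared Euclidean distance to each prototype."""
    return -torch.cdist(queries, protos).pow(2)

support = torch.randn(30, 128)               # e.g., encoder [CLS] embeddings
support_labels = torch.randint(0, 3, (30,))  # 3 sentiment classes (toy data)
protos = class_prototypes(support, support_labels, num_classes=3)
queries = torch.randn(5, 128)
print(proto_logits(queries, protos).argmax(dim=-1))
```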
No comments
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.