All reviews of published articles are made public. This includes manuscript files, peer review comments, author rebuttals and revised materials. Note: This was optional for articles submitted before 13 February 2023.
Peer reviewers are encouraged (but not required) to provide their names to the authors when submitting their peer review. If they agree to provide their name, then their personal profile page will reflect a public acknowledgment that they performed a review (even if the article is rejected). If the article is accepted, then reviewers who provided their name will be associated with the article itself.
Dear Author,
Your paper has been revised. It has been accepted for publication in PeerJ Computer Science. Thank you for your fine contribution.
[# PeerJ Staff Note - this decision was reviewed and approved by Vicente Alarcon-Aquino, a PeerJ Section Editor covering this Section #]
I've read through all the answers and comments. I can say that the authors have made a huge effort to improve the manuscript.
**PeerJ Staff Note:** Please ensure that all review, editorial, and staff comments are addressed in a response letter and that any edits or clarifications mentioned in the letter are also inserted into the revised manuscript where appropriate.
Good paper! Thank you team for sharing!
I think that the manuscript is generally written in professional and technically appropriate English. The terminology used is consistent with cybersecurity and machine learning literature. The paper includes a comprehensive and well-structured related work section, appropriately citing both classical and recent works (e.g., ET-BERT, FG-Net, GraphDApp). The gaps in prior art are well articulated.
Areas for Improvement:
- While the literature review is broad, a more critical synthesis of how RDNet specifically advances the state-of-the-art (beyond performance metrics) would improve the contextual positioning.
- Consider annotating Figure 2 with clearer stage-wise labels to aid comprehension of model flow.
- The research question is well defined and addresses a crucial gap: detecting malicious encrypted traffic under active/passive obfuscation. The novelty of RDNet lies in combining statistical, temporal, and spatial features via CNN, BERT, and GAT.
- The proposed architecture (RDNet) is thoughtfully designed. It innovates by introducing an adaptive sliding window (ASW) and an improved PLD algorithm for noise mitigation, which are well justified.
- Details about model architecture, training setup (e.g., layers, optimizers), datasets, and codebase are well presented.
Areas for Improvement:
- The justification for hyperparameters (e.g., top-k in PLD, ∆τ2 threshold) could be more rigorous, for example via a sensitivity analysis.
- Consider expanding on how the thresholds δ and Tmax are chosen for the adaptive window algorithm.
- Algorithms 1 and 2 are useful but would be clearer if they followed standard pseudocode notation conventions.
- The authors use real-world datasets, including the MTA dataset, which contains both obfuscated and non-obfuscated encrypted traffic—a strong choice for evaluating robustness.
- Ablation studies, module analysis, and adaptive vs. fixed window comparisons convincingly demonstrate the contributions of each module.
Areas for Improvement:
- Include standard deviations or confidence intervals in performance metrics (especially for Fig. 7–10 comparisons).
- Add a statistical significance test (e.g., McNemar’s test) when comparing RDNet against SOTA methods like ET-BERT.
- The discussion section could better highlight practical implications (e.g., real-time deployment feasibility, computational overhead of BERT + GAT).
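To illustrate the kind of significance test suggested above, here is a minimal sketch of an exact (binomial) McNemar's test on paired predictions from two classifiers. It is not taken from the manuscript: the function name `mcnemar_exact` and the toy labels are hypothetical, and the two prediction lists simply stand in for the outputs of two models evaluated on the same test samples.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Exact two-sided McNemar's test on paired predictions.

    Returns (b, c, p_value), where b counts samples that only model A
    classified correctly and c counts samples that only model B
    classified correctly.
    """
    b = c = 0
    for truth, a, p in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == truth), (p == truth)
        if a_ok and not b_ok:
            b += 1
        elif b_ok and not a_ok:
            c += 1
    n = b + c
    if n == 0:
        # The two models never disagree on correctness: no evidence of a difference.
        return b, c, 1.0
    # Under the null hypothesis, the discordant pairs follow Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data (hypothetical): 10 samples, all with true label 1.
y_true = [1] * 10
pred_a = [1] * 9 + [0]       # model A is wrong only on the last sample
pred_b = [0] * 8 + [1, 1]    # model B is wrong on the first eight samples
print(mcnemar_exact(y_true, pred_a, pred_b))
```

Only the discordant pairs enter the test, so it asks whether one model's errors are systematically different from the other's rather than whether the overall accuracies differ by some margin.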
Strengths of the manuscript:
- Strong motivation and relevance in the domain of encrypted traffic analysis.
- Clear methodological innovation (PLD + ASW + BERT-GAT fusion).
- Comprehensive experimental evaluation and thorough baselines.
Areas for Improvement:
- Need for clearer statistical significance validation.
- Improve clarity in the explanation of algorithm thresholds and parameter selection.
- Provide runtime benchmarks and memory requirements, especially with BERT and GAT components.
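As a rough illustration of the runtime and memory reporting requested above, the sketch below times repeated calls to an inference function and records peak Python-heap usage. Everything here is an assumption for illustration: `benchmark` and `dummy_model` are hypothetical names, the dummy function stands in for a real model's forward pass, and `tracemalloc` only sees Python allocations, so GPU or native-library memory would need a framework-specific tool.

```python
import statistics
import time
import tracemalloc


def benchmark(fn, batch, warmup=3, repeats=20):
    """Time fn(batch) over several runs and measure peak Python-heap memory."""
    for _ in range(warmup):
        fn(batch)  # warm-up calls are excluded from the timings

    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(batch)
        timings.append(time.perf_counter() - start)

    # Memory is traced in a separate call so tracing overhead does not skew timings.
    tracemalloc.start()
    fn(batch)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    median_s = statistics.median(timings)
    return {
        "median_s": median_s,
        "max_s": max(timings),
        "flows_per_s": len(batch) / median_s if median_s > 0 else float("inf"),
        "peak_python_bytes": peak_bytes,
    }


def dummy_model(batch):
    # Hypothetical stand-in for a model's inference call on a batch of flows.
    return [sum(x * x for x in flow) for flow in batch]


batch = [[float(i)] * 64 for i in range(256)]
print(benchmark(dummy_model, batch))
```

Reporting a median alongside a worst-case latency gives a fairer picture of near-real-time feasibility than a single averaged figure.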
Please check the review report.
The manuscript is generally well-written and organized, but some sentences are verbose and would benefit from stylistic refinement for conciseness and clarity.
The abstract effectively outlines the problem and contributions, though it could more clearly highlight the novelty in comparison with prior art.
The figures (e.g., Fig. 2) are clear, informative, and aid in understanding the methodology, though figure captions could be more descriptive.
References are mostly relevant and current. However, citations to recent literature beyond 2023 on encrypted traffic detection using GNNs and transformers (outside China) could be considered for broader context.
Raw data and implementation details (e.g., dataset access, hyperparameters) are mentioned, but the reproducibility could be improved by sharing code and configurations, especially for RDNet architecture and data preprocessing pipelines.
Some grammatical and typographical issues remain, e.g., “blind fixed time windows” is repeated multiple times and could be rephrased for clarity.
The study presents a comprehensive and well-defined research problem—detecting malicious encrypted traffic under obfuscation, both passive and active.
The methodology is novel, combining PLD-based time series standardization, adaptive burst graph modeling, and spatial-temporal fusion using CNN, BERT, and GAT.
The architecture (Fig. 2) is clear, but the actual structure (layers, sizes, dropout, etc.) of each component should be more precisely detailed. Table 1 is helpful, but some notation is over-complicated (e.g., repeated use of mathematical notation where textual description would suffice).
The adaptive sliding window algorithm is well-motivated but lacks theoretical justification or complexity analysis (e.g., in terms of runtime over large flows).
Evaluation includes ablation and robustness tests. However, more diverse attack scenarios and background traffic conditions would further validate generalizability.
The comparison with strong baselines (ET-BERT, GraphDApp) is appreciated, but it lacks rigorous statistical significance testing (e.g., confidence intervals or variance estimates).
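One simple way to attach a confidence interval to a reported accuracy is a percentile bootstrap over the test set. The sketch below is a generic illustration under assumed names (`bootstrap_accuracy_ci`, toy labels) and is not the authors' evaluation code; the same resampling idea applies to F1 or per-class recall.

```python
import random


def bootstrap_accuracy_ci(y_true, y_pred, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for classification accuracy."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    correct = [int(t == p) for t, p in zip(y_true, y_pred)]
    n = len(correct)

    stats = []
    for _ in range(n_resamples):
        # Resample test samples with replacement and recompute accuracy.
        sample = rng.choices(correct, k=n)
        stats.append(sum(sample) / n)
    stats.sort()

    lo = stats[int((alpha / 2) * n_resamples)]
    hi = stats[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return sum(correct) / n, lo, hi


# Toy data (hypothetical): 100 samples, 90 classified correctly.
y_true = [1] * 100
y_pred = [1] * 90 + [0] * 10
acc, lo, hi = bootstrap_accuracy_ci(y_true, y_pred)
print(f"accuracy={acc:.3f}, 95% CI=[{lo:.3f}, {hi:.3f}]")
```

If the intervals of two methods overlap heavily, a paired test on the same test samples is usually more informative than comparing the intervals alone.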
The results clearly show RDNet outperforming baselines on the MTA dataset and performing competitively on traditional datasets (CIC-IoT, Stratosphere).
Robustness against noise is tested via synthetic perturbation, though this might not fully represent realistic network variability (e.g., packet reordering, congestion-induced delay).
Ablation results confirm that each component contributes meaningfully to performance; however, the BERT module’s improvement could be further dissected (e.g., using attention maps or feature importance).
Generalizability is supported, but more datasets from different network environments would strengthen the findings.
The classification metrics are appropriate for the multiclass problem, but confusion matrices or ROC curves could offer more insight into specific classes' behavior (e.g., false positives in noisy traffic).
Consider releasing the RDNet model and dataset splits publicly to promote reproducibility and community benchmarking.
The novelty of integrating PLD and GAT-BERT in encrypted traffic analysis is commendable, but the manuscript could better position RDNet relative to concurrent transformer-GNN hybrids in other cybersecurity tasks.
Clarify whether real-time or near-real-time operation is feasible given the architecture’s complexity and computational requirements.
The title is accurate, though it may benefit from explicitly referencing “BERT” or “GNN” to attract readers from those subdomains.
The discussion could be expanded to reflect on limitations, e.g., reliance on burst structure, potential evasion by adversarial samples, or limitations in scalability to high-speed networks.
All text and materials provided via this peer-review history page are made available under a Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.