EDCRAN: an enhanced dual-channel residual attention network for real-time DDoS attack detection in IoT

Abdulaleem Ali Almazroi

doi:10.7717/peerj-cs.3531

EDCRAN: an enhanced dual-channel residual attention network for real-time DDoS attack detection in IoT

Department of Information Technology, Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Rabigh, Saudi Arabia

DOI: 10.7717/peerj-cs.3531

Published: 2026-02-03
Accepted: 2025-12-05
Received: 2025-04-17

Academic Editor: Sedat Akleylek

Subject Areas: Artificial Intelligence, Computer Networks and Communications, Security and Privacy, Neural Networks, Internet of Things
Keywords: DDOS attack, Deep learning, Cyber security, Internet of things

Copyright: © 2026 Almazroi
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.

Cite this article: Almazroi AA. 2026. EDCRAN: an enhanced dual-channel residual attention network for real-time DDoS attack detection in IoT. PeerJ Computer Science 12:e3531 https://doi.org/10.7717/peerj-cs.3531

The author has chosen to make the review history of this article public.

Abstract

Cybersecurity is becoming important in healthcare, transportation, and urban infrastructure due to the increased implementation of Internet of Things (IoT) technology. Distribution Denial of Service (DDoS) attacks, which use IoT edge devices’ limited capacity to iterrupt systems, are a major issue. Scalability issues, ineffective feature extraction, and high computing overhead make traditional Intrusion Detection Systems (IDS) unsuitable for real-time operation on restricted devices. Enhanced Dual Channel Residual Attention Network (EDCRAN), a deep learning architecture for accurate and efficient DDoS detection, is introduced in this study to solve these constraints. The Transform Attention Channel (TAC) catches global temporal patterns in network data, whereas the Scaled Convolutional Channel (SCC) focusses on localised, context-aware characteristics. The model learns rich and discriminative representations from varied traffic sources using this dual-channel technique. To address class imbalance, a Dynamic Weighted Balancing Strategy (DWBS) adjusts learning contributions across classes, while Selective Hybrid Extraction (SHE) filters irrelevant features to improve generalisation. Offline tuning with the Golden Jackal Optimisation (GJO) method produces a lightweight model for IoT edge devices. In the IN-DDoS24 dataset, EDCRAN outperforms fifteen benchmark models with 98.9% accuracy, 99.5% AUC, and 22 s inference time. To assess temporal consistency and attack resilience, two new metrics CPS and ACWS are provided. The realistic, scalable EDCRAN solution detects DDoS in real time in next-generation IoT environments.

Introduction

The Internet of Things (IoT) is crucial to healthcare, transportation, farming, and smart cities. Distribution Denial of Service (DDoS) attacks pose a significant danger to IoT ecosystems due to the rapid exchange of data across billions of devices (Sun et al., 2019). Current occurrences, such as the Tbps-scale Mirai assaults, demonstrate the potential for stolen IoT devices to form massive botnets, disrupting critical internet services (Sun et al., 2018). New botnets, such as IoT Reaper, Torii, and Satori, exploit vulnerabilities in routers, webcams, and embedded devices, necessitating more adaptable detection approaches (Liu et al., 2025; Zeng et al., 2025; Tang et al., 2025; Wu et al., 2025). Rule-based intrusion detection systems struggle to keep up with DDoS attacks since they cannot adapt and have fixed signatures. Thus, more individuals are protecting the Internet of Things using Long Short Term Memorys (LSTMs), Recurrent Neural Networks (RNNs), and Conventional Neural Networks (CNNs). The algorithms identified detailed traffic patterns (Rafique et al., 2024; Bukhowah, Aljughaiman & Rahman, 2024). The CICIoT-2023 and TON-IoT datasets offer numerous attack scenarios, thereby simplifying model creation (Saiyed, Al-Anbagi & Hossain, 2024; Zhou et al., 2025; Zhang et al., 2024c; Xu et al., 2025). However, processing costs are considerable, features are often unneeded or overly noisy, and it’s challenging to fit large models on the limited number of Internet of Things devices. Because irrelevant features can mislead classifiers, it is crucial to efficiently select features and reduce dimensionality (Alshdadi et al., 2024).

Quick processing near the data source is also necessary for real-time detection. As billions of devices continually create traffic, centralized systems experience delays (Mahadevappa et al., 2024). On the other hand, edge-based anomaly detection provides decreased latency and enhanced responsiveness. However, many current models are either too cumbersome to implement at the edge or are unable to adapt to the highly unpredictable and uneven nature of IoT traffic, despite these benefits. That is why there has been a lot of focus on lightweight IDS systems (Maiwada et al., 2024). New approaches include attention-based models, such as DDOS Vision Transformer (DDoSViT), small CNN-LSTM hybrids, and networks inspired by MobileNet. Likewise, methods such as adaptive federated learning, Synthetic Minority Oversampling Technique (SMOTE) variations, and Open-Set Recognition have been used to tackle class imbalance (Anley et al., 2024). However, these systems are not particularly effective at capturing long-range temporal relationships, as well as localised burst patterns, and their performance deteriorates when traffic distributions change rapidly.

Lightweight, time-aware, and imbalance-resistant intrusion detection models are necessary for real-world IoT deployments due to these restrictions. Our proposed Enhanced Dual-Channel Residual Attention Network (EDCRAN) is a deep learning architecture that can efficiently and accurately identify DDoS attacks on devices with limited resources. To describe larger temporal relationships, EDCRAN uses Transform Attention Channels (TAC), whereas Scaled Convolutional Channels (SCC) capture fine-grained local patterns. The model can learn both short-term fluctuations and long-range contextual behaviour more successfully than single-stream architectures because of its dual-path structure. To address the extreme class imbalance commonly found in IoT datasets, the Dynamic Weighted Balancing Strategy (DWBS) is employed, thereby further enhancing resilience. The Selective Hybrid Extraction (SHE) module improves feature quality by reducing duplication and highlighting beneficial properties. Moreover, reliable performance is supported in low-power edge situations by efficiently tuning the hyperparameters offline using Golden Jackal Optimisation (GJO). This study’s main contributions are listed below, in line with previous research on lightweight intrusion detection systems and imbalance handling:

1.

IoT-based DDoS detection struggles to capture fine-grained local patterns and long-range traffic interdependence. EDCRAN uses a Scaled Convolutional Channel (SCC) for local feature extraction and a Transform Attention Channel (TAC) for global traffic modelling to address this. These routes and residual attention processes allow the model to learn complicated, multi-dimensional network traffic behaviours that conventional architectures miss.
2.

Biased learning and poor detection of uncommon attacks result from imbalanced class distributions, when benign traffic dominates malignant occurrences. Dynamic Weighted Balancing Strategy (DWBS) addresses this. Adaptively reweighting class contributions and selectively generating synthetic examples for under-represented attack types. Thus, the model improves stealthy or occasional threat sensitivity while maintaining robust generalisation.
3.

Noise, redundancy, and weak relevance can degrade model performance and increase computational cost, especially in resource-constrained IoT scenarios. EDCRAN’s Selective Hybrid Extraction (SHE) method filters low-utility inputs via statistical and nonlinear feature selection. Keeping only the most informative features improves detection accuracy and makes the model lightweight and efficient.
4.

When implemented on edge devices, tuning deep learning models for real-time inference requires significant resources. The model uses Golden Jackal Optimisation (GJO), a computationally efficient metaheuristic algorithm, to adjust the learning rate and dropout ratio. This speeds training convergence and eliminates post-deployment adjusting.
5.

Standard evaluation metrics rarely show how well a model performs over time or under different attack intensities. To solve this gap, two unique metrics are introduced: Contextual Prediction Stability (CPS), which measures prediction temporal consistency, and assault Complexity-Weighted Score (ACWS), which measures detection performance across assault sophistication. These indicators illuminate the model’s real-world reliability and resilience.

The EDCRAN model was trained and tested using the IN-DDoS24 dataset, which represents a diverse range of actual distributed denial-of-service (DDoS) attack scenarios targeting IoT networks. To see how well the algorithm can detect various traffic behaviors and danger patterns, this dataset is ideal. An extensive sensitivity test is conducted on critical hyperparameters to strengthen the model’s reliability further. Among these hyperparameters are the batch size, learning rate, and dropout rate. This test guarantees that the model will maintain consistent and stable operation regardless of the tuning parameters. These design decisions and assessment methodologies demonstrate EDCRAN’s strength, suggesting that it is a dependable solution for protecting future IoT systems from dynamic cyberattacks.

Recent research has yielded lightweight CNN-LSTM models, MobileNet-based feature extractors, Vision Transformer-based detectors, and SMOTE-based algorithms for addressing imbalances; however, most of these systems still have significant issues. The short-lived burst patterns and long-range temporal linkages of Internet of Things traffic may be complex to depict. This is traffic-related. They also lack a framework to correct imbalances, improve operations, and sustain the efficiency of edge deployment. All of these sections are combined in EDCRAN. More specifically, it uses the adaptive Dynamic Weighted Balancing Strategy (DWBS), SHE, and dual-path attention–convolution processing (TAC and SCC) to optimize features non-linearly. This comprehensive approach overcomes oversmoothing, biased learning, duplicated features, and excessive computing costs. Its framework provides a thorough and methodical overview of the outcomes. It begins by reviewing the current methods for detecting DDoS attacks in IoT networks, highlighting key findings and identifying areas that remain unexplored. The following section describes the architecture of the suggested EDCRAN framework, including preprocessing and model improvements. Simulation-based assessments give execution time, classification accuracy, and comparisons to state-of-the-art techniques. The study concludes by reviewing the most notable results and discussing strategies to enhance the flexibility and scalability of real-time Internet of Things network security architecture.

Related work

The possibility of DDoS attacks on network systems has produced research on improved detection methods. The complexity and frequent evolution of new threats challenge statistical and signature-based solutions. Recent research has employed machine learning and deep learning frameworks to identify DDoS attacks with precision and scalability. Hybrid architectures, LSTM networks, and CNNs may uncover patterns in massive, changing datasets. Despite these gains, unbalanced datasets, real-time performance, and novel or hidden attack strategies need further research. Researchers use adaptive and ensemble methods, feature selection, and optimization to improve detection and reduce processing time. The following section reviews key contributions to addressing these concerns. The authors of Anley et al. (2024) introduced a DDoS detection system that enhances classification accuracy across datasets using transfer learning and CNN. Their method identified hazardous activities and attack groups better than baseline models. Its dependence on high-quality datasets and the computational expense of developing adaptive designs limited its real-time usefulness. Hidden attack resilience was another issue. In Wanda & Hiswati (2024), Deep Belief Networks (DBN) were introduced as an alternative to traditional machine learning for automated feature representation. DBNs, based on Restricted Boltzmann Machine layers, may extract hierarchical characteristics from traffic data. In complex attack settings, the model enhanced DDoS detection accuracy with low loss. Scalability to larger datasets was not thoroughly examined, and its greedy layer-wise training increased computer complexity.

In order to identify and counteract DDoS attacks in real-time cloud environments, DeepDefend employed a CNN-LSTM-Transformer hybrid model, as detailed in Ouhssini et al. (2024). By merging evolutionary algorithms for optimal feature selection with deep learning for entropy prediction, our strategy demonstrated high accuracy on the CIDDS-001 dataset. Despite its impressive results, this integrated approach requires a significant amount of resources, making it challenging to employ for large-scale, real-time applications. The authors of Hekmati et al. (2024) emphasized the need for monitoring traffic patterns between nodes to detect IoT DDoS assaults. Moreover, it further discussed correlation-aware designs and found significant performance increases when utilizing transformer and LSTM models, particularly in cases where attackers repeated harmless data. Although the correlation-based strategy performed effectively, it raised concerns about its efficacy in IoT networks with restricted resources. According to Shieh et al. (2024), the CNN-Routing Protocol for Low-Power and Lossy Networks (CNN-RPL) model uses CNN and RPL to detect both known and new DDoS threats. Its excellent accuracy rates, at 99.93% for known attacks and 98.51% for unknown threats, made the model durable. Open-Set Recognition (OSR) was challenging to set up and required extensive testing on big datasets; hence, it couldn’t be utilized on larger datasets. Another study used fuzzy Q-learning and Chebyshev’s inequality to predict cloud workloads and prevent low-rate DDoS attacks. The technique anticipated server loads and discovered concealed behavior, reducing false positives. However, further study is required to assess its effectiveness in diverse and dynamic cloud systems, especially in multi-tenant environments (Kumar, Dutta & Pranav, 2024). Across unbalanced datasets, the adaptive federated learning framework (FLAD), as suggested in Doriguzzi-Corin & Siracusa (2024), improved convergence times and DDoS detection accuracy. FLAD removed centralized test data by dynamically assigning computation to challenging attack scenarios. The accuracy is improved; however, there is still a need for further study into the issue of scalability in large-scale Internet of Things and cross-dataset contexts. The first description of lightweight cybernetic models was in Salih & Abdulrazaq (2023). These models utilized CNN-LSTM architectures that reduced processing overhead by employing fewer than 225,000 learnable parameters. On the imbalanced CIC-DDoS2019 dataset, the models achieved an impressive 99.99% accuracy. However, it is not used in as many scenarios since it is not thoroughly evaluated in real-life contexts, which are constantly changing.

In Ali et al. (2025), a model called DDoSViT is introduced to enhance DDoS detection in IoT environments, particularly for safeguarding Over-The-Air (OTA) firmware updates. This model is built on Vision Transformer (ViT). The model can identify both subtle and high-volume DDoS attacks, leveraging the self-attention characteristics of Vision Transformers to capture temporal patterns in network data effectively. Using a wide variety of IoT datasets, the model demonstrated exceptional performance in detecting fraudulent traffic. A significant decrease in the frequency of false positives accompanied an increase in accuracy. Despite encouraging performance metrics, transformer-based designs are too complex to be useful in low-power edge contexts, and real-time deployment remains a significant challenge. Using ensemble Graph Neural Networks (GNNs), a method for identifying DDoS attacks was proposed in a related study (Bakar et al., 2024). This approach improved the F1-score by 3.2% when compared to baselines using a combination of hierarchical GNNs and ensemble approaches such as boosting and bagging. The process was not scalable, even though it improved performance. The computational burden was considerably raised because of the reliance on ensemble learning and substantial hyperparameter tuning, which made real-world deployment more complex. The MMEDRL-ADM framework, suggested in Paidipati, Kurangi & Uthayakumar (2024), aims to detect DDoS attacks in Software-Defined Networking (SDN) settings and is another significant contribution. This design makes use of state-of-the-art metaheuristic algorithms to optimise an ensemble reinforcement learning approach. For feature selection, we use Grasshopper Optimisation, and to fine-tune the model hyperparameters, we use African Buffalo Optimisation. While these methods operate together to improve detection accuracy, there are obstacles to scalability and deployment caused by the framework’s complexity and resource requirements. These approaches had good detection performance, but SDN-based datasets’ computational intensity and restrictions limited their use. In Benmohamed, Thaljaoui & Elkhediri (2024), the E-SDNN design improves detection accuracy with 99.94% on CICDS2017 and 98.86% on CICDDoS2019 using stacked multi-layer perceptrons in an encoder-based framework. However, its vast feature selection and lack of real-time validation hindered its real-world usability.

The Federated Learning-based FL-DAD approach (Alhasawi & Alghamdi, 2024) introduced a decentralised perspective for privacy-preserving DDoS detection in IoT networks. This convolutional neural network-based technique improved detection without centralised data collecting. FL-DAD performed well on the CICIDS2017 dataset, but higher-scale IoT deployments are needed to assess its scalability and computational efficiency. The MSCBL-ADN model in Yin, Fang & Liu (2024) uses bi-directional LSTM and multi-scale CNN architectures to protect against low-rate DDoS attacks. The model achieved 96.74% accuracy on the ISCX-2016-SlowDos dataset using a hybrid feature extraction process with sliding window methods and arbitration networks. Even with these results, its adaptability was constrained because the architecture was tightly connected to dataset structures. A study (Shukla, Krishna & Patil, 2024) found that leveraging IoT traffic for DDoS detection resulted in an 84.4% reduction in feature dimensionality and a 5.19% increase in classification accuracy compared to earlier models. The framework improved multiclass classification with cross-validation. It needs validation under real-world IoT traffic scenarios and scalability improvements across various deployment environments. A Temporal Convolutional Network (TCN)-based frequency monitoring approach was proposed in Akhi, Eising & Dhirani (2025) to prevent DDoS assaults in 5G Healthcare-IoT networks. By merging adaptive thresholding with temporal feature extraction, the approach has the potential to uncover anomalous traffic frequency patterns. It demonstrates a high level of sensitivity and resilience to aggression. To secure critical IoT equipment in healthcare, research emphasizes the importance of capturing temporal dynamics. It does, however, highlight specific problems that may arise when applying the approach to a broader range of IoT applications and addressing the computing demands of real-time operations. An outline of the existing literature is shown in Table 1.

Table 1:

Summary of the literature review.

Ref	Objective	Methodology	Key findings	Limitations
Anley et al. (2024)	Propose adaptive transfer learning with CNN for DDoS detection.	Leveraged CNNs and transfer learning on publicly available datasets.	Achieved high accuracy for attack detection and classification.	High computational cost; limited robustness to unseen attack patterns.
Wanda & Hiswati (2024)	Detect DDoS attacks using Deep Belief Networks.	Utilized DBN with Restricted Boltzmann Machines for feature representation.	Enhanced detection accuracy in complex attack settings.	Scalability and training complexity were major concerns.
Ouhssini et al. (2024)	Real-time DDoS detection in cloud using hybrid deep models.	Combined CNN, LSTM, and Transformer with genetic algorithm-based feature selection.	Achieved high accuracy on CIDDS-001 dataset.	High resource demands challenge scalability.
Hekmati et al. (2024)	Detect DDoS attacks using correlation-aware models in IoT.	Used LSTM and Transformer architectures to model traffic across nodes.	Improved performance in detecting disguised attacks.	Computational burden limits use in resource-constrained environments.
Shieh et al. (2024)	Detect known and unknown DDoS attacks using CNN-RPL.	Combined CNN with Reciprocal Points Learning and Open-Set Recognition.	Achieved over 99.93% accuracy for known and 98.51% for unknown attacks.	Complex OSR implementation and limited large-scale testing.
Kumar, Dutta & Pranav (2024)	Predict server load to mitigate low-rate DDoS attacks.	Used fuzzy Q-learning with Chebyshev’s inequality for proactive detection.	Reduced false positives and detected stealthy traffic.	Requires validation in diverse cloud scenarios.
Doriguzzi-Corin & Siracusa (2024)	Improve FL-based DDoS detection on unbalanced datasets.	Proposed adaptive federated learning with task-specific partitioning.	Achieved faster convergence and high accuracy.	Cross-dataset generalization remains a challenge.
Salih & Abdulrazaq (2023)	Design lightweight CNN-LSTM for cyber DDoS detection.	Implemented under 225k learnable parameter architecture.	Reached 99.99% accuracy on CIC-DDoS2019 dataset.	Needs real-world testing for generalizability.
Ali et al. (2025)	Enhance OTA firmware protection in IoT via DDoSViT.	Applied Vision Transformer to model temporal traffic patterns.	Reduced false positives and improved precision.	Transformer complexity impairs edge deployment feasibility.
Bakar et al. (2024)	Boost DDoS detection using ensemble GNNs.	Combined hierarchical GNNs with bagging and boosting.	Improved F1-score by 3.2% over traditional models.	Requires extensive hyperparameter tuning and high compute resources.
Paidipati, Kurangi & Uthayakumar (2024)	Detect DDoS in SDN using reinforcement learning.	Integrated MMEDRL-ADM with African Buffalo and Grasshopper optimization.	Achieved strong performance in SDN datasets.	Real-time feasibility and dataset availability are issues.
Benmohamed, Thaljaoui & Elkhediri (2024)	Increase detection on benchmark datasets with E-SDNN.	Used encoder-stacked DNNs with automated feature selection.	Scored 99.94% (CICIDS2017) and 98.86% (CICDDoS2019).	Needs online validation and reduced feature engineering overhead.
Alhasawi & Alghamdi (2024)	Promote privacy-aware DDoS detection in IoT.	Used CNN within a federated learning (FL-DAD) framework.	Enabled decentralized detection with good performance.	Needs study on scalability in dense IoT deployments.
Yin, Fang & Liu (2024)	Detect low-rate DDoS with multi-scale CNN-biLSTM.	Employed sliding windows and arbitration networks.	Achieved 96.74% on ISCX-2016-SlowDos.	Model tied to specific dataset configuration.
Shukla, Krishna & Patil (2024)	Perform multi-class DDoS detection on IoT traffic.	Applied feature reduction with cross-validation.	Reduced features by 84.4% and improved accuracy by 5.19%.	Needs broader testing in live IoT scenarios.
Akhi, Eising & Dhirani (2025)	Develop a robust DDoS detection strategy in 5G Healthcare-IoT environments.	Proposed a monitoring frequency-based detection system using Temporal Convolutional Networks (TCNs) and a dynamic thresholding mechanism. Simulated traffic using Cooja and ns-3 with MQTT and UDP protocols.	Achieved 99.98% accuracy for detection and 95–99.99% prediction accuracy across MQTT and UDP datasets. Effectively mitigated 80% of DDoS nodes and outperformed BiLSTM and CNN-based models.	Limited real-world deployment; effectiveness in highly heterogeneous H-IoT scenarios yet to be tested.

DOI: 10.7717/peerj-cs.3531/table-1

Problem statement

A significant threat to network security, DDOS has increased in tandem with the proliferation of Internet of Things devices. Modern distributed DDoS assaults are complex and dynamic, making it difficult to control them using static statistical and signature-based approaches. Since these attacks employ low-rate, high-volume, and covert techniques, we require more robust detection systems that can effectively navigate the challenges posed by the Internet of Things. Recent advancements in deep learning and machine learning have led to the development of CNNs, LSTM networks, and hybrid models capable of detecting DDoS attacks. When analysing attack patterns, DBN-based frameworks (Wanda & Hiswati, 2024) and CNN-LSTM-Transformer models (Ouhssini et al., 2024) demonstrate a high level of accuracy. Many models struggle with scaling because they are not well-suited for handling large datasets and various types of IoT traffic. Though they were restricted in their ability to generalize, the lightweight models given in Salih & Abdulrazaq (2023) demonstrated remarkable accuracy in actual applications. Using computationally costly ensemble techniques, including GNN-based approaches, in Internet of Things contexts with limited resources is not practicable (Bakar et al., 2024) due to their significant resource demands. The computational requirements of MMEDRL-ADM (Paidipati, Kurangi & Uthayakumar, 2024) and CNN-LSTM-Transformer (Ouhssini et al., 2024) restrict their application in vast Internet of Things networks, while their detection rates show promise. Since malicious traffic is more prevalent than benign traffic, the situation is far worse, and the imbalance in the statistics of IoT traffic is even more apparent. The DDoSViT-based framework (Ali et al., 2025) and the Federated Learning-based DDoS detection (Alhasawi & Alghamdi, 2024) are two examples of attempts to rectify this imbalance. More advanced and adaptable balancing techniques are nevertheless needed to ensure accurate detection in various attack scenarios.

Recent DDOS threats, such as zero-day attacks and advanced persistent threats (APTs), further aggravate the situation. Even though Open-Set Recognition (OSR) frameworks (Shieh et al., 2024) have demonstrated that they are resistant to unknown attacks, scalability and validation on larger datasets remain issues. In addition, Internet of Things devices sometimes have resource limitations, which makes it challenging to build models that need a lot of processing power. Despite adaptive federated frameworks (Doriguzzi-Corin & Siracusa, 2024) and lightweight architectures (Salih & Abdulrazaq, 2023; Huang & Wang, 2025) partially addressing this problem, it remains crucial to develop resource-efficient solutions for low-power Internet of Things installations. This requires efficient feature selection and optimization to achieve the goal of improving detection accuracy while simultaneously reducing computational cost. It has been established that timestamp-based clustering (Akhi, Eising & Dhirani, 2025) and the DDoSViT-based architecture (Ali et al., 2025) are effective in improving performance; however, both methods need considerable preprocessing and manual setup, which restricts their use in dynamic Internet of Things contexts. Despite these advancements, a shortage of specialized detection techniques for the IoT remains, which are scalable, real-time, and resource-efficient. Current models cannot generalize to a wide variety of Internet of Things contexts and datasets; innovative solutions are necessary. It is vital to have real-time capabilities, adaptable architectures, and advanced feature selection to overcome these restrictions. To facilitate future research and applications, these frameworks must be resilient, efficient, and responsive to the ever-changing threats posed by the IoT.

Proposed methodology

The novel EDCRAN architecture detects and mitigates DDoS attacks in IoT networks. This approach accurately and efficiently classifies harmful actions, despite the multidimensional and heterogeneous nature of IoT traffic data. Extensive data preprocessing, dynamic feature selection, and a hybrid dual-channel architecture are used to maximize performance. The framework is trained and verified under realistic settings using the IN-DDoS24 dataset of DDoS attack scenarios. In addition, a unique Dynamic Weighted Balancing Strategy (DWBS) addresses dataset imbalance, and Selective Hybrid Extraction (SHE) refines feature selection to improve classification accuracy and computing efficiency. Figure 1 shows how our system model is presented architecturally.

Figure 1: Proposed method architectural representation.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-1

Dataset description

This study uses the Kaggle-hosted open-source IN-DDoS24 dataset. The dataset shows real IoT network traffic from a number of various types of devices functioning normally. It includes a number of distinct DDoS assault scenarios. It is a suitable option for evaluating modern intrusion detection algorithms since it is both realistic and varied.

Data collection and labeling: The packets from the network were organized into flows using the conventional 5-tuple. The protocol, source and destination IP addresses, and ports were all part of this. To label each flow, we utilized scripts that generated attacks, device logs, and execution records, which were time-stamped, that we obtained during the compilation of the dataset. This ensured that legitimate ground truth was used to classify both positive and negative traffic features.

Feature set: This dataset encompasses a wide range of data types, including statistical, temporal, and protocol-level information. Intrusion detection in the IoT is its primary function. All the attributes used in this inquiry are detailed in Table 2. In addition to features, this table also includes descriptions and classifications.

Table 2:

Comprehensive overview of the dataset’s features.

S. No	Features of the dataset	Features short description
1	Inter-arrival time	Measures the duration between consecutive packets in the traffic flow.
2	Device type	Categorizes the IoT device generating traffic (e.g., Sensor, Camera).
3	Timestamp	Records the specific time and date of the network activity.
4	Attack duration	Represents the length of time the detected attack persisted in seconds.
5	Source port	Refers to the port number utilized by the originating device.
6	Operating system	Specifies the OS of the IoT device (e.g., Linux, RTOS).
7	Packet size	Captures the total size of an individual packet in bytes.
8	Bytes in flow	Details the cumulative data volume transmitted during the flow.
9	Protocol	Identifies the protocol used for communication, such as TCP or UDP.
10	Anomaly score	Quantifies how much the traffic deviates from normal patterns.
11	Firmware version	indicates the specific firmware version of the IoT device.
12	Destination port	Shows the port number of the target device in the communication.
13	Packets in flow	Reflects the number of packets exchanged within the data flow.
14	Flow duration	Represents the total duration of the flow activity in seconds.
15	Rate of packets	indicates the number of packets transmitted per second.
16	Unique source count	Tracks the count of distinct source IPs within the flow.
17	Source IP	The originating IP address of the traffic.
18	Attack type	Specifies the nature of any detected malicious activity (e.g., SYN Flood).
19	Payload length	Represents the actual data content size within the packet.
20	Destination IP	Refers to the target IP address of the transmitted traffic.
21	Average packet size	Computes the mean size of packets within the flow.
22	Labels	Contains classification labels indicating the types of attacks observed.
23	Unique destination count	Indicates the variety of destination IPs encountered during the flow.
24	Target device	Identifies the specific IoT device targeted during an attack.

DOI: 10.7717/peerj-cs.3531/table-2

Class distribution: The dataset is very skewed, which is unfortunately typical of IoT systems. These attacks are relatively uncommon, with mixed-vector DDOS, HTTP flood, and ICMP flood being among the most unusual. Typical attacks on this group include regular traffic. Our solution was the Dynamic Weighted Balancing Strategy, also known as DWBS. To redirect the system’s attention and inspection efforts toward attack classes that are currently receiving insufficient attention, this method is being implemented.

Preprocessing steps: The training set was created by removing duplicate or broken flow entries and handling missing values in accordance with the guidelines provided by the first dataset. To standardize numerical data and alter categorical markers, the study used min-max scaling and one-hot encoding, respectively. Moreover, further filtering is performed using the Selective Hybrid Extraction (SHE) module to remove inconsistencies and preserve only the most essential features.

Train/Validation/Test split. To make it simpler to test a reliable model, the dataset was divided into three sections: 70% for training, 15% for validation, and 15% for testing. To ensure that each category had a representative number of unexpected attack types, a stratified sampling procedure was used.

Data preprocessing

Some preprocessing processes are used to make sure the dataset is clean, organized, and ready to be modeled (Mishra et al., 2020; Wang et al., 2024; Zhang et al., 2024b; Qiu et al., 2025). This procedure employs a collection of distinct methods, each tailored to deal with the dataset’s inherent heterogeneity and unique traits. Data normalization begins with bringing all continuous attributes’ scales into agreement on a single range, usually between 0 and 1. Having widely varying characteristic magnitudes aids convergence of numerous machine learning techniques. In order to normalize, the following equation must be obtained:

(1) $X^{*} = \frac{X - min (X)}{max (X) - min (X)}$ with $X^{*}$ standing for the scaled feature, X for the raw attribute value, and $min (X)$ and $max (X)$ for its smallest and maximum observed values, respectively. Once normalization is complete, any outliers are selected and eliminated. As the dataset includes traffic data with very high and very low values, it is necessary to detect and deal with outliers. We take a fresh tack by updating the IQR method with a dynamic scaling factor that changes with the feature distribution; this makes our technique extremely advanced. The equation for identifying outliers adjusted:

(2) $O u t l i e r = W \leq Q 1^{'} - m \cdot I Q R^{'} o r W \geq Q 3^{'} + m \cdot I Q R^{'} .$

The third quartile of the dataset is represented as $Q 3^{'}$ and the first quartile is $Q 1^{'}$ . To compute the interquartile range (IQR’), subtract Q3′ from Q1′ and use $m$ as a scaling factor to determine outlier identification sensitivity. The value of $m$ is fixed to 1.5 for standardization, but may be adjusted depending on dataset features. The next step involves managing the categorical variables ‘Device Type’ and ‘Operating System’. Frequency encoding substitutes categorical variables with their frequency instead of one-hot encoding. Use this equation:

(3) $W^{'} = \frac{g (W)}{k} .$

The transformed variable is represented by W′, the category count by $g (W)$ , and the total number of occurrences in the dataset is explicitly given by $k$ . Despite reduced dimensionality, each category remains significant. Next, identify the most informative aspects and remove the rest. To determine feature significance, recursive feature removal and correlation matrix are used, using the equation (Mishra et al., 2020):

(4) $F e a t u r e I m p o r t a n c e = \frac{\sum_{j = 1}^{m} | s_{j, k} |}{m}$ $s_{j, k}$ specifies the correlation coefficient between characteristics $j$ and $k$ , and $m$ indicates the total number of features. By quantifying the relative importance of each attribute in predicting the target label, this method improves and simplifies the model. We use a special preprocessing strategy to clean, organize, and optimize the dataset. This ensures that DDoS detection models perform effectively in Internet of Things scenarios. Every step of the process is improved in terms of accuracy, consistency, and reliability, and data issues are resolved using this strategy.

Advanced data preparation: balancing and feature optimization

The two crucial steps after initial data preparation are addressing the imbalance in the class distribution and prioritizing the features that will be most valuable for training the model (Susan & Kumar, 2021; Zhang et al., 2024a; Yue et al., 2025). Having said that, both of these things are vital. Due to underrepresentation of certain attack types in unbalanced datasets, inaccurate generalization and biased learning may result. Maintaining a customized approach throughout the prediction process is crucial for accurately representing all classes, including the ones with less frequent attack types. By extracting and selecting additional features, it is also possible to speed up and improve DDoS detection in IoT networks. To increase the feature space and balance the dataset, both methods are shown in Algorithm 1.

Algorithm 1 :

Advanced data preparation process.

1: Input: Raw dataset D, target labels L, features F

2: Output: Balanced dataset

D_{b a l a n c e d}

, optimized features

F_{o p t}

D_{p r o c e s s e d} \leftarrow P r e p r o c e s s (D)

F_{s e l e c t e d} \leftarrow F e a t u r e S e l e c t i o n (D_{p r o c e s s e d}, L)

5: for each class

Y_{k} \in L

v_{k} \leftarrow \frac{1}{\sqrt{| Y_{k} |}} \cdot \frac{M_{t o t a l}}{M_{c l a s s e s}}

7: end for

8: for each minority class

Y_{k}

9: while not enough samples in

Y_{k}

10: Select random instance

N_{m i n} \in Y_{k}

11: Select nearest neighbor

N_{n e a r} \in Y_{k}

12: Generate new sample

N_{n e w} \leftarrow N_{m i n} + β \cdot (N_{m i n} - N_{n e a r})

13: Add

N_{n e w}

to the dataset

D_{b a l a n c e d}

14: end while

15: end for

16:

F_{w e i g h t e d} \leftarrow S e l e c t i v e F e a t u r e W e i g h t i n g (F_{s e l e c t e d}, L)

17:

F_{o p t} \leftarrow N o n L i n e a r F e a t u r e E x t r a c t i o n (F_{w e i g h t e d})

18: return

D_{b a l a n c e d}, F_{o p t}

DOI: 10.7717/peerj-cs.3531/table-101

Dynamic weighted balancing strategy

To fix the dataset’s class imbalance, this study utilizes the DWBS. Limiting oversampling and undersampling is critical to prevent overfitting and data loss when employing DWBS. To mitigate class effects, it uses a more flexible approach throughout the resampling process. The calculated plan assigns more weight to infrequently occurring courses and less weight to frequently occurring ones. Selective sample synthesis is used. Doing so will allow the model to learn from any group without bias or self-repetition. A dynamic weighting formula is the basis of DWBS. The impact of the class on the dataset can be determined using the following formula, which is referred to as Class $C_{i}$ . It considers the makeup and size of the sample. Factors determining class weight:

(5) $ω_{i} = \frac{1}{\sqrt{s_{i}}} \cdot \frac{S_{t o t a l}}{S_{l a b e l s}} .$

The variables $ω_{i}$ and $s_{i}$ represent different aspects of the dataset. $S_{t o t a l}$ represents the total number of samples, and $S_{l a b e l s}$ represents the count of unique class labels. Without drastically altering the distribution, the square root function gives underrepresented groups more weight. This produces a smoothing effect. To further enhance the representation of underrepresented groups, synthetic examples are generated through controlled interpolation. Instead of only updating or duplicating the data points, this method does more. Data points that are more likely to be accurate are instead added by using the structure of the present class members. This synthetic sample was created using the following formula:

(6) $ξ_{n e w} = ξ_{a} + λ \cdot (ξ_{a} - ξ_{b}) .$

In this particular equation, the symbol $ξ_{n e w}$ represents the newly introduced synthetic data point. Item $x i_{b}$ is one of the most similar instances in the minority class to instance $x i_{a}$ , and there is another example from the minority class that is similar to it. It should be remembered that $λ$ may be any integer from 0 to 1. Because interpolation maintains the data space’s geometric structure, further samples will always be able to match the distribution. We are preparing the SHE module for training by refining it. This component helps prevent the slowdown of model learning by removing features that are noisy, useless, or of low quality. Both convergence and generalization are improved when the model is trained on the most informative characteristics. This is especially useful in complex IoT situations with high-dimensional, inconsistent data. Overall, DWBS improves the model’s capacity to classify rare but critical attack classes. By avoiding majority class tendencies from dominating the decision boundary, it promotes fairer and more balanced learning across all categories.

Feature optimization: selective hybrid extraction

Optimizing the feature set with data balance improves model performance. SHE is used to integrate feature selection and extraction into a single framework. SHE uses features’ statistical significance and non-linear data correlations.

In the first stage of SHE, known as Selective Feature Weighting (SFW), the relationship between each feature $H_{j}$ and the target labels determines the weighting of each feature. The equation is used to compute the weight of each feature, denoted as $Q_{j}$ .

(7) $Q_{j} = \frac{\sum_{m = 1}^{L} | s_{j m} |}{L} \cdot \frac{1}{1 + e x p (- δ \cdot Θ_{j})}$ $s_{j m}$ represents the correlation between feature $H_{j}$ and target label $T_{m}$ , L is the number of target labels, $δ$ is a tuning parameter, and $Θ_{j}$ is the variance of feature $H_{j}$ . This weight has been changed to account for the dataset’s linear correlations and feature variability.

In the second step of SHE, Non-linear Feature Extraction (NLFE) transforms a subset of features to capture higher-order interactions. NLFE maps chosen characteristics into higher-dimensional space using a kernel-based technique. Definition of transformation:

(8) $R_{j} = ϕ (H_{j}) = e x p (- γ \cdot ‖ H_{j} - H_{m e a n} ‖^{2})$ $ϕ (H_{j})$ is the non-linear transformation of feature $H_{j}$ , $γ$ is the kernel width, and $H_{m e a n}$ is the aggregate feature mean. This non-linear mapping reveals patterns that traditional approaches overlook.

DWBS is used for balancing, while SHE is used for feature optimization, balancing, and modifying the dataset to maintain the most helpful information for predictive modeling. These dataset-specific strategies improve model training and performance.

Classification: enhanced dual channel residual attention network

After preprocessing and optimising the dataset, a strong classification model called the EDCRAN is provided. To accurately identify DDOS attacks in IoT networks, this new model integrates the characteristics of EfficientNet (Navaneethakrishnan et al., 2024) and Vision Transformer (ViT) (Yin et al., 2022). A residual attention mechanism allows the EDCRAN’s two primary channels, the TAC and the SCC, to combine their outputs. The layered network is shown visually in Fig. 2.

Transform attention channel

Using transformer-based attention, the TAC can capture the long-range relationships within input feature maps. Patch embedding, multi-head self-attention, and a projection layer comprise this channel, enabling it to learn global context successfully.

Multi-Head Self-Attention and Patch Embedding: For computational efficiency, TAC flattens and projects the input feature map U into a lower-dimensional space after dividing it into non-overlapping patches. Every patch $U_{q}$ is combined in this way:

(9) $G_{q} = U_{q} \cdot B_{e}$ $G_{q}$ represents the embedded patch, while $B_{e}$ stands for a learnable weight matrix.

The model can grasp dependencies across the whole input space after processing the embedded patches using a multi-head self-attention layer. The process for determining the attention scores is done for every attention head $l$ .

(10) $M_{l} = s o f t m a x (\frac{P_{l} \cdot T_{l}^{T}}{\sqrt{z_{l}}})$ where $z_{l}$ is the dimensionality of the query/key and $P_{l}$ , $T_{l}$ , and $V_{l}$ are the corresponding matrices for the query, key, and value for the given head $l$ . Each head’s output is calculated as follows:

(11) $H_{l} = M_{l} \cdot V_{l} .$

To create the globally contextualized feature map $H_{T A C}$ , a projection layer is used to combine and send the final output from each attention head. Adjusting the dimensions to align with the subsequent layers is the job of this layer. This study examines the evolution of traffic patterns, which helps identify complex DDOS attacks.

Scaled convolutional channel

The SCC uses depthwise separable convolutions to capture localised, high-resolution information, drawing inspiration from EfficientNet. Batch normalization and scaling follow depthwise separable convolutional layers in this channel. Each layer of a depthwise separable convolution network uses a depthwise convolution and a pointwise convolution to scale the network. As shown below, this method ensures the security of spatial information by using a distinct filter for every input channel:

(12) $K_{S C C} = σ (D_{d} * U + D_{p})$ $D_{d}$ and $D_{p}$ represent depthwise and pointwise convolution weights, respectively. $*$ represents the convolution operation, and $σ$ is a non-linear activation function like ReLU.

At each level, a scaling factor, denoted as $η$ , is used to modify feature dimensions dynamically. A scaling definition is:

(13) $J_{s c a l e d} = η \cdot K_{S C C}$ $J_{s c a l e d}$ reflects the SCC’s scaled feature map output.

By capturing subtle, localized fluctuations, SCC enhances sensitivity to sudden volumetric spikes, which are typical of many-layered IoT-based attacks.

Residual attention fusion

A residual attention fusion module combines TAC ( $H_{T A C}$ ) and SCC ( $J_{s c a l e d}$ ) outputs. This technique enhances gradient flow, feature learning, and information retention by integrating both global and local characteristics. Definition of fusion:

(14) $R = θ \cdot (H_{T A C} + J_{s c a l e d}) + κ .$

Learnable parameters $θ$ and $κ$ govern the fusion weight. Finding a happy medium between static global data and dynamic local data becomes much easier with this. Leading the categorization is: A fully connected layer plus a softmax layer provide the combined representation of the class probabilities, denoted as R. The softmax function may be described as follows:

(15) $Q (y = c | R) = \frac{e x p (R_{c})}{\sum_{j} e x p (R_{j})} .$

Each class’s output is represented by $R_{j}$ , whereas class c’s probability is given by $Q (y = c | R)$ . Regarding the possibility of a DDOS attack, the classification head is the final location where information is disseminated. The EDCRAN paradigm employs a dual-channel configuration, as detailed in Algorithm 2. To extract all the characteristics, we utilize rapid convolutional processing and global attention. Improved data categorization accuracy is the outcome of this structure’s ability to understand complex patterns in data collected from IoT network traffic. With its speed and coverage, EDCRAN is an excellent choice for IoT network security applications that require handling large-scale data in real-time. Because TAC and SCC are also components of EDCRAN, this is indeed the case.

Algorithm 2 :

EDCRAN: enhanced dual-channel residual attention network.

1: Input: Raw input feature map U

2: Output: Predicted class probabilities

Q (y ∣ R)

3: Initialize: Projection weights

B_{e}

, depthwise and pointwise convolution kernels

D_{d}, D_{p}

, scaling factor

η

, fusion weights

θ

, and bias term

κ

4: Step 1: Temporal Attention Channel (TAC)

5: Divide the input feature map U into non-overlapping spatial-temporal patches

U_{q}

6: for each patch

U_{q}

7: Generate patch embedding using a linear projection:

G_{q} = U_{q} \cdot B_{e}

8: end for

9: Concatenate all embeddings to form a sequence:

G = [G_{1}, G_{2}, \dots, G_{n}]

10: for each attention head l do

11: Compute scaled dot-product attention scores:

M_{l} = s o f t m a x (\frac{P_{l} \cdot T_{l}^{T}}{\sqrt{z_{l}}})

12: Generate attended output:

H_{l} = M_{l} \cdot V_{l}

13: end for

14: Concatenate outputs from all heads and project to obtain TAC representation:

H_{T A C}

15: Step 2: Spatial Convolution Channel (SCC)

16: Apply depthwise separable convolution to extract spatial features:

17:

K_{S C C} = σ (D_{d} * U + D_{p})

18: Scale the SCC output to match TAC dynamics:

19:

J_{s c a l e d} = η \cdot K_{S C C}

20: Step 3: Residual Attention Fusion

21: Fuse the outputs from TAC and SCC channels using residual attention:

22:

R = θ \cdot (H_{T A C} + J_{s c a l e d}) + κ

23: Step 4: Classification Head

24: Pass the fused representation through a fully connected layer followed by a softmax function to produce class probabilities:

25:

Q (y = c ∣ R) = \frac{\exp (R_{c})}{\sum_{j} \exp (R_{j})}

26: return

Q (y ∣ R)

DOI: 10.7717/peerj-cs.3531/table-102

Hyperparameter tuning with GJO

The GJO technique is used to optimize the hyperparameters of the EDCRAN model. Based on the golden jackal’s cooperative hunting, GJO efficiently explores and exploits the hyperparameter space. The optimization of the hyperparameters is represented by $V_{j}^{(t + 1)}$ and the optimal solution is denoted by $V_{b e s t}$ . The candidate who is performing is responsible for controlling the tracking intensity with $γ$ , adjusting the attacking phase with $δ$ , and increasing search space coverage and diversifying exploration with $Λ$ .

(16) $V_{j}^{(t + 1)} = V_{b e s t} - γ \cdot | V_{j}^{(t)} - V_{b e s t} | + δ \cdot Λ .$

The optimization of the hyperparameters is represented by $V_{j}^{(t + 1)}$ and the optimal solution is denoted by $V_{b e s t}$ . The candidate who is performing is responsible for controlling the tracking intensity with $γ$ , adjusting the attacking phase with $δ$ , and increasing search space coverage and diversifying exploration with $Λ$ . GJO adaptively controls learning rate, batch size, and regularization in the EDCRAN model. Increasing the model’s exploration and exploitation improves its performance and helps it find ideal hyperparameters. GJO, which optimizes model hyperparameters using HPC servers, only works offline. By implementing these changes, we can ensure that the EDCRAN model works effectively before its implementation. The final model and parameters are sent to IoT edge devices after training. Inference does not need optimization or convergence overhead when everything is clear. Due to its minimal and efficient runtime, this method is suitable for resource-constrained environments. GJO’s quick convergence and small CPU footprint enable low-power Internet of Things devices to run the EDCRAN model in real-time without compromising performance.

Performance evaluation metrics

The Attack Complexity Weighted Score (ACWS) and Contextual Prediction Stability (CPS) are two new measures introduced in this study. Along with the more common metrics, such as recall, accuracy, precision, F1-score, and area under the curve (AUC), others are also included. To detect DDoS attacks in an IoT setting, these metrics are designed to track key elements of a model’s behavior. To sustain this conversation, forecasts must be robust against attacks of varying degrees of complexity and maintain their accuracy over time. Checking the accuracy of the model’s predictions over time is made easy using CPS. These metrics are used to measure the model’s reliability. When examining the anticipated class probabilities, CPS seeks volatility; lower volatility indicates greater stability over time. The standard approach primarily considers pointwise correctness; this is distinct from it. The definition of the metric is as follows:

(17) $C P S = 1 - \frac{1}{N - W + 1} \sum_{t = W}^{N} \frac{| P_{t} - P_{t - 1} |}{2} .$

The predicted probability at time $t$ is shown by the variable $P_{t}$ , the total number of time points is shown by the variable N, and the size of the window is shown by the variable W. The average flow time observed in the dataset over the whole method is used to determine the value of W for calibration, ensuring that temporal segments consistently display the correct traffic intervals. Although it is linked to smoothness metrics used in sequence modeling, CPS is especially developed to work for probability-based intrusion detection.

Another novel measure that demonstrates the model’s ability to detect attacks of varying degrees of complexity is the AACWS. Because specific attacks are uncommon, highly dynamic, or distributed across multiple protocol levels, it may be challenging to detect them. Each type of assault is assigned a complexity weight, denoted by $w_{j}$ , to account for these variations. This complexity weight is dependent on a wide range of variables, including anomaly score dispersion, payload variability, and rarity. ACWS is expressed as:

(18) $A C W S = \frac{\sum_{j = 1}^{K} w_{j} \cdot {T P}_{j}}{\sum_{j = 1}^{K} w_{j} \cdot ({T P}_{j} + {F P}_{j})} .$ The accurate results (TPj) and incorrect results (FPj) variables stand in for the relevant data in class j. All of the weights are set up so that they equal one when summed together. This weighting technique strengthens the impact of improved recognition of complex or low-frequency assaults on the final score. Because of their greater stealth, the second and third forms of attacks are much less prevalent.

To prove the efficacy of the measures, the CPS was compared to the moving-window F1 and temporal AUC, and the ACWS was compared to the weighted F1 and balanced accuracy measurement. ACWS provided clearer information on how well individuals fared in infrequent, high-impact assaults, which traditional measures often overlook. CPS was more sensitive to short-term prediction instability. CPS and ACWS demonstrate how the concept works in real-life applications utilizing the Internet of Things. Attack complexity and temporal consistency are crucial for intrusion detection in these cases. They enhance accuracy metrics and reveal model behavior.

Model deployment feasibility and practicality in IoT environments

Crucially, a high-performance server that is not linked to the internet executes the whole model pipeline, which consists of preprocessing, dynamic balancing using DWBS, feature selection with SHE, and training. Our goal is to demonstrate how user-friendly EDCRAN would be in practical Internet of Things scenarios. In contrast, IoT devices do not perform these resource-intensive activities. Nodes in the network’s periphery get only the model’s little inference module after it has undergone exhaustive training and optimization.

The architecture of EDCRAN has been designed to be efficient, allowing it to provide fast forecasts while using minimal memory. Using depthwise separable convolutions, the SCC differs from the TAC’s compact attention approaches. This two-step process involves training the model offline and then deploying it live to ensure that IoT devices remain responsive and use energy efficiently. Furthermore, it enables the identification of DDoS attacks in real-time, eliminating the need for expensive computing budgets.

Simulation results

Experimental setup and dataset description

The experiments were conducted on an Intel Core i7 workstation with 16 GB of RAM, utilising Python for model creation and assessment, with the aid of Scikit-learn, TensorFlow, and PyTorch. Automated hyperparameters for EDCRAN were chosen using the Golden Jackal Optimisation (GJO) technique. With its variety of DDoS attack behaviours and actual IoT traffic, the IN-DDoS24 dataset was used for training and validation. To prevent any potential impact from future traffic patterns on the training process, all flow data were first organised chronologically according to their timestamps. This was done before the train/validation/test divides were created. The complete architecture and parameters, layers along with values are shown in Table 3.

Table 3:

Complete architecture and training specifications of the proposed EDCRAN model.

Component	Specification	Value
Input and Encoding
Input shape	(Time steps $\times$ features)	$128 \times 24$
Feature projection	Linear(24 $\times$ 64)	64-d embedding
Scaled Convolutional Channel (SCC)
SCC Block 1	DWConv(k = 5) + PWConv(64 $\to$ 64)	SiLU, Dropout 0.2
SCC Block 2	DWConv(k = 5) + PWConv(64 $\to$ 96)	SiLU, Dropout 0.2
SCC Block 3	DWConv(k = 3) + PWConv(96 $\to$ 128)	SiLU, Dropout 0.2
Pooling	Global Average Pooling	$128$ -d output
Transform Attention Channel (TAC)
Patch size	Temporal patches	8
Tokens	Sequence length/patch size	16 tokens
Embedding dimension	$d_{m o d e l}$	128
Transformer layers	Encoder blocks	3
Attention heads	Multi-head attention	4
FFN dimension	Feedforward network	256
Activations	Attention: Softmax; FFN: GELU	–
Pooling	Mean over tokens	$128$ -d output
Fusion and Classification head
Fusion mechanism	Residual gated fusion	Learnable $θ, κ$
Dense layer 1	128 $\to$ 96	SiLU + Dropout 0.2
Dense layer 2	96 $\to$ C classes	Softmax
Training configuration
Optimizer	AdamW (GJO-tuned)	–
Learning rate	Selected by GJO	$1 \times 10^{- 2}$
Batch size	–	96
Epochs	–	30
Weight decay	–	$1 \times 10^{- 4}$
Loss function	Cross-entropy + DWBS weighting	–
Regularization	Dropout + Gradient Clipping (1.0)	–
Early stopping	Patience = 8 (val. AUC)	–
Precision mode	FP16 mixed precision	–
Model complexity
Total trainable parameters	–	1.18 M
Total FLOPs (per sample)	–	0.62 G
Inference suitability	Resource-constrained IoT/edge devices	Yes

DOI: 10.7717/peerj-cs.3531/table-3

The accuracy, F1-score, AUC, CPS, and ACWS values shown here represent the mean across all pipeline runs, with minimal variation, and were used to evaluate the model’s stability. The initialisation seeds used were five separate random ones. After that, the dataset was cleaned, normalised, and class-balanced using DWBS. Then, features were refined using the SHE module to build an input space that was consistent and of high quality. We evaluated EDCRAN’s performance with many baseline intrusion detection models to assess its accuracy, processing efficiency, and suitability for IoT contexts with limited resources. Moreover, it further discusses in depth the design and training setup for repeatability in the parts that follow.

Edge efficiency analysis

To support the lightweight-deployment claim, the computational behaviour of EDCRAN was evaluated using its architecture profile and measured runtime performance. With 1.18 M trainable parameters and approximately 0.62 GFLOPs per forward pass, the model requires around 4.7 MB of memory in FP32 precision (and about 2.4 MB when using FP16). On the workstation used for experimentation (Intel Core i7, 16 GB RAM), the model achieves an average inference time of 2–3 ms per sample. This proves that EDCRAN can keep its computational cost and memory footprint low. This makes it a good fit for resource-constrained Internet of Things environments that rely heavily on real-time detection. With a variation of $\pm$ 0.3 milliseconds, the average inference latency across all five experiments was 2.8 milliseconds. To determine whether the data was consistent over time, we consulted this data. For real-time Internet of Things (IoT) deployments, EDCRAN’s constant timing characteristic is crucial because to its relative stability, especially in situations when network requirements are subject to change. Its minimal variance is evidence of this.

Attack analysis and traffic distribution

The various forms and distribution of DDoS assaults are shown in Fig. 3. In terms of frequency, the next most prevalent kinds of traffic are application layer attacks (12.8%), UDP flood (19.1%), SYN flood (25.5%), and regular traffic (42.6%). Using data balancing strategies during model training is essential to avoid bias in favor of majority classes. The need for this differentiation becomes clear when one considers how real networks operate. Consider the frequency of certain types of attacks while designing intrusion detection systems. Because of this, it is possible to take focused actions. This manner, both typical and out-of-the-ordinary dangers may be identified.

Figure 3: Frequency distribution of attack types.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-3

Figure 4 illustrates DDoS assaults on Linux, Windows, and RTOS operating systems. Most attacks target Linux, Windows, and RTOS. Linux has been attacked 100,000 times, Windows 80,000. Linux is open-source and widely utilized in industrial systems, thus attackers target it. Hackers target Linux, as seen by these statistics. Linux powers many IoT devices. The analysis demonstrates that Linux installation settings need custom security measures. Linux’s flexibility and customisation might make users more susceptible. Information security experts may use this data to understand OS-specific hardening tactics, fix important security gaps, and protect Internet of Things ecosystems from new attacks.

Figure 4: DDoS attack patterns by operating system type.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-4

Network protocols (TCP, UDP, ICMP) are distributed in both regular and attack traffic, as seen in Fig. 5. With 150,000 everyday occurrences and 80,000 attacks, TCP traffic is the most prevalent. The prevalence of IoT devices is suggested. Then comes UDP with 120,000 normal and 60,000 attack events. Less-checked real-time apps utilize it. The lowest diagnostic distribution is ICMP, with 50,000 normal and 20,000 attack events. These numbers illustrate how protocols operate within a network. We need protocol-specific defenses since TCP might be targeted. Databases are susceptible to SYN Flood attacks; however, they are protected by TCP handshakes. For average anomaly scores across different attacks, refer to Fig. 6. A score indicates weird behavior. Typical traffic has an irregularity value of 0.1, whereas App Layer Attack has the highest score of 0.9. The App Layer Attack sees the most surprising behavior. It appears that there are distinct patterns of traffic. The anomaly ratings for SYN Flood and UDP Flood are 0.85 and 0.7, respectively. This is because, despite their diversity, they lack the sophistication of malicious actors operating at the application layer. The results show the significance of anomaly detection systems. They may distinguish the two types of traffic. System sensitivity and attack miss rate are both improved by increasing high-anomaly attack detection settings. This figure help adaptive detection systems find DDoS problems.

Figure 5: Protocol distribution in normal vs. attack traffic.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-5

Figure 6: Average anomaly score by attack type.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-6

Protocol behavior and temporal trends

The distribution of attack types across the year is shown graphically in Fig. 7. This illustrates how types impact network traffic and security. Normal traffic dominates all months, as seen by the stacked area figure. In contrast, SYN Flood, UDP Flood, and Application Layer Attack exhibit various patterns across time. During seasonal or operational changes, preventive and adaptive measures must be taken to protect oneself. EDCRAN works effectively with IN-DDoS24, according to this study. It must be compatible with bigger and more diverse IoT networks. We will use datasets with varying network sizes, traffic patterns, and threats to demonstrate EDCRAN’s ability to adapt to dynamic Internet of Things circumstances.

Figure 7: Proportion of attack types over time.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-7

The data in Fig. 8 shows that out of all the traffic, 57.1% is carried by TCP, 35.7% is by UDP, and 7.1% is via ICMP. The widespread use of TCP demonstrates its dependability and suitability for connection-oriented communication. One indication that UDP excels in real-time applications is its widespread use. Diagnostic or flood attacks may still cause damage, even when ICMP attacks are infrequent. This research suggests that EDCRAN’s ability to generalize across protocols is a strong suit; however, to prove its robustness, it is necessary to use a wider variety of traffic scenarios and larger networks. To evaluate EDCRAN’s performance in actual Internet of Things environments, this approach will be used in subsequent experiments.

Figure 8: Protocol distribution in total traffic.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-8

Figure 9 shows that 7.1% of DDoS traffic is ICMP, 35.7% is UDP, and 57.10% is TCP. DDoS assaults often use the TCP. TCP might be targeted by a SYN Flood, a connection-based assault, due to its prevalence. Data may overflow as UDP does not need a state. However, less common uses, such as reconnaissance and targeted flooding, utilize the Internet Control Message Protocol (ICMP). For this reason, adaptive systems and protocol-aware detection are crucial. As seen in Fig. 10, TCP sessions, which are primarily concerned with connections, tend to last longer than UDP sessions (30.2% of the time) and ICMP sessions (15.8% of the time). The reason is that a TCP session only lasts for fifty-five seconds. These enhancements have made it clear that flow time patterns are crucial for identifying DDoS attack types targeting specific protocols.

Figure 9: Analysis of protocol usage patterns in DDoS attacks.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-9

Figure 10: Protocol vs. average flow duration.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-10

Feature importance and anomaly detection

Figure 11 shows the distribution of anomaly ratings across different network protocols. These stages are shown in the figure. The SYN Flood and ACK Flood attacks exploit the fact that TCP can handle a wide range of anomaly scores. A security issue arises with the TCP, which is a connection-based protocol. UDP is susceptible to attacks such as UDP Floods because of its lack of connection-oriented features. In comparison to other algorithms, UDP has a smaller range of anomaly scores. Due to its primary use in network diagnostics, ICMP’s anomaly ratings are infrequently updated. In light of these results, it appears that protocol-specific anomaly detection systems are crucial for identifying and defending against targeted attacks.

Figure 11: Protocol-specific anomaly score distribution.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-11

A stacked histogram showing changes in payload length with various attacks is shown in Fig. 12. Standard payload sizes for IoT network communication are 64 and 128 bytes. The 512-byte and 1,024-byte payloads are used by SYN and UDP Flood attack techniques, respectively. There are numerous ways that application-layer attacks can evade detection systems, even if their payloads differ. Unusual payload sizes may be uncovered via the discovery of anomalies. Reduce false positives and aid in detecting high-frequency attacks by utilizing payload-specific characteristics. The importance of attack-specific detection is further highlighted. It would lead to improved DDoS protection. By analyzing the relationship between packet rate, payload length, and anomaly scores, Figs. 13 and 14 illustrate how traffic responds to various attacks. Application Layer Attacks may cause system overload as a consequence of increasing traffic, as seen in Fig. 13. Under UDP Flood attacks, a 1024-byte payload is effective. These differences underscore the importance of traffic pattern analysis when developing systems to detect various types of attacks. Figure 14 displays a three-dimensional scatter plot illustrating a correlation between higher anomaly scores and faster packet speeds and payloads. The UDP Flood and the Application Layer Attack are two examples of aggressive attacks. The fact that benign traffic is more evenly distributed suggests that the model can differentiate between benign and harmful actions. In multidimensional traffic, they enhance anomaly detection systems. Systems can proactively handle intricate and ever-changing DDoS patterns.

Figure 12: Distribution of payload lengths by attack type.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-12

Figure 13: Average rate of packets by payload length and attack type.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-13

Figure 14: Correlation between payload length, rate of packets, and anomaly score.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-14

Classification performance and statistical analysis

A packet rate density map for different assault strategies is given in Fig. 15. This graphic shows how attacks affect network data flow. Density peaks occur when packet rates increase due to SYN Flood and UDP Flood attacks, overwhelming the target with traffic and depleting its resources. Intrusion detection systems identify and classify this focused activity. Regular traffic is continuous and well-distributed; we can use it as a baseline to identify outliers. These patterns help IDSs by changing packet-rate thresholds for early detection and lowering false positives.

Figure 15: Rate of packets across attack types.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-15

The correlation matrix for certain selected features is shown in Fig. 16. Predictive modeling and its implications are shown in this matrix. The speed of packets, flow duration, payload length, and anomaly score can help detect network attacks. Adding related variables emphasizes key qualities and enhances model prediction. Source Port and Anomaly Score are weak connections that reveal the data isn’t helpful and simplify it. A correlation matrix simplifies feature selection, making it easier to detect DDoS attacks and other network issues. As shown in Fig. 17, the DDoS prediction model prioritizes the importance of each feature. Since they match attacker traffic, the Anomaly Score and Flow Duration may also detect attacks. To learn about atypical traffic patterns, the model may utilize all of these attributes and then select the most relevant one. Additionally, Source Port and Source IP may not have a significant impact. Optimizing processing resources by prioritizing essential variables improves model accuracy and data handling. The confusion matrix used for categorizing different types of DDoS attacks is shown in Fig. 18. The algorithm ensures that the rates of false positives and false negatives for each category remain below 3, enabling it to distinguish between dangerous and regular traffic quickly. Minimizing instances of misclassification becomes crucial when precise detection is necessary to maintain network security in real-world scenarios. The model’s remarkable classification performance shows this.

Figure 16: Correlation between features.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-16

Figure 17: Feature importance ranking by SHE.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-17

Figure 18: Confusion matrix demonstrating classification performance.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-18

The model is able to distinguish between good and poor data in real-time Internet of Things scenarios, as shown in Fig. 19. The model seems to be successful against a wide variety of assaults, according to the AUC ratings, which fall within the range of 0.978 to 0.995. The model’s ability to handle complicated and high-dimensional patterns is shown by its maximum AUC, which was obtained during the Application Layer Attack. The results show that the model’s architecture is strong; it detects risks by using advanced feature selection and optimization techniques. Further, they facilitate detection to numerous evolving IoT scenarios.

Figure 19: ROC curve for DDoS attack types.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-19

After thirty epochs of training and testing, the EDCRAN model’s accuracy is shown in Fig. 20, with convergence being achieved by the twenty-fourth epoch. The successful learning is demonstrated by the constant improvement in accuracy during the early epochs, while the robust generalisation without overfitting is indicated by the end accuracy of 98.99 percent. The training and testing losses are significantly reduced in the first few epochs, as seen in Fig. 21, but they stabilise after the 24th epoch. The model’s ability to learn and reduce its error rate is made abundantly obvious by this discovery. It seems that EDCRAN can reliably and properly identify complex DDoS patterns in real-world circumstances, according to the conservative estimates of final loss.

Figure 20: Accuracy in training and tests of EDCRAN.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-20

Figure 21: Loss in training and tests of EDCRAN.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-21

The comparison of DDoS detection systems’ accuracy, precision, and AUC is presented in Table 4. Proposed EDCRAN employs a unique dual-channel residual attention architecture to focus on relevant features and minimise input noise, thereby enhancing performance. The model distinguishes between attacks and non-attacks with an AUC of 99.5% and an accuracy of 98.9%, yielding minimal false positives. Dropout regularisation and feature selection improved EDCRAN performance. It helps models to operate effectively with imbalanced and noisy datasets. Attention layers and residual connections enable EDCRAN to assimilate and display complex DDoS attack patterns. EDCRAN’s CNN and DBN outperform Adaptive Transfer Learning for advanced, attention-free DDoS detection. Complex and resource-intensive, GNN Ensemble Learning and MMEDRL-ADM improve detection with ensemble and metaheuristic tuning. MSCBL and CNN-LSTM-Transformer effectively handle time- and space-varying data. It employs static feature extraction, which is unable to handle EDCRAN and other attacks. EDCRAN is reliable and scalable, as it can detect multiple DDoS attacks. Complex data linkages and large, high-dimensional datasets pose challenges for SVM and Decision Trees. Rapid EDCRAN threat detection makes networks safer and more robust to growing attacks.

Table 4:

Results from different approaches analysed comparatively.

Techniques	F1-score (%)	Log loss	CPS (%)	ACWS (%)	Accuracy (%)	AUC (%)	Recall (%)	Precision (%)
Adaptive transfer learning with CNN (Anley et al., 2024)	91.4	0.210	85.0	83.9	93.2	92.7	91.8	92.3
Deep Belief Networks (DBN) (Wanda & Hiswati, 2024)	90.7	0.220	83.8	82.6	92.4	91.8	90.9	91.5
CNN-LSTM-transformer (Ouhssini et al., 2024)	92.8	0.195	86.1	84.5	94.0	93.4	92.7	93.0
Correlation-aware LSTM and transformers (Hekmati et al., 2024)	92.1	0.202	85.4	83.8	93.5	92.9	92.1	92.6
CNN with Reciprocal Points Learning (RPL) (Shieh et al., 2024)	94.2	0.175	88.0	86.5	95.4	94.8	94.1	94.6
Fuzzy Q-Learning with Chebyshev (Kumar, Dutta & Pranav, 2024)	91.0	0.208	84.2	83.0	93.0	92.4	91.3	91.9
Adaptive Federated Learning (FLAD) (Doriguzzi-Corin & Siracusa, 2024)	93.0	0.185	86.8	85.7	94.3	93.7	93.1	93.5
Lightweight CNN-LSTM models (Salih & Abdulrazaq, 2023)	93.8	0.180	87.0	86.2	94.8	94.1	93.6	94.0
DDoSViT (Vision Transformer) (Ali et al., 2025)	92.0	0.200	85.9	84.4	93.5	94.2	92.3	92.8
GNN ensemble learning (Bakar et al., 2024)	94.6	0.170	88.3	87.0	95.6	95.0	94.5	94.9
MMEDRL-ADM (Paidipati, Kurangi & Uthayakumar, 2024)	95.0	0.165	89.1	87.5	96.0	95.5	95.0	95.3
Encoder-Stacked DNN (E-SDNN) (Benmohamed, Thaljaoui & Elkhediri, 2024)	94.9	0.172	88.7	87.2	95.8	95.2	94.8	95.1
Federated learning for IoT (Alhasawi & Alghamdi, 2024)	93.3	0.182	86.5	85.8	94.5	93.9	93.2	93.7
Multi-Scale CNN-BiLSTM (MSCBL) (Yin, Fang & Liu, 2024)	93.5	0.180	86.8	85.9	94.6	94.0	93.5	94.0
Embedded classification for IoT traffic (Shukla, Krishna & Patil, 2024)	92.7	0.190	85.5	84.2	93.7	93.1	92.6	93.1
TCN-Based Detection (5G IoT) (Akhi, Eising & Dhirani, 2025)	91.6	0.205	84.7	83.5	93.2	92.9	91.7	92.2
Proposed EDCRAN	98.7	0.055	97.8	96.5	98.9	99.5	98.9	99.2

DOI: 10.7717/peerj-cs.3531/table-4

Table 5 displays statistical analysis of execution durations for DDoS attack detection models, including Adaptive Transfer Learning with CNN, DBN, CNN-LSTM-Transformer, Correlation-Aware LSTM and Transformers, and EDCRAN. Statistics such as mean, median, minimum, maximum, range, and standard deviation are used to evaluate a model’s computational efficiency and stability. The suggested EDCRAN is computationally efficient and stable because of its short mean execution time and low variability. EDCRAN achieves high detection accuracy and low computational overhead due to its streamlined dual-channel architecture and enhanced residual attention mechanism. Its lightweight design is well-suited for real-time IoT applications that require rapid response times. Regular models, such as Adaptive Transfer Learning with CNN and CNN-LSTM-Transformer, take longer and are more unpredictable. Their sophisticated designs and lack of real-time performance adjustment may explain this. These models are helpful in certain situations but struggle to balance computing load with forecast accuracy. When compared to more complex models, Decision Trees and SVMs are easier to operate, but their lack of accuracy and durability makes them unsuitable for complex real-time detection applications. EDCRAN’s ability to combine computational economy with detection reliability makes it a promising candidate for scalable and time-sensitive IoT applications, outperforming both complicated and straightforward methods.

Table 5:

Statistical analysis of proposed and existing models execution time.

Model	Range (s)	Median (s)	Mean (s)	Max (s)	STD (s)	Min (s)
Adaptive transfer learning with CNN (Anley et al., 2024)	20	73	72	83	5.15	63
Deep Belief Networks (DBN) (Wanda & Hiswati, 2024)	18	72	71	80	4.85	62
CNN-LSTM-Transformer (Ouhssini et al., 2024)	17	70	70	78	4.70	61
Correlation-Aware LSTM and transformers (Hekmati et al., 2024)	19	71	71	81	5.01	62
CNN with Reciprocal Points Learning (RPL) (Shieh et al., 2024)	19	69	70	79	4.89	60
Fuzzy Q-Learning with Chebyshev (Kumar, Dutta & Pranav, 2024)	16	70	69	77	4.60	61
Adaptive Federated Learning (FLAD) (Doriguzzi-Corin & Siracusa, 2024)	16	68	68	76	4.55	60
Lightweight CNN-LSTM models (Salih & Abdulrazaq, 2023)	16	69	68	77	4.58	61
DDoSViT (Vision Transformer) (Ali et al., 2025)	18	70	69	78	4.80	60
GNN ensemble learning (Bakar et al., 2024)	15	67	66	74	4.18	59
MMEDRL-ADM with metaheuristics (Paidipati, Kurangi & Uthayakumar, 2024)	14	66	65	73	4.10	58
Encoder-Stacked DNN (E-SDNN) (Benmohamed, Thaljaoui & Elkhediri, 2024)	14	65	65	72	4.05	58
Federated CNN for decentralized detection (Alhasawi & Alghamdi, 2024)	15	68	67	76	4.30	60
Multi-Scale CNN-BiLSTM (Yin, Fang & Liu, 2024)	14	67	66	75	4.25	59
Embedded IoT traffic classifier (Shukla, Krishna & Patil, 2024)	15	66	65	74	4.20	59
TCN-based detection (Akhi, Eising & Dhirani, 2025)	15	66	66	75	4.25	60
Proposed EDCRAN	5	22	22	25	1.85	20

DOI: 10.7717/peerj-cs.3531/table-5

Resource efficiency and scalability

In Fig. 22, Precision, Recall, and F1-score values are compared for five methods: Adaptive Transfer Learning with CNN, DBN, CNN-LSTM-Transformer, GNN Ensemble Learning, and Proposed EDCRAN, across dataset sizes from 20,000 to 250,000 records. The proposed EDCRAN surpasses all previous approaches, with Precision, Recall, and F1-score improving considerably with dataset size. EDCRAN has the highest F1-score of 0.9875 at 250,000 records, demonstrating its capacity to generalize in larger and more complex datasets. EDCRAN’s dual-channel residual attention architecture combines global and local information for precise classification, yielding improved performance. Advanced preprocessing methods, such as DWBS and SHE, enhance minority attack class recall by addressing unbalanced datasets. Optimised feature selection reduces overfitting, improving accuracy and F1-score. Other approaches work well with smaller datasets but struggle with larger ones due to issues with feature extraction and imbalance management. DBN and CNN-LSTM-Transformer have poorer recall rates, suggesting difficulty in collecting complicated attack patterns from large datasets. These findings demonstrate EDCRAN’s scalability and efficiency in IoT network intrusion detection.

Figure 22: Classification metrics vs. dataset size for all methods.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-22

Figure 23 compares processing time for several approaches, including Adaptive Transfer Learning, Deep Belief Networks, CNN-LSTM-Transformer, GNN Ensemble Learning, and Proposed EDCRAN, with increasing dataset sizes. The EDCRAN has the lowest processing times across all dataset sizes and scales well with bigger datasets. EDCRAN processes 250,000 records in 50 s, whereas GNN Ensemble Learning and CNN-LSTM-Transformer take 105 and 100 s. Due to its lightweight dual-channel residual attention technique, which balances computational efficiency and feature extraction, EDCRAN processes data more efficiently. EDCRAN removes redundant features via SHE, reducing computing overhead. Its optimised hyperparameter tuning utilising Golden Jackal Optimisation (GJO) speeds convergence, increasing efficiency. Due to their ensemble methodologies and sequential processing architectures, GNN Ensemble Learning and CNN-LSTM-Transformer take longer to process. Large, unbalanced datasets are inefficiently processed using these approaches without adaptive pretreatment, such as DWBS. This chart shows EDCRAN’s applicability for real-time IoT with rapid and reliable intrusion detection.

Figure 23: Processing time vs. dataset size.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-23

As the dataset grows, Fig. 24 displays CPU and memory use for each strategy. The least resource-intensive approach is EDCRAN, with 42% CPU and 2.0 GB RAM for 250,000 entries. CNN-LSTM-Transformer and GNN Ensemble Learning require over 85% CPU and 7.8 GB RAM. By employing lightweight convolutional layers and attention approaches, EDCRAN’s scalable method simplifies computation. This maximizes resources. EDCRAN’s TAC and SCC handle global and local trends without further processing. The DWBS reduces rebalancing calculations, saving resources. Since they utilize ensemble techniques and deep recurrent structures, GNN Ensemble Learning and CNN-LSTM-Transformer consume more resources. EDCRAN is beneficial in resource-constrained settings like edge-based IoT setups where speed is crucial.

Figure 24: Resource usage (CPU and Memory) vs. dataset size.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-24

The accuracy of various techniques, taking into account the expanding dataset, is shown in Fig. 25. EDCRAN beats competitors with 98.9% accuracy for 250,000 records. However, GNN Ensemble Learning scores 95.6%. CNN-LSTM-Transformer and CNN-Adaptive Transfer Learning only get 94.0% and 93.2%. EDCRAN yields accurate results because to its dual-channel design and sophisticated preprocessing pipeline. The model finds important patterns by reducing noise and unnecessary features using feature dropout and Selective Hybrid Extraction (SHE). The CPS and ACWS measurements assess the model’s ability to handle complicated attack patterns. The lack of dynamic balancing and constrained feature selection in Adaptive Transfer Learning with CNN and DBN may explain its low performance. EDCRAN can detect several complete Internet of Things datasets, as shown.

Figure 25: Accuracy vs. dataset size for all methods.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-25

The new EDCRAN model outperforms all other models on the IN-DDoS24 and CIC-DDoS2019 datasets, as shown in the comparison in Table 6. With 98.9% accuracy and 99.5% AUC on IN-DDoS24 and 97.6% accuracy and 98.2% AUC on CIC-DDoS2019, EDCRAN routinely surpasses existing CNN-LSTM hybrids, graph-based models, and transformer-driven approaches. These benefits show the effectiveness of EDCRAN’s dual-channel residual attention design. In many IoT contexts, it improves the reliability of DDoS attack detection and captures more complicated traffic patterns.

Table 6:

Head-to-head comparison of proposed EDCRAN and baseline models on IN-DDoS24 and CIC-DDoS2019.

Model	IN-DDoS24 dataset		CIC-DDoS2019 dataset
	Accuracy (%)	AUC (%)	Accuracy (%)	AUC (%)
Adaptive transfer learning with CNN (Anley et al., 2024)	93.2	92.7	92.1	91.5
Deep Belief Networks (DBN) (Wanda & Hiswati, 2024)	92.4	91.8	91.2	90.5
CNN-LSTM-Transformer (Ouhssini et al., 2024)	94.0	93.4	93.1	92.6
Correlation-Aware LSTM + Transformer (Hekmati et al., 2024)	93.5	92.9	92.4	91.8
CNN + Reciprocal Points Learning (RPL) (Shieh et al., 2024)	95.4	94.8	94.3	93.9
Fuzzy Q-Learning + Chebyshev (Kumar, Dutta & Pranav, 2024)	93.0	92.4	92.0	91.0
Adaptive Federated Learning (FLAD) (Doriguzzi-Corin & Siracusa, 2024)	94.3	93.7	93.5	92.9
Lightweight CNN-LSTM (Salih & Abdulrazaq, 2023)	94.8	94.1	93.9	93.2
DDoSViT (Vision Transformer) (Ali et al., 2025)	93.5	94.2	92.8	93.0
GNN ensemble learning (Bakar et al., 2024)	95.6	95.0	94.8	94.0
MMEDRL-ADM (Paidipati, Kurangi & Uthayakumar, 2024)	96.0	95.5	95.1	94.9
Encoder-Stacked DNN (E-SDNN) (Benmohamed, Thaljaoui & Elkhediri, 2024)	95.8	95.2	94.7	94.1
Federated CNN for IoT (Alhasawi & Alghamdi, 2024)	94.5	93.9	93.2	92.7
Multi-Scale CNN-BiLSTM (Yin, Fang & Liu, 2024)	94.6	94.0	93.7	93.1
Embedded IoT traffic classifier (Shukla, Krishna & Patil, 2024)	93.7	93.1	92.8	91.9
TCN-Based detection (Akhi, Eising & Dhirani, 2025)	93.2	92.9	92.0	91.3
GraphFedAI (Fed-LSTM-Attention) (Malik, Farooq & Raza, 2025)	94.9	94.4	94.1	93.6
LSTM-JSO (LSTM + Jellyfish Search Optimization) (Sorour et al., 2025)	95.3	94.9	94.5	94.0
Hybrid LSTM-CNN OK (Al-Hasani, Abdelmaksoud & Rezk, 2026)	95.7	95.1	95.0	94.3
Proposed EDCRAN (Full model)	98.9	99.5	97.6	98.2

DOI: 10.7717/peerj-cs.3531/table-6

An ablation study assessing the key components of the EDCRAN architecture is shown in Table 7. Removing the TAC or SCC reduces accuracy and AUC, indicating that heterogeneous IoT traffic requires global contextual modeling and targeted spatial pattern extraction. EDCRAN, SMOTE, and class reweighting were evaluated to assess the Dynamic Weighted Balancing Strategy. Although less accurate, F1-score, and CPS than DWBS, these techniques fared better than no balancing mechanism. DWBS outperforms synthetic oversampling and static reweighting for rapidly changing, highly skewed IoT attack distributions. Similar trends were seen using Chi-square–based feature selection and SHE. Due to EDCRAN’s nonlinear feature refinement, duplication was reduced and discrimination improved, but Chi-square variation underperformed SHE. Finally, eliminating the Golden Jackal Optimization (GJO) module increased execution time and lowered predictive performance, proving hyperparameter tuning is necessary for accuracy and stability. A full EDCRAN configuration has maximum accuracy, F1-score, and AUC and competitive execution time. Each component impacts performance and produces a robust, well-balanced intrusion detection architecture.

Table 7:

Ablation study including comparisons with simpler alternatives.

Model variant	Accuracy (%)	F1-score (%)	AUC (%)	CPS (%)	Exec. time (s)
EDCRAN without TAC	90.6	90.1	92.3	87.0	20.1
EDCRAN without SCC	89.8	89.5	91.5	86.2	20.4
EDCRAN with SMOTE instead of DWBS	91.4	90.9	93.8	88.5	21.0
EDCRAN with class reweighting	92.0	91.6	94.1	89.2	20.8
EDCRAN without DWBS	88.3	87.9	90.1	84.5	19.6
EDCRAN with Chi-Square Feature Selection	92.5	92.0	94.4	89.0	20.2
EDCRAN without SHE	89.1	88.7	91.0	85.2	19.2
EDCRAN without GJO	91.3	90.9	92.8	87.9	25.8
Proposed EDCRAN (Full)	98.9	98.7	99.5	97.8	22.0

DOI: 10.7717/peerj-cs.3531/table-7

The accuracy of the proposed model’s categorization is impacted by the hyperparameters used, as shown in Fig. 26. This demonstrates that the learning rate, batch size, and dropout rate influence model performance and flexibility. The optimal setup has a 99.8 percent success rate with a learning rate of 0.01 and a batch size of 96. Assuming sufficiently large learning rates and batch sizes, the described technique seems to expedite convergence while preserving stability. It is crucial to choose hyperparameter ranges intelligently to prevent a value loss plateau, as seen in a plateau of growing accuracy performance. In varied situations, the model can come together without underperforming or overfitting. Our results show that hyperparameter modification balances prediction accuracy and computational economy. Since factors change and resources are limited, IoT performance requires adaptation. Since it remains accurate when parameters change, the model is valuable in real life. This research shows the EDCRAN architecture’s scalability and durability in varied IoT network contexts and gives academics and practitioners a practical guidance for adjusting hyperparameters to minimize resource utilization and improve detection accuracy.

Figure 26: Sensitivity analysis of proposed model parameters.

Download full-size image

DOI: 10.7717/peerj-cs.3531/fig-26

The Table 8 evaluates forecasting models against performance consistency, correlation strength, and statistical reliability. Strong Spearman’s and Pearson’s correlation scores and a low Paired ANOVA result indicate that the proposed EDCRAN has good generalization and low prediction variability across data subsets. Recursive feature selection and deep learning capture complex data interactions. Bi-LSTM and ResNet-18 have lower correlation and more variability than EDCRAN, but they still struggle to adapt to new features. Traditional methods like Decision Trees and Random Forests are less trustworthy and have higher error margins for high-dimensional Internet of Things data. EHO minimizes execution time and speeds network convergence to enhance EDCRAN. The statistical research shows that EDCRAN can identify DDOS assaults on complicated IoT datasets.

Table 8:

Statistical analysis of different techniques.

Techniques	Chi-squared	Paired ANOVA	Spearman’s	Pearson’s	Student’s
Adaptive transfer learning with CNN (Anley et al., 2024)	12.5	0.031	0.85	0.87	2.50
Deep Belief Networks (DBN) (Wanda & Hiswati, 2024)	14.3	0.029	0.86	0.89	2.35
CNN-LSTM-Transformer (Ouhssini et al., 2024)	10.8	0.042	0.80	0.83	3.20
Correlation-Aware LSTM and transformers (Hekmati et al., 2024)	9.9	0.049	0.79	0.81	3.50
CNN with Reciprocal Points Learning (RPL) (Shieh et al., 2024)	8.8	0.053	0.77	0.79	3.80
Fuzzy Q-Learning with Chebyshev’s inequality (Kumar, Dutta & Pranav, 2024)	15.3	0.027	0.88	0.91	2.30
Adaptive federated learning mechanism (Doriguzzi-Corin & Siracusa, 2024)	16.2	0.024	0.89	0.92	2.20
Lightweight CNN-LSTM models (Salih & Abdulrazaq, 2023)	17.5	0.019	0.90	0.93	2.00
DDoSViT: vision transformer for OTA DDoS detection (Ali et al., 2025)	18.2	0.018	0.91	0.93	2.10
GNN ensemble learning (Bakar et al., 2024)	17.9	0.018	0.91	0.94	2.00
MMEDRL-ADM with metaheuristic tuning (Paidipati, Kurangi & Uthayakumar, 2024)	19.2	0.020	0.92	0.95	1.85
Encoder-Stacked DNN (E-SDNN) (Benmohamed, Thaljaoui & Elkhediri, 2024)	18.4	0.022	0.90	0.93	2.10
Federated CNN for decentralized detection (Alhasawi & Alghamdi, 2024)	16.9	0.025	0.89	0.91	2.25
Multi-Scale CNN-BiLSTM model (Yin, Fang & Liu, 2024)	17.3	0.023	0.91	0.92	2.15
Embedded classification for IoT traffic (Shukla, Krishna & Patil, 2024)	15.7	0.028	0.87	0.89	2.40
TCN-Based detection with dynamic thresholding (Akhi, Eising & Dhirani, 2025)	15.4	0.029	0.88	0.90	2.30
Proposed EDCRAN	20.7	0.016	0.99	1.00	1.70

DOI: 10.7717/peerj-cs.3531/table-8

Conclusion and future work

In this article, we present EDCRAN, a deep learning system that efficiently and successfully detects distributed denial of service attack in real-time inside IoT environments. To capture global temporal and contextual correlations, the design incorporates a Scaled Convolutional Channel and a Transform Attention Channel. Localized and fine-grained traffic patterns may be discovered using these channels. The ability to detect traffic patterns is therefore made feasible. The Dynamic Weighted Balancing Strategy and Selective Hybrid Extraction modules increase stability, manage high-dimensional IoT traffic and intrinsic class imbalance, and train with fair representation. Efficient convergence and decreased computing cost are further enhanced by hyperparameter optimization using the Golden Jackal Optimization. Results from extensive testing on the real-world IN-DDoS24 dataset and comparisons with other benchmark datasets show that EDCRAN outperforms the competition in terms of accuracy, area under the curve, contextual prediction stability, and attack complexity-weighted score. Our findings demonstrate that the model can withstand complex and ever-changing DDoS situations while still capturing a wide range of traffic patterns, indicating that it is computationally suitable for IoT applications focused on the edge. It’s crucial to note that there are certain restrictions, even with these benefits. Despite using real-world traffic collects as datasets, this research did not conduct its assessment in a real-world IoT deployment context. Hardware limitations, unexpected load changes, and device heterogeneity are introduced by real operating systems and are challenging to mimic in offline studies. Additionally, open-set detection, spoof-resistant learning, and adversarial evasion were not covered in this study.

Future research will focus on testing EDCRAN in real-world IoT or edge systems, where hardware-level resource behavior and network variations may be properly analyzed. Further research should focus on developing adaptive or open-set recognition techniques, increasing resilience to adversarial perturbations, and enhancing tolerance to previously undiscovered attack types. Expanding EDCRAN’s practical dependability and facilitating its implementation in large-scale, real-world IoT environments are the goals of these advances.

Supplemental Information

Contains the evaluation scripts used to compute accuracy, precision, recall, F1-score, and confusion matrices for the EDCRAN model.

DOI: 10.7717/peerj-cs.3531/supp-4

Download

Includes the implementation of the Grey Wolf–Jellyfish Optimization (GJO) routine used for hyperparameter tuning of the EDCRAN framework.

DOI: 10.7717/peerj-cs.3531/supp-5

Download

Provides the complete model architecture for the Enhanced Dual-Channel Residual Attention Network used for DDoS detection.

DOI: 10.7717/peerj-cs.3531/supp-6

Download

Contains the preprocessing pipeline, including data cleaning, feature extraction, normalization, and train–test preparation.

DOI: 10.7717/peerj-cs.3531/supp-7

Download

Includes the training loop, loss optimization procedure, checkpoint saving, and performance logging for the EDCRAN model.

DOI: 10.7717/peerj-cs.3531/supp-8

Download

[1] Akhi M, Eising C, Dhirani LL. 2025. TCN-based DDoS detection and mitigation in 5G healthcare-IoT: a frequency monitoring and dynamic threshold approach. IEEE Access 13:12709

[2] Al-Hasani OK, Abdelmaksoud R, Rezk A. 2026. A novel hybrid CNN-LSTM framework for robust ddos attack detection and classification. Journal of Cybersecurity & Information Management 17(1):21

[3] Alhasawi Y, Alghamdi S. 2024. Federated learning for decentralized DDoS attack detection in IoT networks. IEEE Access 12:42357-42368

[4] Ali M, Saleem Y, Hina S, Shah GA. 2025. DDoSViT: IoT DDoS attack detection for fortifying firmware Over-The-Air (OTA) updates using vision transformer. Internet of Things 30(1):101527

[5] Alshdadi AA, Almazroi AA, Ayub N, Lytras MD, Alsolami E, Alsubaei FS. 2024. Big data-driven deep learning ensembler for ddos attack detection. Future Internet 16(12):458

[6] Anley MB, Genovese A, Agostinello D, Piuri V. 2024. Robust ddos attack detection with adaptive transfer learning. Computers & Security 144:103962

[7] Bakar RA, De Marinis L, Cugini F, Paolucci F. 2024. FTG-Net-E: a hierarchical ensemble graph neural network for DDoS attack detection. Computer Networks 250(2):110508

[8] Benmohamed E, Thaljaoui A, Elkhediri S. 2024. E-SDNN: encoder-stacked deep neural networks for DDoS attack detection. Neural Computing and Applications 36(18):10431-10443

[9] Bukhowah R, Aljughaiman A, Rahman MH. 2024. Detection of dos attacks for iot in information-centric networks using machine learning: opportunities, challenges, and future research directions. Electronics 13(6):1031

[10] Doriguzzi-Corin R, Siracusa D. 2024. FLAD: adaptive federated learning for DDoS attack detection. Computers & Security 137:103597

[11] Hekmati A, Zhang J, Sarkar T, Jethwa N, Grippo E, Krishnamachari B. 2024. Correlation-aware neural networks for DDoS attack detection in IoT systems. IEEE/ACM Transactions on Networking 32(5):3929-3944

[12] Huang Z, Wang Z. 2025. Assessing the robustness of physical networks under attack uncertainty. Reliability Engineering & System Safety 262:111231

[13] Kumar A, Dutta S, Pranav P. 2024. FQBDDA: fuzzy q-learning based DDoS attack detection algorithm for cloud computing environment. International Journal of Information Technology 16(2):891-900

[14] Liu Y, Dong X, Zio E, Cui Y. 2025. Active resilient secure control for heterogeneous swarm systems under malicious cyber-attacks. IEEE Transactions on Systems, Man, and Cybernetics: Systems 55(10):1-10

[15] Mahadevappa P, Al-amri R, Alkawsi G, Alkahtani AA, Alghenaim MF, Alsamman M. 2024. Analyzing threats and attacks in edge data analytics within IoT environments. IoT 5(1):123-154

[16] Maiwada UD, Imran SA, Danyaro KU, Janisar AA, Salameh A, Sarlan AB. 2024. Security concerns of IoT against DDoS in 5G systems. International Journal of Electrical Engineering and Computer Science 6:98-105

[17] Malik AR, Farooq ST, Raza MU. 2025. GraphFedAI: a federated graph neural learning framework for distributed IoT DDoS detection. IEEE Internet of Things Journal 12(4):1887-1898

[18] Mishra P, Biancolillo A, Roger JM, Marini F, Rutledge DN. 2020. New data preprocessing trends based on ensemble of multiple preprocessing techniques. TrAC Trends in Analytical Chemistry 132(3):116045

[19] Navaneethakrishnan M, Robinson Joel M, Kalavai Palani S, Gnanaprakasam GJ. 2024. Efficientnet-deep quantum neural network-based economic denial of sustainability attack detection to enhance network security in cloud. Network Computation in Neural Systems 36:1-25

[20] Ouhssini M, Afdel K, Agherrabi E, Akouhar M, Abarda A. 2024. Deepdefend: a comprehensive framework for ddos attack detection and prevention in cloud computing. Journal of King Saud University-Computer and Information Sciences 36(2):101938

[21] Paidipati KK, Kurangi C, Uthayakumar J. 2024. Ensemble of deep reinforcement learning with optimization model for DDoS attack detection and classification in cloud based software defined networks. Multimedia Tools and Applications 83(11):32367-32385

[22] Qiu R, Yin Y, Su Q, Guan T. 2025. Func-Bagging: an ensemble learning strategy for improving the performance of heterogeneous anomaly detection models. Applied Sciences 15(2):905

[23] Rafique SH, Abdallah A, Musa NS, Murugan T. 2024. Machine learning and deep learning techniques for internet of things network anomaly detection current research trends. Sensors 24(6):1968

[24] Saiyed MF, Al-Anbagi I, Hossain MS. 2024. Interactive and explainable optimized learning for ddos detection in consumer IoT networks. IEEE Transactions on Consumer Electronics 71:6839

[25] Salih AA, Abdulrazaq MB. 2023. Cybernet model: a new deep learning model for cyber DDoS attacks detection and recognition. Computers, Materials & Continua 78(1):1275-1295

[26] Shieh C-S, Ho F-A, Horng M-F, Nguyen T-T, Chakrabarti P. 2024. Open-set recognition in unknown DDoS attacks detection with reciprocal points learning. IEEE Access 12(4):56461-56476

[27] Shukla P, Krishna CR, Patil NV. 2024. EIoT-DDoS: embedded classification approach for IoT traffic-based DDoS attacks. Cluster Computing 27(2):1471-1490

[28] Sorour SE, Aljaafari M, Shaker AM, Amin AE. 2025. LSTM-JSO framework for privacy preserving adaptive intrusion detection in federated IoT networks. Scientific Reports 15(1):11321

[29] Sun G, Li Y, Liao D, Chang V. 2018. Service function chain orchestration across multiple domains: a full mesh aggregation approach. IEEE Transactions on Network and Service Management 15(3):1175-1191

[30] Sun G, Zhu G, Liao D, Yu H, Du X, Guizani M. 2019. Cost-efficient service function chain orchestration for low-latency applications in NFV networks. IEEE Systems Journal 13(4):3877-3888

[31] Susan S, Kumar A. 2021. The balancing trick: optimized sampling of imbalanced datasets-a brief survey of the recent state of the art. Engineering Reports 3(4):e12298

[32] Tang D, Dai R, Zuo C, Chen J, Li K, Qin Z. 2025. A low-rate dos attack mitigation scheme based on port and traffic state in SDN. IEEE Transactions on Computers 74(5):1758-1770

[33] Wanda P, Hiswati ME. 2024. Belief-DDoS: stepping up DDoS attack detection model using DBN algorithm. International Journal of Information Technology 16(1):271-278

[34] Wang P, Song W, Qi H, Zhou C, Li F, Wang Y, Zhang Q. 2024. Server-initiated federated unlearning to eliminate impacts of low-quality data. IEEE Transactions on Services Computing 17(3):1196-1211

[35] Wu X, Zou B, Lu C, Wang L, Zhang Y, Wang H. 2025. Dynamic security computing framework with zero trust based on privacy domain prevention and control theory. IEEE Journal on Selected Areas in Communications 43(6):2266-2278

[36] Xu G, Lei L, Mao Y, Li Z, Chen X, Zhang K. 2025. CBRFL: a framework for committee-based byzantine-resilient federated learning. Journal of Network and Computer Applications 238:104165

[37] Yin X, Fang W, Liu Z. 2024. A novel multi-scale CNN and Bi-LSTM arbitration dense network model for low-rate DDoS attack detection. Scientific Reports 14(1):5111

[38] Yin H, Vahdat A, Alvarez JM, Mallya A, Kautz J, Molchanov P. 2022. A-VIT: adaptive tokens for efficient vision transformer.

[39] Yue M, Yan H, Han R, Wu Z. 2025. A DDoS attack detection method based on IQR and DFFCNN in SDN. Journal of Network and Computer Applications 240(24):104203

[40] Zeng Y, Zhang R, Zhang H, Qiao S, Huang F, Tian Q, Peng Y. 2025. GCCNet: a novel network leveraging gated cross-correlation for multi-view classification. IEEE Transactions on Multimedia 27(5):1086-1099