VAE-XGBoost: a hybrid intrusion detection system for next generation EV charging networks
- Academic Editor
- Ankit Vishnoi
- Subject Areas
- Artificial Intelligence, Computer Networks and Communications, Cryptography, Security and Privacy, Neural Networks
- Keywords
- Intrusion detection system, Variational AutoEncoder, XGBoost, EV charging, Electric vehicles
- Copyright
- © 2026 Asim et al.
- Licence
- This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.
- Cite this article
- 2026. VAE-XGBoost: a hybrid intrusion detection system for next generation EV charging networks. PeerJ Computer Science 12:e3506 https://doi.org/10.7717/peerj-cs.3506
Abstract
The widespread adoption of electric vehicles (EVs) is crucial for reducing greenhouse gas emissions, yet it exposes charging infrastructure to sophisticated cyber threats. In the next generation EV charging networks, ultra-low latency, massive device connectivity, and AI-driven automation are key technologies. Traditional intrusion detection systems (IDS) struggle with class imbalance, overfitting, and real-time anomaly detection. These limitations threaten user privacy, service continuity, and grid stability. This study proposes a Variational Auto Encoder (VAE) based XGBoost, a hybrid IDS that leverages VAE for feature extraction and handling imbalanced attack data, while XGBoost improves classification accuracy. Evaluated on the CICEVSE2024 dataset, the model surpasses traditional methods like K-Nearest Neighbors and Random Forest, achieving an accuracy of 88.75%, a precision of 88.73%, a recall of 88.75%, and an F1-score of 88.67%. By integrating AI-driven anomaly detection into 6G-enabled smart grids, VAE-XGBoost enhances the cybersecurity resilience of EV charging infrastructure, ensuring scalability, adaptability to novel threats, and real-time mitigation for sustainable and secure transportation networks.
Introduction
The rapid proliferation of electric vehicles (EVs) worldwide has increased the emphasis on developing and deploying secure, efficient, and widely accessible charging infrastructure. As the transition from internal combustion engines to electric mobility accelerates, charging stations have become a critical component of the EV ecosystem, supporting public and private transportation networks. According to recent market analyses, the global EV charging station market is estimated to be worth USD 7.3 billion in 2024 and is projected to grow to USD 12.1 billion by 2030, reflecting an annual growth rate of 8.8% over the forecast period (Madaram et al., 2024). This growth is fueled by multiple factors, including government incentives, stricter emission regulations, increased investments in renewable energy, and rapid advancements in battery and charging technologies. In addition, the expansion of fast charging networks, integration with smart grids, and the development of interoperable payment and authentication systems are further driving the adoption of EV charging solutions. Cybersecurity, user authentication, and energy efficiency are also becoming essential design considerations, especially with the rise of IoT-enabled and vehicle-to-grid (V2G) capabilities.
Figure 1 depicts a typical EV charging architecture comprising the Electric Vehicle Charging Station (EVCS), Central Management System (CMS), and power grid, all interconnected through standardized communication protocols. The Open Charge Point Protocol (OCPP) ensures interoperability between EVCS and CMS backend systems, enabling functionalities such as remote monitoring, billing, configuration, and security diagnostics (Hamdare et al., 2025). Within this architecture, the CMS integrates an IDS to continuously monitor network traffic and detect potential security breaches in real time. The ISO 15118 and IEC 63110 standards further strengthen this ecosystem, where ISO 15118 provides secure EV–EVCS communication for smart charging and Vehicle-to-Grid (V2G) operations through automated authentication and Plug & Charge mechanisms, while IEC 63110 defines interfaces between charging infrastructure, energy management systems, and grid operators to ensure end-to-end interoperability and operational reliability (Benfarhat et al., 2025b).
Figure 1: EV charging network architecture with integrated machine learning–based IDS.
Although the CMS hosts the central IDS, cyberattacks can also target peripheral components such as the EV or EVCS. Addressing these threats requires a multi-layered defense strategy that combines local detection with centralized analysis. Lightweight IDS modules deployed at the EVCS can identify anomalies in communication or firmware tampering at the edge, while ISO 15118-based encrypted channels protect EV communications from interception and spoofing attempts. Meanwhile, the centralized IDS at the CMS aggregates and correlates data from multiple charging stations to detect coordinated large-scale attacks. This distributed detection framework improves situational awareness, reduces response latency, and strengthens the overall resilience of the EV charging network. However, OCPP-based systems remain vulnerable to threats such as man-in-the-middle, backdoor, and denial-of-service attacks, highlighting the need for more adaptive and intelligent defense mechanisms (Aljohani & Almutairi, 2024).
Conventional IDS techniques, such as signature-based and anomaly-based detection, offer limited effectiveness against unknown or zero-day exploits. Consequently, IDS frameworks based on machine learning (ML) and deep learning (DL) have gained attention to improve detection accuracy and real-time resilience in EV charging networks. Several studies have demonstrated the effectiveness of ML/DL models in securing EV networks. For example, Mehedi et al. (2021) implemented a LeNet-based deep learning model to detect cyberthreats within Controller Area Networks (CAN), achieving high detection accuracy. However, real-time processing constraints remained a challenge. Similarly, Moulahi et al. (2021) evaluated multiple machine learning classifiers, including Random Forest (RF), Decision Trees (DT), and Multi-Layer Perceptrons (MLP) for in-vehicle IDS, finding RF to be the most effective. Despite its high accuracy, the RF faced scalability issues in handling large-scale EV data streams.
To improve efficiency, Transformer-based architectures have been introduced. Nguyen, Nam & Kim (2023) proposed a Transformer-Based Attention Network (TAN) for intrusion detection in EV networks, achieving notable success in identifying sequential CAN ID anomalies. Although transformer models demonstrated strong performance, they faced challenges in maintaining efficiency as the size of the detection window increased, limiting their practical deployment. More recent approaches, such as the ResNet-based supervised contrastive learning model of Hoang & Kim (2024) and the Voting Classifier (VT)-based IDS of Ul Islam Khan et al. (2024), have further enhanced robustness against known cyber threats. However, existing ML/DL models still encounter issues such as overfitting to specific attack patterns and limited adaptability to novel cyber threats.
The emergence of 6G-enabled EV charging networks introduces new opportunities to improve IDS capabilities while also posing novel security challenges. 6G networks will support ultra-reliable, low-latency communication (URLLC), massive machine-type communication (mMTC), and AI-driven network security, making them ideal for protecting EV infrastructure. With AI-native security architectures, 6G can facilitate adaptive intrusion detection using federated learning to continuously train IDS models in distributed EV charging stations without exposing sensitive data (Almehdhar et al., 2024). Moreover, integration of blockchain-based authentication in next generation environments can ensure secure transactions and data integrity within charging networks, mitigating risks such as unauthorized access and fraudulent energy transactions (Mekkaoui, Mekour & Teggar, 2024).
Despite these advancements, securing 6G-enabled EV networks requires addressing several key challenges. The increased attack surface due to hyper connectivity exposes EV charging stations to more sophisticated cyber threats, including AI-powered adversarial attacks that can manipulate IDS models. Furthermore, while quantum-resistant cryptography is expected to strengthen the 6G security frameworks, its implementation remains in the early stages. Future research must focus on developing lightweight, real-time IDS solutions that leverage self-learning, explainable AI (XAI) and edge computing to enhance threat detection, mitigation and decision making within 6G-based EV ecosystems.
As EV adoption accelerates globally, the charging network has become a cornerstone of sustainable transportation. However, the increasing integration of communication technologies and connectivity within these networks introduces critical cybersecurity vulnerabilities. Attack surfaces are expanding across charging stations, vehicle networks, and Vehicle-to-Grid (V2G) communication links, posing substantial risks to service continuity, user privacy, and grid stability. These risks are further exacerbated by the emergence of cyber threats powered by 6G and AI, against which traditional intrusion detection systems (IDS) often fail due to challenges in handling class imbalance, real-time adaptability, and dynamic threat landscapes.
This study introduces a hybrid AI-enabled IDS specifically designed to secure 6G-enabled EV charging infrastructures. The proposed framework combines a VAE for feature extraction with XGBoost for robust classification, particularly effective in imbalanced data scenarios. The key contributions of this work are as follows.
The proposed VAE-XGBoost hybrid model exploits the VAE's ability to reduce data dimensionality and extract meaningful latent features, together with XGBoost's effectiveness in handling imbalanced datasets and improving classification accuracy.
The hybrid model is evaluated on the CICEVSE2024 dataset and performs better than conventional methods such as KNN and Random Forest in terms of accuracy, precision, recall, and F1-score.
The model achieves a balanced trade-off between computational efficiency and detection performance, enabling deployment in real-time, resource-constrained EV environments.
To our knowledge, this study presents a comprehensive per-class evaluation in all attack categories. Unlike previous work, for example, Benfarhat et al. (2025a), which often reports only cumulative performance metrics, our analysis includes per-class accuracy, precision, recall, and F1-scores, allowing a more nuanced understanding of model behavior, especially in minority classes.
The proposed framework effectively detects a broad spectrum of threats, but challenges remain in distinguishing low-frequency and overlapping attacks and in optimizing computational performance for large-scale applications. However, this work represents a significant advancement towards secure, resilient, and intelligent EV charging networks.
The remainder of this article is structured as follows. ‘Literature Review’ surveys existing research on AI-driven IDS in EV charging networks, highlighting challenges such as class imbalance, real-time detection limitations, and evolving cyber threats. ‘Intrusion Detection in EV Charging Networks’ describes the machine learning techniques used for intrusion detection in the 6G EV charging network. ‘Research Methodology’ details the methodology behind the VAE-XGBoost framework, including data preprocessing, feature selection, and model training, and outlines the evaluation metrics used for performance assessment. ‘Performance Evaluation’ presents and discusses the experimental findings, comparing the VAE-XGBoost model with baseline approaches such as K-Nearest Neighbors and Random Forest to demonstrate its effectiveness in detecting cyber threats. Finally, ‘Conclusion’ summarizes key findings, discusses the study’s contributions to EV cybersecurity, and suggests future directions.
Literature review
With the rise of 6G technology and the proliferation of EV charging networks, ensuring cybersecurity within these infrastructures has become critical. IDS serve as fundamental security mechanisms, continuously monitoring networks to detect unauthorized access attempts and malicious activities. IDS can be broadly classified into Network-Based IDS (NIDS), which analyzes network traffic patterns, and Host-Based IDS (HIDS), which focuses on individual systems by scrutinizing logs, file integrity, and user activity (Sowmya & Anita, 2023). The integration of these approaches into hybrid IDS improves security, using both network and system-level information to detect sophisticated cyber threats.
Various IDS methodologies have been explored to protect vehicular networks against emerging cyber threats. Signature-based IDS identify known attack patterns, while anomaly-based detection recognizes deviations from normal behavior. Additionally, behavior-based detection learns the activities of the system over time to identify possible compromises (Sowmya & Anita, 2023). However, IDS effectiveness is often challenged by false positives and false negatives, necessitating advanced techniques such as deep learning and hybrid detection models to enhance accuracy and reliability.
Machine learning-based intrusion detection has been extensively studied for EV networks. In Kosmanos et al. (2020), a machine learning-based IDS was proposed to counter spoofing attacks on connected EVs. The system integrates a probabilistic cross-layer approach with Position Verification using Relative Speed (PVRS), detecting spoofing attacks with an accuracy of 91% in urban simulations involving 50 to 200 EVs. In Guo, Ye & Yang (2020), a physics-guided machine learning approach is introduced that uses physical attributes such as voltage, current, and torque to raise the detection accuracy of cyber attacks to 99%, outperforming conventional data-driven models.
Deep learning techniques have demonstrated significant promise in intrusion detection for EV networks. Zhang et al. (2020) utilizes Long Short-Term Memory (LSTM) networks to detect Sybil attacks in vehicular networks without requiring prior knowledge, achieving a detection rate of 95% using Cooperative Awareness Message (CAM) datasets. Similarly, Mehedi et al. (2021) proposes an IDS leveraging deep transfer learning and the LeNet model on Controller Area Network (CAN) data, achieving an accuracy of 98.10% in detecting flooding, fuzzing, and spoofing attacks.
Comparative studies on machine learning models for CAN-based intrusion detection have been conducted. Moulahi et al. (2021) evaluates multi-layer perceptron, RF, decision tree, and support vector machine using data from a KIA Soul vehicle, identifying the random forest as the most effective model, achieving an accuracy of 98.53%. In Yang, Moubayed & Shami (2021), a multi-tiered hybrid IDS is introduced, integrating anomaly and signature-based detection with decision trees, random forest, and K-Means clustering optimized through Bayesian techniques, achieving 99.99% accuracy on the CAN-intrusion dataset.
Several lightweight IDS models have been developed for in-vehicle networks. Basavaraj & Tayeb (2022) introduces a deep neural network-based IDS for CAN, achieving 98.68% accuracy in detecting DoS and fuzzing attacks. Lin et al. (2022) explores a VGG16 deep learning classifier trained on the Hacking and Countermeasure Research Lab (HCRL) dataset, demonstrating 97.82% precision in detecting intrusions into the vehicle network. In Zhang & Ma (2022), a hybrid approach is proposed that combines rule-based and machine learning detection, utilizing a deep neural network (DNN) to improve accuracy to 99% across different vehicle models.
For real-time intrusion detection, ElKashlan et al. (2023) employs a decision table and trained filtered classifiers on the IoT-23 dataset, achieving 99.99% precision in detecting distributed denial-of-service (DDoS) attacks with a response time of 0.75 s. Transformer-based models are introduced in Nguyen, Nam & Kim (2023) for CAN bus intrusion detection, significantly improving replay attack detection with 99.75% precision and 99.42% recall.
Security solutions specifically designed for vehicular networks enabled by 5G and 6G have also been explored. Sousa, Magaia & Silva (2023) introduces an IDS to identify flooding attacks in the 5G-enabled Internet of Vehicles (IoV). The system employs DT and RF classifiers, achieving an accuracy of 97% on datasets generated using NS-3 and SUMO simulations. Almehdhar et al. (2024) surveys advanced IDS techniques for Intelligent Vehicle Networks (IVNs), emphasizing deep learning models, federated learning, and transformer architectures.
Emerging techniques for securing EV charging stations have been proposed in Kilichev, Turimov & Kim (2024), where a Network Intrusion Detection System (NIDS) that integrates Convolutional Neural Network (CNN), LSTM, and Gated Recurrent Unit (GRU) models and is trained on the Edge-IIoTset dataset achieves 100% accuracy in binary classification and 97.44% in multi-class classification. A bio-inspired IDS, based on the Grouping Cockroaches Classifier (GCC), is introduced in Mekkaoui, Mekour & Teggar (2024) for Vehicle-to-Grid (V2G) networks, detecting DoS and Man-in-the-Middle (MitM) attacks with an accuracy of 98.93%.
The growing connectivity of EVs introduces significant cybersecurity risks, with documented attacks causing operational failures and underscoring the need for robust security measures (Jeong & Choi, 2022). Research has examined vulnerabilities in charging infrastructure, using machine learning and generative models to analyze and secure charging data. For example, Buechler et al. (2021) proposed a GAN-based model (EVGen) that achieved a 15% improvement in capturing the temporal dynamics of EV charging patterns compared to Gaussian Mixture Models. Acharya et al. (2020) used supervised learning techniques to detect anomalies in EV charging data, achieving a detection accuracy of 97.5%. In parallel, studies have addressed safety issues during the charging process by evaluating malfunction risks and proposing standardized protocols. Wang et al. (2019) analyzed 12 case studies of large-scale charging station failures, while Hamdare et al. (2023) validated cybersecurity risks using data from real-world charging sessions, identifying more than 25 distinct vulnerabilities in communication protocols and physical layers.
Key threats such as false data injection (FDIA), denial-of-service (DoS), and MitM attacks target EV charging networks, especially in 5G-enabled SCADA environments (Akbarian et al., 2024). Akbarian et al. (2024) introduced an ML-based detection framework that reduced the impacts of FDIA by more than 80% and achieved an accuracy of 94.6% in attack classification. Defense mechanisms using fuzzy logic and blockchain-enhanced authentication protocols have also shown promise: Huang et al. (2018) proposed a blockchain-based model that improves transaction integrity with a latency overhead below 3 ms. Ahalawat, Adepu & Gardiner (2022) developed a lightweight authentication protocol that achieved a 98.2% authentication success rate in simulation. Moreover, dynamic wireless charging introduces new security and privacy challenges; Mohanty, Suresh Babu & Salkuti (2022) classified five main threats and proposed mitigation protocols rated for 96% reliability in simulated dynamic EV environments. Despite these advancements, persistent vulnerabilities, such as communication spoofing and data injection, highlight the need for robust hybrid cybersecurity frameworks tailored to the complex EV ecosystem (Hamdare et al., 2023).
Innovative hybrid approaches have also been developed for vehicular IDS. Hoang & Kim (2024) presents a Supervised Contrastive (SupCon) ResNet model for CAN bus intrusion detection, achieving a nearly perfect 100% precision and an F1-score of 99.97%. Ul Islam Khan et al. (2024) designs a MATLAB Simulink prototype using a Voting Classifier combining decision tree, logistic regression, and XGBoost for fault detection in EVs, achieving an accuracy of 98.3%.
These studies collectively highlight the rapid advances in IDS methodologies tailored to secure 6G-enabled EV charging networks and vehicular communications. The integration of deep learning, hybrid models, and bio-inspired techniques is crucial to address the evolving threat landscape, ensuring robust protection against cyber threats in next-generation intelligent transportation systems.
Intrusion detection in EV charging networks
The rapid adoption of EVs demands a secure and resilient charging infrastructure to prevent cyber threats. The integration of 6G technologies improves efficiency, real-time data exchange, and Vehicle-to-Grid (V2G) interactions but introduces new attack surfaces (Wazeer et al., 2023). These vulnerabilities can lead to data breaches, DoS attacks, and unauthorized access, making IDS essential for cybersecurity in EV charging networks (Timilsina et al., 2023).
Several machine learning algorithms have been employed for intrusion detection:
K-nearest neighbors
A distance-based classifier effective in detecting known attack patterns, but limited by high-dimensional data and computational inefficiency (Suyal & Goyal, 2022). It classifies a new data point by considering the class of its k nearest neighbors using the Euclidean distance:
$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{1}$$
where $x$ and $y$ are data points in an $n$-dimensional feature space.
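As an illustration of the distance-based rule described above, the following minimal sketch classifies a query point by majority vote among its k nearest neighbors under Euclidean distance. The function name `knn_predict`, the toy feature vectors, and the labels are hypothetical and are not taken from the CICEVSE2024 dataset.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # math.dist computes the Euclidean distance between two equal-length points.
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two scaled features per sample.
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.85, 0.95)]
labels = ["none", "none", "none", "syn flood", "syn flood"]

prediction = knn_predict(points, labels, query=(0.8, 0.9), k=3)
print(prediction)
```

Every prediction requires a distance computation against all stored training points, which is one reason the method becomes costly on large, high-dimensional traffic data.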
Random forest
An ensemble learning model that improves classification accuracy by aggregating multiple decision trees (Amin et al., 2023). RF reduces overfitting but struggles with large-scale, high-dimensional data. The prediction function for RF is given by:
$$\hat{y} = \frac{1}{T} \sum_{t=1}^{T} h_t(x) \tag{2}$$
where $h_t$ represents individual decision trees and $T$ is the total number of trees.
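The aggregation step can be sketched as follows, assuming each tree outputs a class-probability vector and the forest averages these vectors before picking the most probable class. The three "trees" here are hypothetical hand-written rules, not trained models.

```python
def forest_predict_proba(trees, x):
    """Average the class-probability vectors returned by the individual trees."""
    votes = [tree(x) for tree in trees]
    n_classes = len(votes[0])
    return [sum(v[c] for v in votes) / len(votes) for c in range(n_classes)]


# Hypothetical trees: each maps a feature vector to [P(normal), P(attack)].
trees = [
    lambda x: [0.9, 0.1] if x[0] < 0.5 else [0.2, 0.8],
    lambda x: [0.8, 0.2] if x[1] < 0.5 else [0.3, 0.7],
    lambda x: [0.6, 0.4],
]

probs = forest_predict_proba(trees, (0.7, 0.9))
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```

Averaging smooths out the errors of individual trees, which is the mechanism behind the reduced overfitting mentioned above.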
XGBoost
A gradient-boosting model optimized for handling high-dimensional and imbalanced datasets, making it suitable for real-time intrusion detection (Guo et al., 2020). It uses a regularized objective function:
$$\mathcal{L} = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k} \Omega(f_k) \tag{3}$$
where $l$ is the loss function and $\Omega$ is the regularization term applied to the trees $f_k$ to prevent overfitting.
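To make the regularized objective concrete, the sketch below evaluates it numerically. It assumes the penalty form used in the XGBoost formulation, $\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_j w_j^2$, with $T$ the number of leaves and $w_j$ the leaf weights, and a binary log loss for $l$. The sample values, leaf weights, and hyperparameters are hypothetical.

```python
import math


def log_loss_sum(y_true, p_pred):
    """Summed binary cross-entropy, one possible choice for the loss term l."""
    return -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(y_true, p_pred)
    )


def tree_penalty(leaf_weights, gamma, lam):
    """Omega(f) = gamma * T + 0.5 * lambda * sum_j w_j^2, with T the leaf count."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)


def regularized_objective(y_true, p_pred, trees, gamma=1.0, lam=1.0):
    """Data-fit term plus the complexity penalty summed over all trees."""
    return log_loss_sum(y_true, p_pred) + sum(
        tree_penalty(leaves, gamma, lam) for leaves in trees
    )


# Hypothetical values: three samples and two small trees given by their leaf weights.
y_true = [1, 0, 1]
p_pred = [0.9, 0.2, 0.7]
trees = [[0.5, -0.5], [0.3, -0.2, 0.1]]
objective = regularized_objective(y_true, p_pred, trees)
print(round(objective, 4))
```

A tree with more leaves or larger leaf weights raises the objective even if it fits the data slightly better, so the penalty discourages overly complex trees.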
Despite their effectiveness, these models struggle with class imbalance, feature redundancy, and novel attack detection, necessitating a more advanced hybrid approach.
Research methodology
This section describes the research methodology used to develop the proposed intrusion detection framework. It presents the selection of the CICEVSE2024 dataset, data preprocessing techniques, and partitioning strategies. Furthermore, it elaborates on the architecture of the VAE-XGBoost model and its implementation as depicted in Fig. 2.
Figure 2: Research methodology for VAE-XGBoost based intrusion detection.
Data collection
The dataset from the Canadian Institute for Cybersecurity Electric Vehicle Supply Equipment (CICEVSE2024) (https://www.unb.ca/cic/datasets/evse-dataset-2024.html) (Buedi et al., 2024) is utilized for intrusion detection in EV charging systems. This dataset, collected from the University of New Brunswick, North America, provides a comprehensive benchmark for evaluating cybersecurity threats in EV infrastructure.
The CICEVSE2024 dataset consists of 6,166 entries in the host event table and 2,304 entries in the range index table. The two tables contain 911 and 915 columns, respectively, with 119 overlapping columns bearing identical names. These redundancies require careful preprocessing to ensure data integrity and avoid feature duplication during model training.
The dataset encompasses 17 distinct network activity classes, each assigned a unique encoded label. These labels standardize classification tasks, allowing machine learning models to effectively differentiate between normal operations and potential cyber threats. In the following, we provide a brief overview of the attack types covered in the dataset.
1. Aggressive scan: High-frequency probing to rapidly gather information about EV charging networks, often detected by excessive network traffic.
2. Backdoor: Bypasses authentication to grant unauthorized remote access, allowing attackers to control charging systems undetected.
3. Cryptojacking: Exploits charging station resources for unauthorized cryptocurrency mining, degrading performance or causing infrastructure damage.
4. ICMP flood: Overloads the network with ICMP packets, leading to DoS and disruption of EV system responses.
5. ICMP fragmentation: Exploits packet reassembly vulnerabilities, destabilizing the charging network and potentially causing service failures.
6. None: Represents normal traffic with no suspicious or malicious behavior.
7. OS fingerprinting: Identifies the operating system of EV charging stations to aid attackers in planning targeted exploits.
8. OS scan: Gathers detailed OS configuration data to expose potential security vulnerabilities.
9. Port scan: Identifies open ports on EV systems, helping attackers locate entry points for unauthorized access.
10. Push ack flood: Overwhelms the target with TCP push-ack packets, leading to a DoS attack.
11. Service detection: Identifies active services on EV systems to exploit service-specific vulnerabilities.
12. SYN flood: Sends excessive TCP SYN requests to exhaust system resources, preventing legitimate access.
13. SYN stealth: Executes a “half-open” scan by initiating but not completing TCP handshakes, evading standard detection methods.
14. Synonymous IP flood: Uses multiple similar-looking IP addresses to flood EV networks, making it difficult to distinguish between legitimate and malicious traffic.
15. TCP flood: Overloads the system with TCP packets, leading to service outages.
16. UDP flood: Sends a large volume of UDP packets to overwhelm the network, reducing service reliability.
17. Vulnerability scan: Identifies security weaknesses in EV charging systems that attackers can later exploit.
Data preprocessing
The dataset undergoes a comprehensive preprocessing pipeline, including data cleaning to handle missing and duplicate values, followed by normalization and transformation. Each step is outlined below.
Data loading and merging
The EV charging datasets, each capturing recorded events, are initially loaded. Since data from different sources may vary in structure, only the common columns are retained to maintain consistency. The shared features are consolidated into a unified dataset, serving as the foundation for further processing and analysis.
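The merging step described above can be sketched with pandas as follows. This is a minimal illustration, not the authors' code: the helper name `merge_on_common_columns`, the miniature tables, and their column names are hypothetical stand-ins for the two event tables.

```python
import pandas as pd


def merge_on_common_columns(first: pd.DataFrame, second: pd.DataFrame) -> pd.DataFrame:
    """Keep only the columns present in both tables, then stack the rows."""
    common = [col for col in first.columns if col in second.columns]
    return pd.concat([first[common], second[common]], ignore_index=True)


# Hypothetical miniature tables; the real tables have hundreds of columns.
table_a = pd.DataFrame(
    {"cpu_cycles": [10, 12], "cache_misses": [3, 4], "Attack": ["none", "backdoor"]}
)
table_b = pd.DataFrame(
    {"cpu_cycles": [11], "page_faults": [7], "Attack": ["syn flood"]}
)

merged = merge_on_common_columns(table_a, table_b)
print(merged)
```

Restricting both tables to the shared columns before concatenation avoids introducing columns that are entirely missing for one of the sources.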
Removing irrelevant columns
Certain columns, such as time, state, label, interface, and unit, do not directly contribute to intrusion detection and are removed to reduce noise and improve model efficiency. In addition, records with missing values are identified and removed to prevent inaccuracies that could affect model performance. This step improves computational efficiency while preserving the integrity of the dataset.
Label standardization and encoding
The Attack column, serving as the target variable, contained categorical labels with inconsistencies, including misspellings (e.g., serice detection) and variations in formatting (e.g., icmp fragmentation). These inconsistencies are corrected for uniformity before applying label encoding, which converts categorical labels into numerical values. This transformation ensures compatibility with machine learning algorithms, enabling accurate classification.
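A minimal sketch of label standardization followed by label encoding is given below. Only the misspelling "serice detection" comes from the text. The separator and casing variants in the example, the helper names, and the alphabetical code assignment are assumptions made for illustration.

```python
import re

# Known misspellings mapped to their canonical form. "serice detection" is the
# example given in the text; any further entries would be assumptions.
SPELLING_FIXES = {"serice detection": "service detection"}


def standardize_label(raw):
    """Lowercase, unify separators and whitespace, then fix known misspellings."""
    cleaned = re.sub(r"[\s_\-]+", " ", raw.strip().lower())
    return SPELLING_FIXES.get(cleaned, cleaned)


def encode_labels(raw_labels):
    """Map each standardized label to an integer code in alphabetical order."""
    cleaned = [standardize_label(label) for label in raw_labels]
    codes = {name: idx for idx, name in enumerate(sorted(set(cleaned)))}
    return [codes[name] for name in cleaned], codes


# Hypothetical raw labels showing formatting and spelling variants.
raw = ["ICMP-Fragmentation", "icmp fragmentation ", "serice detection", "Service Detection", "none"]
encoded, mapping = encode_labels(raw)
print(encoded, mapping)
```

Standardizing before encoding matters because an encoder would otherwise assign separate integer codes to spelling variants of the same attack class.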
Feature normalization
To ensure consistency between features with varying scales, a Min-Max scaler is applied to normalize all values within the range [0, 1]. This step prevents features with larger numerical values from dominating the learning process, promoting balanced training, and improving model stability.
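Min-Max scaling maps each feature value $x$ to $(x - x_{\min}) / (x_{\max} - x_{\min})$, computed per column. The sketch below shows this with NumPy. The handling of constant columns and the toy matrix are assumptions added for illustration.

```python
import numpy as np


def min_max_scale(features):
    """Scale every column of a 2-D array to the range [0, 1]."""
    features = np.asarray(features, dtype=float)
    col_min = features.min(axis=0)
    col_range = features.max(axis=0) - col_min
    # Constant columns would divide by zero; map them to 0 instead (assumption).
    col_range[col_range == 0] = 1.0
    return (features - col_min) / col_range


# Hypothetical feature matrix with very different column scales.
X = [[1, 10, 5], [3, 20, 5], [5, 40, 5]]
scaled = min_max_scale(X)
print(scaled)
```

In practice the minimum and range are often computed on the training split only and reused for the test split to avoid information leakage. The text does not state which variant was used.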
Attack distribution
The distribution of attack classes is illustrated in Fig. 3. The dataset is heavily imbalanced: three dominant classes, namely None, Backdoor, and Cryptojacking, account for approximately 75% of the total samples. Such a class imbalance poses a significant challenge in the development of effective machine learning models, as it can lead to biased predictions that favor the majority classes while underrepresenting the minority ones. In imbalanced datasets, standard learning algorithms tend to achieve a high overall accuracy by simply prioritizing frequent classes, often at the expense of detection performance on rare but critical attack types.
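The effect of imbalance on accuracy can be illustrated with a trivial baseline that always predicts the most frequent class. The label counts below are hypothetical and do not reproduce the actual CICEVSE2024 distribution.

```python
from collections import Counter


def majority_baseline_report(labels):
    """Accuracy and macro-averaged recall of always predicting the most common class."""
    counts = Counter(labels)
    majority, majority_count = counts.most_common(1)[0]
    accuracy = majority_count / len(labels)
    # Recall is 1 for the majority class and 0 for every other class.
    macro_recall = 1.0 / len(counts)
    return majority, accuracy, macro_recall


# Hypothetical label counts, not the real dataset distribution.
labels = ["none"] * 75 + ["port scan"] * 15 + ["syn flood"] * 10
majority, accuracy, macro_recall = majority_baseline_report(labels)
print(majority, accuracy, macro_recall)
```

The baseline reaches 75% accuracy without detecting a single attack, while its macro-averaged recall is only one third, which is why per-class metrics are more informative than overall accuracy on imbalanced data.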
Figure 3: Distribution of attack types in the CICEVSE2024 dataset.
Data partitioning
The final preprocessed dataset is divided into training and testing sets using a 70:30 ratio. A total of 5,927 samples are allocated for training, while 2,541 samples are reserved for evaluation. The dataset consists of 113 features that are used to train and test the proposed VAE-XGBoost model for intrusion detection in EV charging systems.
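The split sizes can be checked with a short sketch. It assumes a simple shuffled split without stratification. The function name `train_test_indices` and the seed are hypothetical, and the text does not state whether the split was stratified.

```python
import random


def train_test_indices(n_samples, train_fraction=0.7, seed=42):
    """Shuffle the sample indices and cut them into training and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# Total sample count implied by the reported split sizes.
n_total = 5927 + 2541
train_idx, test_idx = train_test_indices(n_total)
print(len(train_idx), len(test_idx))
```

With 8,468 samples, a 70% cut yields 5,927 training and 2,541 test samples, matching the figures reported above.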
A hybrid model for intrusion detection
To address the limitations of existing IDS in the EV charging infrastructure, we propose a hybrid model that integrates a variational autoencoder with XGBoost. This approach applies the VAE’s ability to learn compact, high-level representations of input data while utilizing XGBoost’s robust classification capabilities. By combining unsupervised feature extraction with a powerful gradient-boosting classifier, the proposed model enhances detection accuracy and reduces false positive rates.
Variational autoencoder for feature representation
Variational Autoencoders (VAEs) are generative models designed to learn low-dimensional latent representations of high-dimensional data, making them effective for anomaly detection in cybersecurity (Kingma & Welling, 2014). A VAE comprises an encoder that maps the input $x$ to a probabilistic latent space $z$ and a decoder that reconstructs $x$ from $z$. The training objective is to optimize the following variational loss function:
$$\mathcal{L}_{\text{VAE}} = \mathbb{E}_{q(z|x)}\left[\log p(x|z)\right] - D_{KL}\left(q(z|x) \,\|\, p(z)\right) \tag{4}$$
where $D_{KL}$ represents the Kullback-Leibler (KL) divergence, measuring the difference between the approximate posterior distribution $q(z|x)$ and the prior distribution $p(z)$. By minimizing this divergence, the VAE effectively captures essential latent features while preventing overfitting. This capability enables the extraction of discriminative attack patterns, mitigates feature redundancy, and improves the resilience of the model to class imbalance. As a result, the VAE-XGBoost framework improves the robustness of IDS models in detecting sophisticated cyber threats within EV charging ecosystems.
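The two terms of the variational objective can be computed as in the following sketch, written as a loss to minimize (the negative of the lower bound). It assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term. The function name `vae_loss` and the weighting argument `beta` are hypothetical.

```python
import numpy as np


def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Return (total, reconstruction, KL), each averaged over the batch."""
    # Squared-error reconstruction term, summed over features per sample.
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # Closed-form KL( N(mu, sigma^2) || N(0, I) ) for a diagonal Gaussian.
    kl = np.mean(-0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1))
    return recon + beta * kl, recon, kl


# Hypothetical batch: 3 samples, 4 input features, 2 latent dimensions.
x = np.ones((3, 4))
x_recon = np.ones((3, 4))      # perfect reconstruction
mu = np.ones((3, 2))           # posterior mean shifted away from the prior
log_var = np.zeros((3, 2))     # unit variance
total, recon, kl = vae_loss(x, x_recon, mu, log_var)
print(total, recon, kl)
```

Here the reconstruction term is zero and the KL term equals 1.0, so the whole loss comes from the posterior mean deviating from the prior.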
VAE training details
The VAE component of our framework uses a symmetric encoder-decoder architecture with two hidden layers (256 and 128 neurons) and a 20-dimensional latent space, chosen to balance feature compression and reconstruction fidelity. We employ the Evidence Lower Bound (ELBO) objective with a coefficient $\beta$ weighting the KL divergence term, preventing posterior collapse while maintaining reconstruction quality (input: 64-dimensional scaled features). The model is trained with the Adam optimizer (lr = 0.001, batch size = 64), and training is considered converged when the reconstruction loss (MSE) stays below 0.01 for five consecutive epochs, ensuring robust feature extraction for imbalanced attack classes.
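The stopping rule described above can be expressed as a small helper. The function name `has_converged` and the loss values are hypothetical. Only the threshold of 0.01 and the window of five epochs come from the text.

```python
def has_converged(loss_history, threshold=0.01, patience=5):
    """True once the last `patience` epoch losses are all below `threshold`."""
    if len(loss_history) < patience:
        return False
    return all(loss < threshold for loss in loss_history[-patience:])


# Hypothetical per-epoch reconstruction losses (MSE).
epoch_losses = [0.5, 0.02, 0.009, 0.008, 0.007, 0.006, 0.005]

history = []
stopped_at = None
for epoch, loss in enumerate(epoch_losses, start=1):
    history.append(loss)
    if has_converged(history):
        stopped_at = epoch
        break

print(stopped_at)
```

Requiring several consecutive epochs below the threshold guards against stopping on a single lucky epoch.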
XGBoost for classification
Features are extracted using VAE and then XGBoost is applied for the final classification. Applying its gradient-boosting mechanism to improve accuracy, regularization techniques to prevent overfitting, and scalability for real-time IDS applications. The final intrusion detection decision is formulated as follows:
$$\hat{y} = \mathrm{XGBoost}(z) \tag{5}$$
where $z$ represents the latent space representation obtained from the VAE. By combining the feature extraction capabilities of the VAE with the classification power of XGBoost, the proposed hybrid model detects cyber threats in EV charging networks with improved accuracy and robustness.
A VAE is a generative model that encodes input data in a probabilistic latent space, capturing complex data distributions rather than mapping input to fixed feature points (Subray, Tschimben & Gifford, 2021). This latent representation enables the generation of new samples by drawing from the learned distribution, preserving essential patterns in the data. XGBoost is an advanced machine learning algorithm that refines traditional gradient boosting by optimizing computational efficiency and predictive accuracy. It constructs an ensemble of decision trees, where each tree iteratively corrects the residual errors of the previous ones to minimize the loss function (Guo et al., 2020).
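For reference, the tree ensemble described above is usually written in the following form in the XGBoost literature. The symbols are assumptions taken from that standard formulation, since this section does not state them explicitly: $f_k$ is the $k$-th tree, $z_i$ the latent feature vector of sample $i$, $l$ the prediction loss, $T$ the number of leaves of a tree, $w$ its leaf weights, and $\gamma$, $\lambda$ the regularization coefficients.

```latex
\hat{y}_i = \sum_{k=1}^{K} f_k(z_i), \qquad
\mathcal{L} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i\right) + \sum_{k=1}^{K} \Omega(f_k), \qquad
\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^{2}
```

The first expression is the additive prediction over $K$ trees, and the regularization term $\Omega$ penalizes trees with many leaves or large leaf weights.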
In the VAE-XGBoost framework for intrusion detection in EV charging systems, the VAE encodes input data into a structured latent space $z$, defined as:
$$z = \mu + \sigma \odot \epsilon, \quad \epsilon \sim \mathcal{N}(0, I) \tag{6}$$
where $\mu$ is the mean, $\sigma$ is the standard deviation, and $\epsilon$ is a random sample from a standard normal distribution. This probabilistic representation enables the detection of deviations from normal charging behavior. During VAE training, the decoder reconstructs the input from this latent representation. The extracted latent features are then fed into XGBoost, which calculates node loss to determine the most informative splits and iteratively refines the predictions by adding trees that minimize residual errors. After training K trees, the final prediction of a sample is the aggregated score of all the corresponding leaf nodes. XGBoost thus classifies the latent representations, distinguishing between normal and anomalous activity and enhancing the detection of potential cyber intrusions. The integration of VAE and XGBoost therefore uses the VAE for dimensionality reduction and feature extraction and XGBoost for classification, which enables effective cyber threat detection while maintaining computational efficiency.
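The sampling step of Eq. (6) can be illustrated with a short standard-library sketch. It takes the per-dimension mean and standard deviation as plain lists; the function name `reparameterize` and the optional `rng` argument are illustrative choices, not part of the paper.

```python
import random

def reparameterize(mu, sigma, rng=random):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

# A seeded generator makes the draw reproducible; 20 dimensions as in the text.
rng = random.Random(0)
z = reparameterize(mu=[0.0] * 20, sigma=[1.0] * 20, rng=rng)
```

Because the randomness enters only through `eps`, the sample remains a deterministic function of the mean and standard deviation, which is what allows gradients to flow through the encoder during training.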
As described in Algorithms 1 and 2, the VAE-XGBoost framework integrates the generative power of VAEs with the predictive strength of XGBoost to detect intrusions in EV charging networks. As presented in Algorithm 1, input data undergoes pre-processing, normalization, and partitioning into training and testing sets before being fed into VAE. VAE encodes features into a compact latent space, balancing reconstruction quality with latent space regularization. These extracted features are then classified using XGBoost as presented in Algorithm 2, which minimizes an objective function comprising prediction loss and regularization. The model is fine-tuned by optimizing hyperparameters such as tree depth and learning rate. Performance is evaluated using classification metrics, ensuring robust detection of cyber threats.
| Algorithm 1 |
|---|
| Require: Encoded dataset $D$ |
| Ensure: Trained VAE and extracted latent features $Z$ |
| 1: Feature Engineering: Extract features $X$ and target $y$ |
| 2: Data Normalization: Apply MinMaxScaler |
| 3: Convert normalized data back into a DataFrame |
| 4: Split Data: Divide into training and testing sets |
| 5: Define VAE Parameters: |
| 6: Train Variational Autoencoder (VAE) |
| 7: Define encoder network: |
| • 512, 256, 128 neurons with ReLU activation |
| • Batch Normalization and Dropout (0.3) |
| 8: Apply reparameterization trick: $z = \mu + \sigma \odot \epsilon$ |
| 9: Define decoder network: |
| • 128, 256, 512 neurons with ReLU activation |
| • Batch Normalization and Dropout (0.3) |
| • Sigmoid output layer |
| 10: Train VAE with loss: reconstruction loss + KL divergence (Eq. 4) |
| 11: Extract latent space features: $Z = \mathrm{Encoder}(X)$ |
| 12: Latent Space Normalization: Scale features for XGBoost |

| Algorithm 2 |
|---|
| Require: Scaled latent features $Z_{\text{scaled}}$, labels $y$ |
| Ensure: Trained XGBoost model and intrusion classification |
| 1: Train XGBoost Classifier on Latent Features |
| 2: Define XGBoost model: |
| 3: Train on scaled latent space features: $Z_{\text{scaled}}$ with labels $y$ |
| 4: Evaluate Model: Predict using test data |
| 5: Compute accuracy, precision, recall, and F1-score |
| 6: Intrusion Detection |
| 7: Encode new data: $z_{\text{new}} = \mathrm{Encoder}(x_{\text{new}})$ |
| 8: Normalize: $z_{\text{scaled}} = \mathrm{Scaler}(z_{\text{new}})$ |
| 9: Classify: $\hat{y} = \mathrm{XGBoost}(z_{\text{scaled}})$ |
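The normalization and inference steps of the two algorithms can be sketched in plain Python as follows. This is a simplified stand-in, not the authors' code: `fit_min_max` and `apply_min_max` mimic min-max scaling without any library, and `encode` and `classify` are hypothetical callables standing in for the trained VAE encoder and the trained XGBoost model. Mapping a constant column to 0.0 is an assumption made here to avoid division by zero.

```python
def fit_min_max(rows):
    """Return per-column (min, max) computed from the given rows."""
    return [(min(col), max(col)) for col in zip(*rows)]

def apply_min_max(rows, stats):
    """Scale each column to [0, 1] using previously fitted statistics."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, (lo, hi) in zip(row, stats)]
        for row in rows
    ]

def detect(x_new, encode, latent_stats, classify):
    """Inference path: encode the sample, normalize its latent vector, classify."""
    z = encode(x_new)
    z_scaled = apply_min_max([z], latent_stats)[0]
    return classify(z_scaled)

# Toy usage with stand-in components (identity encoder, threshold classifier).
stats = fit_min_max([[0, 10], [5, 20], [10, 30]])
label = detect([5, 20], encode=lambda x: x, latent_stats=stats,
               classify=lambda z: "attack" if sum(z) > 0.9 else "none")
```

Passing the fitted statistics into `detect` keeps the scaling applied at inference time identical to the scaling used when the classifier was trained.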
Unlike principal component analysis or conventional autoencoders, VAEs leverage probabilistic encoding to manage imbalanced EV attack data more effectively while preserving the latent structure, an essential property for identifying rare and subtle intrusion patterns (Kingma & Welling, 2014). In parallel, XGBoost demonstrates superior performance compared to traditional classifiers such as Support Vector Machines (SVM), which are constrained by kernel scalability, and Random Forest (RF), which often faces limitations in handling large-scale data. Its integrated regularization and parallel processing capabilities enable it to achieve higher accuracy with significantly lower inference latency, making it particularly suitable for real-time intrusion detection in EV networks (Guo et al., 2020).
Performance evaluation
The research is conducted in a development environment using Visual Studio Code with Python 3.11.5 on a 64-bit Microsoft Windows 10 system. The experiments are performed on a machine powered by an Intel 2.40 GHz processor with 8GB of RAM, ensuring efficient execution of computational tasks. Various machine learning models, including VAE-XGBoost, XGBoost, KNN, and RF, are used to detect intrusions in EV charging systems. These models are trained and evaluated using a curated dataset of network traffic, allowing a comprehensive assessment of their ability to identify potential security threats. Performance metrics such as accuracy, precision, recall, and F1-score discussed in ‘Performance Metrics’ are used to compare their effectiveness, ensuring a robust evaluation of intrusion detection capabilities.
Performance metrics
To evaluate model performance, standard classification metrics such as the confusion matrix, accuracy, precision, recall, and F1-score are used, as discussed in this section.
Confusion matrix
The confusion matrix is a contingency table used to assess the performance of a classification model by comparing predicted labels with actual labels, as shown in Table 1.
| Predicted class | Actual class: Positive | Actual class: Negative |
|---|---|---|
| Positive | True Positive (TP) | False Positive (FP) |
| Negative | False Negative (FN) | True Negative (TN) |
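As an illustration of how the four cells are counted, the following minimal sketch tallies them for a binary labeling in which one class is treated as positive. The function name `confusion_counts` and the toy labels are illustrative only. In the multi-class setting of this study, the same counting is applied per class in a one-vs-rest fashion.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for one class treated as the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

# Toy labels: 1 = attack, 0 = normal traffic.
counts = confusion_counts(y_true=[1, 1, 0, 0, 1], y_pred=[1, 0, 0, 1, 1])
```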
Accuracy
Accuracy measures the proportion of correctly classified instances, that is, true positives and true negatives, relative to the total number of instances. The accuracy is computed as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$
Precision
Precision quantifies the reliability of positive predictions by calculating the proportion of correctly identified positive instances among all instances predicted as positive. It is defined as:
$$\text{Precision} = \frac{TP}{TP + FP} \tag{8}$$
Recall
Recall, also known as sensitivity or the true positive rate, measures the proportion of actual positive instances that the model correctly identifies. It is computed as follows:
$$\text{Recall} = \frac{TP}{TP + FN} \tag{9}$$
F1-score
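The F1-score is the harmonic mean of precision and recall, providing a single metric that balances both false positives and false negatives. It is calculated as:
$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{10}$$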
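The four metrics in Eqs. (7) to (10) can be computed directly from the confusion-matrix counts, as in the minimal sketch below. The function name `classification_metrics` is illustrative. Returning 0.0 when a denominator is zero is a convention assumed here, and it corresponds to classes that a model never predicts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Toy counts: 2 true positives, 1 false positive, 1 false negative, 1 true negative.
metrics = classification_metrics(tp=2, fp=1, fn=1, tn=1)
```

With these toy counts, precision and recall are both 2/3, so the F1-score is also 2/3, while accuracy is 3/5.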
These evaluation metrics align well with the cybersecurity requirements of EV systems, such as operational continuity, timely threat detection, and assessment of attack severity. These are widely adopted metrics in IDS research (Mehedi et al., 2021; Zhang & Ma, 2022).
Simulation results
This section presents the performance evaluation of KNN, RF, XGBoost, and VAE-XGBoost. The effectiveness of each algorithm is assessed using the confusion matrix, accuracy, precision, recall, and F1-score.
Performance of KNN algorithm
Figure 4 presents the confusion matrix for the KNN model, and the corresponding per-class precision, recall, and F1-score, together with the overall accuracy, are presented in Table 2. Figure 4 shows strong classification performance for the main threat categories, with high accuracy in detecting backdoor and cryptojacking. However, some misclassifications are observed, especially for aggressive scans, which are largely undetected and assigned to the none class. Likewise, minor classification errors occur among flood-related threats, such as ICMP flood and ICMP fragmentation, likely due to overlapping characteristics. Low-frequency attack classes, such as push-ack flood and synonymous IP flood, are more prone to misclassification, which can be attributed to their limited representation in the dataset.
Figure 4: Confusion matrix for KNN algorithm.
| Attack type | KNN Precision | KNN Recall | KNN F1 | RF Precision | RF Recall | RF F1 | XGBoost Precision | XGBoost Recall | XGBoost F1 |
|---|---|---|---|---|---|---|---|---|---|
| aggressive-scan | 0.25 | 0.02 | 0.03 | 0.50 | 0.04 | 0.07 | 0.26 | 0.26 | 0.26 |
| backdoor | 0.90 | 0.98 | 0.94 | 0.87 | 0.98 | 0.92 | 0.95 | 0.98 | 0.97 |
| cryptojacking | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| icmp-flood | 0.51 | 0.79 | 0.62 | 0.46 | 0.50 | 0.48 | 0.49 | 0.47 | 0.48 |
| icmp-fragmentation | 0.40 | 0.07 | 0.12 | 0.74 | 0.52 | 0.61 | 0.66 | 0.65 | 0.65 |
| none | 0.86 | 1.00 | 0.93 | 0.79 | 1.00 | 0.88 | 0.96 | 0.97 | 0.96 |
| os-fingerprinting | 0.28 | 0.41 | 0.33 | 0.20 | 0.05 | 0.09 | 0.36 | 0.32 | 0.34 |
| os-scan | 0.23 | 0.56 | 0.32 | 0.00 | 0.00 | 0.00 | 0.25 | 0.28 | 0.26 |
| port-scan | 0.90 | 0.15 | 0.26 | 0.93 | 0.23 | 0.37 | 0.44 | 0.33 | 0.38 |
| push-ack-flood | 0.39 | 0.45 | 0.41 | 0.51 | 0.66 | 0.57 | 0.49 | 0.55 | 0.52 |
| service-detection | 0.20 | 0.13 | 0.16 | 0.31 | 0.23 | 0.27 | 0.40 | 0.30 | 0.34 |
| syn-flood | 0.67 | 0.53 | 0.59 | 0.80 | 0.53 | 0.63 | 0.81 | 0.66 | 0.72 |
| syn-stealth | 0.20 | 0.16 | 0.17 | 0.19 | 0.11 | 0.13 | 0.20 | 0.21 | 0.21 |
| synonymous-ip-flood | 0.38 | 0.14 | 0.20 | 0.47 | 0.19 | 0.27 | 0.47 | 0.50 | 0.49 |
| tcp-flood | 0.78 | 0.49 | 0.60 | 0.86 | 0.49 | 0.62 | 0.69 | 0.68 | 0.68 |
| udp-flood | 0.54 | 0.61 | 0.57 | 0.53 | 0.67 | 0.59 | 0.62 | 0.50 | 0.55 |
| vuln-scan | 0.32 | 0.17 | 0.22 | 0.00 | 0.00 | 0.00 | 0.15 | 0.17 | 0.16 |
| Accuracy | 0.82 | | | 0.82 | | | 0.84 | | |
| Macro Avg | 0.52 | 0.45 | 0.44 | 0.54 | 0.42 | 0.44 | 0.54 | 0.52 | 0.53 |
| Weighted Avg | 0.80 | 0.82 | 0.79 | 0.78 | 0.82 | 0.78 | 0.84 | 0.84 | 0.84 |
Table 2 presents the precision, recall, and F1-score for all classes, showcasing the strengths and limitations of the model. The KNN model demonstrates high reliability in classifying major threat categories such as backdoor, cryptojacking, and none, achieving strong performance metrics. However, its effectiveness diminishes for lower-frequency attack classes. Aggressive scan, for instance, records a notably low F1-score of 3%, while synonymous IP flood reaches only 20%, indicating significant challenges in differentiating these threats. Similarly, service detection and SYN stealth exhibit poor classification performance, with F1-scores of 16% and 17%, respectively. Moderate performance is observed for port scan and push-ack flood, which attain F1-scores of 26% and 41%. The macro-average F1-score of 44% reflects substantial variability among classes, while the weighted average of 79% and an overall accuracy of 82% indicate that the model performs well for dominant categories but struggles with less frequent ones.
Performance of random forest algorithm
Figure 5 presents the confusion matrix for the random forest model; the corresponding classification report is given in Table 2. The model classifies high-frequency threats, such as backdoor and cryptojacking, with exceptional accuracy. However, it struggles to differentiate threats with similar behavioral patterns. For instance, aggressive scan is frequently misclassified as backdoor or none due to overlapping network signatures. Similarly, flood-based threats, particularly ICMP flood and ICMP fragmentation, exhibit misclassifications, suggesting limitations in feature representation for these categories. Additionally, lower-frequency threats, such as push-ack flood and synonymous IP flood, show higher misclassification rates, likely due to their underrepresentation in the training data.
Figure 5: Confusion matrix for random forest algorithm.
Table 2 provides a detailed performance breakdown of the RF model across all threat categories. The model performs well on high-frequency classes, with recall between 98% and 100% and F1-scores of 92%, 100%, and 88% for backdoor, cryptojacking, and none, respectively. However, its ability to classify less frequent threats remains a challenge. Aggressive scan achieves an F1-score of only 7%, while synonymous IP flood reaches 27%, indicating difficulty in distinguishing low-volume attacks. Low F1-scores are also observed for service detection and SYN stealth, at 27% and 13%, respectively, and os-scan and vuln-scan are not detected at all (F1-score of 0%), highlighting the model's struggles with subtle attack signatures. The model performs moderately on port scan and push-ack flood, with F1-scores of 37% and 57%, respectively. The macro-average F1-score of 44% reflects significant class imbalance, while the weighted average of 78% and an overall accuracy of 82% suggest that the model excels in recognizing dominant threats but lacks robustness for rare attack types. These results emphasize the need for enhanced feature engineering and dataset balancing to improve classification performance across all threat categories.
Performance of XGBoost algorithm
Figure 6 shows the classification performance of the XGBoost model in various network threats. The model excels in detecting high-frequency threats, achieving near-perfect accuracy for backdoor and cryptojacking. However, distinguishing closely related threats remains challenging. Aggressive scan is occasionally misclassified as OS fingerprinting or syn-stealth, likely due to similarities in their detection patterns. Minor misclassifications also occur among flood-based attacks, such as ICMP flood and ICMP fragmentation, suggesting overlapping characteristics. Lower-frequency threats, including push-ack flood and synonymous IP flood, exhibit occasional misclassifications, likely due to their limited representation in the dataset. Similarly, service detection shows some ambiguity, as its features closely resemble those of other scanning behaviors.
Figure 6: Confusion matrix of XGBoost algorithm.
Table 2 highlights the strong performance of the XGBoost model in key threat categories. High-frequency classes such as backdoor, cryptojacking, and none achieve near-perfect precision, recall, and F1-scores ranging from 95% to 100%. However, the model struggles with lower-frequency and overlapping categories. For instance, aggressive-scan reaches an F1-score of only 26%, while synonymous-ip-flood scores 49%. Additional classification challenges appear in service-detection and syn-stealth, which achieve F1-scores of just 34% and 21%, respectively. Despite these inconsistencies, the model maintains a strong overall performance, with a macro-average F1-score of 53%, a weighted average of 84%, and an overall accuracy of 84%.
Performance of VAE-XGBoost algorithm
Figure 7 demonstrates the robust classification performance of the VAE-XGBoost model, particularly for high-frequency threats such as backdoor and cryptojacking. However, distinguishing between closely related threats remains challenging. For example, aggressive scan is occasionally misclassified as os scan or port scan, likely due to similarities in their scanning behaviors. Similarly, flood-based threats like ICMP flood and ICMP fragmentation show misclassifications, suggesting that improved feature extraction could enhance their differentiation. Additionally, lower-frequency threats, such as push-ack flood, exhibit occasional misclassifications, likely due to the limited number of instances available for training.
Figure 7: Confusion matrix and classification report of VAE-XGBoost algorithm.
Table 3 further highlights the model's effectiveness in classifying key network threats. High-frequency categories, such as backdoor, cryptojacking, and none, achieve near-perfect precision, recall, and F1-scores of approximately 98% to 100%. ICMP fragmentation attains an F1-score of 81%. However, classification remains challenging for less frequent categories, where lower F1-scores are observed, such as aggressive scan at 43%, vulnerability scan at 28%, OS fingerprinting at 20%, and service detection and SYN stealth, both at 46%. Despite variability across classes, as reflected in a macro-average F1-score of 65%, the model maintains strong overall performance, achieving a weighted average and accuracy of 89%.
| Attack type | VAE-XGBoost Precision | VAE-XGBoost Recall | VAE-XGBoost F1 |
|---|---|---|---|
| aggressive-scan | 0.40 | 0.46 | 0.43 |
| backdoor | 0.99 | 0.99 | 0.99 |
| cryptojacking | 1.00 | 1.00 | 1.00 |
| icmp-flood | 0.71 | 0.58 | 0.64 |
| icmp-fragmentation | 0.80 | 0.81 | 0.81 |
| none | 0.98 | 1.00 | 0.99 |
| os-fingerprinting | 0.19 | 0.22 | 0.20 |
| os-scan | 0.50 | 0.56 | 0.53 |
| port-scan | 0.64 | 0.47 | 0.54 |
| push-ack-flood | 0.67 | 0.76 | 0.72 |
| service-detection | 0.44 | 0.47 | 0.46 |
| syn-flood | 0.81 | 0.89 | 0.85 |
| syn-stealth | 0.46 | 0.46 | 0.46 |
| synonymous-ip-flood | 0.61 | 0.61 | 0.61 |
| tcp-flood | 0.79 | 0.73 | 0.76 |
| udp-flood | 0.84 | 0.72 | 0.78 |
| vuln-scan | 0.29 | 0.26 | 0.28 |
| Accuracy | 0.89 | | |
| Macro Avg | 0.66 | 0.65 | 0.65 |
| Weighted Avg | 0.89 | 0.89 | 0.89 |
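As a consistency check on Tables 2 and 3, the macro-average F1-score of each model can be recomputed as the unweighted mean of its 17 per-class F1-scores. The values below are copied from the tables in row order. The helper name `macro_average` is illustrative.

```python
# Per-class F1-scores from Tables 2 and 3 (17 classes, same row order as the tables).
f1_scores = {
    "KNN": [0.03, 0.94, 1.00, 0.62, 0.12, 0.93, 0.33, 0.32, 0.26,
            0.41, 0.16, 0.59, 0.17, 0.20, 0.60, 0.57, 0.22],
    "Random forest": [0.07, 0.92, 1.00, 0.48, 0.61, 0.88, 0.09, 0.00, 0.37,
                      0.57, 0.27, 0.63, 0.13, 0.27, 0.62, 0.59, 0.00],
    "XGBoost": [0.26, 0.97, 1.00, 0.48, 0.65, 0.96, 0.34, 0.26, 0.38,
                0.52, 0.34, 0.72, 0.21, 0.49, 0.68, 0.55, 0.16],
    "VAE-XGBoost": [0.43, 0.99, 1.00, 0.64, 0.81, 0.99, 0.20, 0.53, 0.54,
                    0.72, 0.46, 0.85, 0.46, 0.61, 0.76, 0.78, 0.28],
}

def macro_average(values):
    """Unweighted mean: every class counts equally, regardless of its frequency."""
    return sum(values) / len(values)

macro_f1 = {model: round(macro_average(v), 2) for model, v in f1_scores.items()}
print(macro_f1)
```

The recomputed values (0.44, 0.44, 0.53, and 0.65) match the Macro Avg rows of the tables. Because every class carries equal weight in this average, the rare scan and flood classes pull it well below the weighted averages.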
Performance comparison with existing work
Table 4 highlights the performance of VAE-XGBoost, which outperforms both the classical models evaluated by Buedi et al. (2024) and our own baselines, achieving an accuracy of 88.75%. This is an improvement of roughly 9.9 percentage points over the previous best accuracy of 78.87%, recorded by the Random Forest model in Buedi et al. (2024). Furthermore, VAE-XGBoost performs consistently across evaluation metrics, with precision, recall, and F1-score all exceeding 88%. This underscores the effectiveness of the VAE in extracting meaningful features while leveraging XGBoost's robust classification capabilities. The combination not only enhances predictive accuracy but also yields stable results across metrics, which suggests potential for broader application in complex classification tasks, particularly in environments that require efficient feature representation and strong generalization ability. Against these classical baselines, VAE-XGBoost is superior in accuracy, precision, recall, and F1-score while maintaining computational efficiency, a critical requirement for real-time EV charging network security.
Table 4 also compares the proposed model with the deep learning models of Benfarhat et al. (2025a), namely LSTM, RNN, DNN, and Temporal Convolutional Network (TCN), evaluated across 17 distinct attack classes. The proposed VAE-XGBoost model is competitive, achieving performance nearly equivalent to LSTM and superior to both RNN and DNN on all four metrics. Although TCN yields the highest accuracy, this comes at a significant computational cost of 1.298 MFlops, which positions VAE-XGBoost as a more computationally lightweight alternative.
Furthermore, experimental results confirm that VAE-XGBoost significantly outperforms KNN and RF in detecting cyber intrusions within charging networks, especially when dealing with high-dimensional and imbalanced data.
| Approach | Model | Accuracy (%) | Precision (%) | Recall (%) | F1-score (%) |
|---|---|---|---|---|---|
| Buedi et al. (2024) | Decision tree | 74.929 | 75.082 | 74.929 | 75.003 |
| Buedi et al. (2024) | KNN | 78.542 | 77.391 | 78.542 | 77.732 |
| Buedi et al. (2024) | Adaboost | 58.326 | 56.894 | 58.325 | 56.953 |
| Buedi et al. (2024) | MLP | 58.006 | 55.270 | 58.006 | 55.610 |
| Buedi et al. (2024) | Naïve Bayes | 65.492 | 57.333 | 65.493 | 60.828 |
| Buedi et al. (2024) | Logistic regression | 62.091 | 53.928 | 62.092 | 57.301 |
| Buedi et al. (2024) | Random Forest | 78.873 | 77.831 | 78.873 | 78.223 |
| Buedi et al. (2024) | SVM | 76.924 | 81.920 | 76.924 | 71.409 |
| Benfarhat et al. (2025a) | LSTM | 89 | 91 | 89 | 89 |
| Benfarhat et al. (2025a) | RNN | 82 | 84 | 82 | 81 |
| Benfarhat et al. (2025a) | DNN | 86 | 87 | 86 | 86 |
| Benfarhat et al. (2025a) | TCN | 93 | 93 | 93 | 93 |
| Our approach | KNN | 82.094 | 79.660 | 82.094 | 79.273 |
| Our approach | Random forest | 82.172 | 77.873 | 82.172 | 78.304 |
| Our approach | XGBoost | 84.179 | 83.789 | 84.179 | 83.912 |
| Proposed | VAE-XGBoost | 88.745 | 88.730 | 88.745 | 88.668 |
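The accuracy gaps discussed in this subsection follow directly from Table 4. The short sketch below recomputes them in percentage points from a few rows of the table. The dictionary keys are descriptive labels chosen here for illustration.

```python
# Accuracy values (%) copied from Table 4.
accuracy = {
    "Buedi et al. (2024), Random Forest": 78.873,
    "Our approach, XGBoost": 84.179,
    "Benfarhat et al. (2025a), TCN": 93.0,
}
proposed = 88.745  # VAE-XGBoost

# Positive values mean VAE-XGBoost is ahead; negative values mean it trails.
gap_pp = {name: round(proposed - acc, 3) for name, acc in accuracy.items()}
print(gap_pp)
```

A positive gap indicates that VAE-XGBoost is more accurate than the listed model, while the negative gap for TCN reflects the accuracy that is traded for a lighter model.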
This study presents VAE-XGBoost as a robust and scalable intrusion detection framework for securing the EV charging infrastructure. By integrating VAE-based feature extraction with XGBoost classification, the model achieves higher detection accuracy than the evaluated baselines, enhanced adaptability to emerging threats, and a notable reduction in false positives. Future research will focus on real-time deployment, scalability, and federated learning-based intrusion detection solutions to further strengthen cybersecurity in next-generation smart grids.
Conclusion
This study introduces a hybrid IDS specifically designed to protect EV charging infrastructure against evolving cyber threats. As EV adoption expands, vulnerabilities in charging networks pose significant risks to service availability, user privacy, and grid stability. Traditional IDS models struggle with class imbalance, adaptability, and real-time threat detection, limiting their effectiveness in dynamic environments. To address these challenges, we propose the VAE-XGBoost model, in which the VAE enhances feature extraction by reducing data dimensionality and XGBoost improves classification accuracy, particularly for imbalanced attack scenarios. On the CICEVSE2024 dataset, the model outperforms baseline approaches such as K-Nearest Neighbors and Random Forest in accuracy, precision, recall, and F1-score. Its ability to detect cyber threats and adapt to emerging attack patterns makes it a strong candidate for securing next-generation EV charging infrastructure. However, challenges remain in distinguishing low-frequency and overlapping threats, as well as in optimizing computational efficiency for large-scale deployments. Despite these limitations, VAE-XGBoost improves cybersecurity in the next generation of EV charging networks, contributing to their secure and resilient growth.






