A combined model for short-term wind speed forecasting based on empirical mode decomposition, feature selection, support vector regression and cross-validated lasso

Tao Wang

doi:10.7717/peerj-cs.732

Hefei University of Technology, Hefei, China

Published: 2021-09-24
Accepted: 2021-09-09
Received: 2021-02-10

Academic Editor: Zhiwei Gao

Subject Areas: Data Mining and Machine Learning, Data Science
Keywords: Wind speed forecasting, Empirical mode decomposition, Feature selection, Support vector regression, Cross-validated lasso, Multi-step wind speed forecasting

Copyright: © 2021 Wang
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.

Cite this article: Wang T. 2021. A combined model for short-term wind speed forecasting based on empirical mode decomposition, feature selection, support vector regression and cross-validated lasso. PeerJ Computer Science 7:e732 https://doi.org/10.7717/peerj-cs.732

The author has chosen to make the review history of this article public.

Abstract

Background

The planning and control of wind power production rely heavily on short-term wind speed forecasting. Due to the non-linearity and non-stationarity of wind, it is difficult to carry out accurate modeling and prediction through traditional wind speed forecasting models.

Methods

In the paper, we combine empirical mode decomposition (EMD), feature selection (FS), support vector regression (SVR) and cross-validated lasso (LassoCV) to develop a new wind speed forecasting model, aiming to improve the prediction performance of wind speed. EMD is used to extract the intrinsic mode functions (IMFs) from the original wind speed time series to eliminate the non-stationarity in the time series. FS and SVR are combined to predict the high-frequency IMF obtained by EMD. LassoCV is used to complete the prediction of low-frequency IMF and trend.

Results

Data collected from two wind stations in Michigan, USA are adopted to test the proposed combined model. Experimental results show that in multi-step wind speed forecasting, compared with the classic individual and traditional EMD-based combined models, the proposed model has better prediction performance.

Conclusions

Through the proposed combined model, the wind speed forecast can be effectively improved.

Introduction

As a sustainable and renewable alternative to traditional fossil fuels, wind power has attracted widespread attention and developed rapidly in recent years (Hu et al., 2018). According to the statistical report of the Global Wind Energy Council, the world capacity is about 650.8 GW (Fu et al., 2020), of which 59.7 GW was installed in 2019 (Global Wind Energy Council, 2020). However, as the share of grid-connected wind power increases, the stability of the power system will be challenged (Liu et al., 2018a), because wind power output is closely tied to the non-stationarity of wind speed. Accurate wind speed forecasting supports wind power planning and control, and can even help reduce the impact of unexpected events on the stability of the power system (Liu et al., 2018b). But due to the non-linearity and non-stationarity of wind, it is difficult to establish a satisfactory wind speed forecasting model. To this end, researchers have made great efforts to improve forecasting performance from different aspects, including basic predictive models, preprocessing methods, and combined or hybrid strategies.

For basic predictive models, a variety of methods have been presented, mainly including physical models, statistical models, and machine learning. The physical model usually uses physical parameters such as temperature and pressure to predict wind speed (Heng et al., 2016). Numerical Weather Prediction (NWP) is one of the representative technologies. However, due to the weak correlation between physical parameters and short-term wind speed, this type of model can only be used for medium- and long-term wind speed forecasting, not for short-term wind speed forecasting. In short-term wind speed forecasting, the wind speed is generally predicted by analyzing the inherent patterns of historical wind speed data (Chen et al., 2018; Liu et al., 2018b).

The statistical model is a method widely used in short-term wind speed forecasting, which uses historical data to predict wind speed. Commonly used statistical models include autoregressive (AR) (Lydia et al., 2016a), autoregressive moving average (ARMA) (Torres et al., 2005) and autoregressive integrated moving average (ARIMA) (Wang & Hu, 2015) models. Kavasseri & Seetharaman (2009) proposed an f-ARIMA model for wind speed forecasting, and claimed that it significantly improves prediction accuracy compared with the persistence model. Ait Maatallah et al. (2015) developed a Hammerstein autoregressive model to predict wind speed, and verified that their model has a better root mean square error (RMSE) than ARIMA and ANN. Poggi et al. (2003) developed an AR-based model to predict wind speeds at three Mediterranean sites in Corsica, and showed that the synthetic time series can retain the statistical characteristics of wind speeds. Also, Lydia et al. (2016b) presented a short-term wind speed forecasting model by combining linear AR and non-linear AR. In general, statistical models assume that the data are linear, whereas wind speed series have non-linear characteristics that these methods cannot handle effectively.

To solve this problem, researchers have introduced machine learning to predict wind speed. Normally, machine learning is used either as the predictive model or for parameter optimization, and mainly includes evolutionary algorithms, the extreme learning machine (ELM), ANN and SVM. Wang (2017) presented a wind speed forecasting model by combining SVM and particle swarm optimization (PSO). Zhang et al. (2019) combined online sequential outlier robust ELM with hybrid mode decomposition (HMD) to predict wind speed. Wang, Li & Bai (2018) developed an error correction-based ELM model for short-term wind speed forecasting. Liu et al. (2020) introduced the Jaya-SVM (Jaya algorithm-based support vector machine) into wind speed forecasting. Krishnaveny et al. (Nair, Vanitha & Jisma, 2017) explored the performance of three different models, i.e., ANN, ARIMA and a hybrid model, in wind speed forecasting. Azeem et al. (2018) investigated KNN-based and ANN-based models for wind speed forecasting. Recently, deep learning, a new branch of machine learning, has received extensive attention and has been widely used for regression and classification problems. According to the literature, deep learning can abstract the hidden structure and inherent characteristics of data better than shallow methods. Khodayar & Wang (2019) introduced a scalable graph convolutional deep learning architecture (GCDLA) for wind speed forecasting. Wang et al. (2016a) investigated a deep belief network model for wind speed forecasting. Khodayar & Wang (2019) combined rough set theory and restricted Boltzmann machines to build a wind speed forecasting model. Hong & Satriani (2020) developed a day-ahead wind speed forecasting model based on a convolutional neural network. Although researchers claim that deep learning can achieve better performance, these methods are computationally intensive and prone to overfitting on small data sets.

In addition to these basic forecasting models, preprocessing methods such as feature selection (FS) have also been introduced in wind speed forecasting. This is because in short-term wind speed forecasting, lagged values of historical wind speed are usually used as features, which may lead to a certain degree of redundancy. FS is used to select the best input for the basic predictive model, so that the model can obtain better generalization performance (Li et al., 2018a). For example, Paramasivan & Lopez (2016) employed a ReliefF feature selection algorithm to identify key features, and then used a bagging neural network to predict the wind speed. Niu et al. (2018) presented a multi-step wind speed forecasting model using optimal FS, a modified bat algorithm and a cognition strategy. Botha & Walt (2017) combined FS with SVM to predict short-term wind speed. Kong et al. (2015) combined feature selection and reduced support vector machines (RSVM) for wind speed forecasting.

Due to the unstable nature of wind, combined or hybrid models built on signal processing technology have become the mainstream of wind speed forecasting. In these models, signal processing is usually employed to decompose the wind speed so as to reduce or eliminate the instability. Commonly used signal processing techniques include empirical mode decomposition (EMD), variational mode decomposition (VMD) and wavelet transform (WT). Wang et al. (2016b) decomposed wind speed into stable signals using ensemble empirical mode decomposition (EEMD). Sun & Wang (2018) developed a fast ensemble empirical mode decomposition model to improve the accuracy of wind speed forecasting. Tascikaraoglu et al. (2016) proposed a wind speed forecasting model based on WT. Hu & Wang (2015) adopted an empirical wavelet transform (EWT) to extract key information in wind speed time series. Yu, Li & Zhang (2017) explored the performance of EMD, EEMD and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) in wind speed forecasting.

In the field of wind speed forecasting, there are mainly three forecast scenarios: short-term forecasting, medium-term forecasting and long-term forecasting. Among them, short-term wind speed forecasting is essential for estimating power generation, and it is difficult to predict accurately due to the nonlinearity and instability of wind speed. Therefore, in this study, we develop a new model to forecast short-term wind speed. Its originality lies in combining EMD, FS, SVR and cross-validated Lasso (LassoCV) for multi-step wind speed forecasting. The framework of our study is as follows: (a) EMD is used to extract the intrinsic mode functions (IMFs) from the original wind speed time series; (b) FS and SVR are combined to predict the high-frequency IMF; (c) LassoCV is used to complete the prediction of the low-frequency IMFs and trend.

The main contributions of the research are as follows:

A novel model based on EMD, FS, SVR and LassoCV is proposed to improve the accuracy of multi-step wind speed forecasting, where EMD is used to extract IMFs from the original wind speed data to reduce the non-stationarity of wind speed.
Based on the principle of EMD, the first IMF component decomposed by EMD contains most of the high-frequency information, and an algorithm with good generalization performance is usually required for prediction. We combine FS and SVR to predict the high-frequency IMF (i.e., the first IMF) component.
Compared with the first IMF component, the other IMF components decomposed by EMD have much lower frequencies and present sine-like curves, for which linear regression usually performs well. We introduce LassoCV to complete the prediction of the low-frequency IMFs and trend.

The paper is organized as follows: The framework of the proposed model and the principles involved are introduced in ‘Methods’. ‘Results’ describes the experimental data used in the paper, and the comparison with the classic individual models. ‘Discussion’ discusses the effectiveness of EMD. ‘Conclusion’ concludes the study.

Methods

The whole process of the proposed model

The architecture of our proposed model is shown in Fig. 1. The whole process is as follows:

Use EMD to decompose wind speed into a series of IMFs. The EMD algorithm is introduced in ‘Empirical mode decomposition’.
Combine FS and SVR to predict the high-frequency IMF obtained by EMD. The FS and SVR algorithms are provided in ‘Feature selection’ and ‘Support vector regression’, respectively.
Use LassoCV to complete the prediction of the low-frequency IMFs and trend. The LassoCV algorithm is described in ‘Cross-validated lasso’.
Performance evaluation. The performance indicators are introduced in ‘Prediction performance criteria’, and the experimental results and analysis are given in ‘Results’ and ‘Discussion’.

Empirical mode decomposition

Due to the non-stationary and intermittent nature of wind speed, it is difficult to predict future wind speed directly. One possible solution is to decompose the chaotic wind data into components of different frequencies (Bokde et al., 2019) and use models to predict them separately. Based on this idea, the study introduces signal processing technology to decompose wind speed. Common signal decomposition algorithms include wavelet transform, morphology filters, EMD and many others. Wavelet transform is not adaptive and depends on the prior choice of a mother wavelet, which somewhat limits its ability to extract nonlinear and non-stationary components from the data. Similarly, morphology filters require the shape and length of the structural element to be selected; there is no uniform standard for this choice, which depends on human experience. EMD, by contrast, has received great attention from researchers because of its superior performance and ease of understanding. Therefore, in this study, we use EMD to preprocess the wind speed.

EMD is essentially a non-linear signal analysis method that can handle non-linear and non-stationary time series (Huang et al., 1998). EMD uses the time-scale characteristics of the data to decompose the signal, and does not need to set any basis functions in advance. In theory, EMD can be applied to any type of signal. Since EMD was proposed, it has been rapidly applied to many different engineering fields such as marine and atmospheric research, seismic record analysis and mechanical fault diagnosis (Gao & Liu, 2021).

The basic idea of EMD is to decompose a non-stationary time series signal into a series of IMFs along with a residue (Huang et al., 1998). An IMF should meet two principles: (1) the number of extrema and the number of zero crossings must be equal or differ by at most one; (2) the average value of the upper envelope and the lower envelope must be zero (Ziqiang & Puthusserypady, 2007). Let $s(t)$, $t = 1, 2, \dots, l$, be a time series. The EMD decomposition steps are as follows:

Step 1: Identify the local minima and maxima of the time series.

Step 2: Use cubic splines to interpolate the local minima and maxima to generate the lower envelope $s_{l}(t)$ and the upper envelope $s_{u}(t)$.

Step 3: Compute the average of the upper and lower envelopes: $m(t) = \frac{s_{u}(t) + s_{l}(t)}{2}$.

Step 4: Subtract the average envelope from the time series: $h(t) = s(t) - m(t)$.

Step 5: Check whether $h(t)$ meets the two principles of an IMF. If so, treat $h(t)$ as the new IMF $c(t)$ and calculate the residual signal $r(t) = s(t) - h(t)$, with $s(t)$ taken as the signal before this round of sifting began. Otherwise, replace $s(t)$ with $h(t)$, and then repeat steps 1 to 5.

Step 6: Set $r (t)$ as new $s (t)$ and repeat steps 1 to 5 until all IMFs are obtained.

Through the whole process, a set of IMFs ordered from high to low frequency can be extracted from the time series. Therefore, the original time series can be expressed as: $s(t) = \sum_{i = 1}^{n} c_{i}(t) + r_{n}(t)$ where $n$ is the number of IMFs, $c_{i}(t)$ is the $i$-th IMF, and $r_{n}(t)$ is the final residual representing the trend of $s(t)$. The IMFs are periodic and almost orthogonal to one another (Li et al., 2018b).
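The sifting procedure in steps 1 to 6 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the envelopes are piecewise linear rather than cubic splines (to keep the example dependency-free), and each IMF is obtained after a fixed number of sifting passes (`n_sift`, a hypothetical parameter) instead of testing the two IMF principles explicitly. The function names are illustrative only.

```python
import math

def local_extrema(s):
    """Indices of interior local maxima and minima (step 1)."""
    maxima = [i for i in range(1, len(s) - 1) if s[i - 1] < s[i] >= s[i + 1]]
    minima = [i for i in range(1, len(s) - 1) if s[i - 1] > s[i] <= s[i + 1]]
    return maxima, minima

def envelope(s, knots):
    """Piecewise-linear envelope through the extrema, held flat at both ends.
    ASSUMPTION: linear interpolation stands in for the cubic splines of step 2."""
    out = []
    for t in range(len(s)):
        if t <= knots[0]:
            out.append(s[knots[0]])
        elif t >= knots[-1]:
            out.append(s[knots[-1]])
        else:
            k = max(j for j in range(len(knots)) if knots[j] <= t)
            a, b = knots[k], knots[k + 1]
            w = (t - a) / (b - a)
            out.append((1 - w) * s[a] + w * s[b])
    return out

def sift_once(s, maxima, minima):
    """Steps 2-4: subtract the mean of the upper and lower envelopes."""
    upper = envelope(s, maxima)
    lower = envelope(s, minima)
    return [x - (u + l) / 2 for x, u, l in zip(s, upper, lower)]

def emd(s, max_imfs=8, n_sift=10):
    """Return (imfs, residue) such that s = sum(imfs) + residue."""
    imfs, residue = [], list(s)
    for _imf in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if len(maxima) < 2 or len(minima) < 2:
            break  # too few extrema left: treat the residue as the trend
        h = residue
        for _pass in range(n_sift):  # ASSUMPTION: fixed number of sifting passes
            maxima, minima = local_extrema(h)
            if len(maxima) < 2 or len(minima) < 2:
                break
            h = sift_once(h, maxima, minima)
        imfs.append(h)
        residue = [r - c for r, c in zip(residue, h)]  # step 5: r(t) = s(t) - c(t)
    return imfs, residue

# Synthetic signal: fast oscillation + slow oscillation + linear trend
signal = [math.sin(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 50) + 0.01 * t
          for t in range(200)]
imfs, residue = emd(signal)
```

Whatever the envelope method, the decomposition is additive by construction, so summing the returned IMFs and the residue recovers the input series.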

Feature selection

After obtaining the IMF components of wind speed, we need to predict them. In the study, we use the observed and lagged values of each IMF component as the raw features, forecast each IMF component separately, and add all the predicted IMF components to get the final wind speed. Although the raw features contain sufficient information for forecasting, some irrelevant or partially relevant features among them may have a negative impact on the model. To avoid this impact, a common strategy is to use feature selection to remove irrelevant features. Commonly used feature selection algorithms include the filter method, the wrapper method, heuristic search algorithms and the embedded method (Chandrashekar & Sahin, 2014). In this study, we use the filter method. In order to obtain scores for the different variables, we use the univariate linear regression test to calculate the correlation between features and output (Liu et al., 2019b), which is defined as: $\mathrm{Cor}_{i} = \frac{(X[:, i] - \mathrm{mean}(X[:, i])) * (y - \mathrm{mean}(y))}{\mathrm{std}(X[:, i]) * \mathrm{std}(y)}$ where $X$ is an $N \times M$ matrix in which each column is a feature, and $y$ is the $N \times 1$ vector of the output we are interested in. The product in the numerator is averaged over the $N$ samples, so $\mathrm{Cor}_{i}$ is the Pearson correlation between feature $i$ and the output. Based on the rank of the correlation, the irrelevant or partially relevant features are removed.
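A minimal sketch of this filter step is given below, assuming that the raw features are simply the previous `n_lags` values of one IMF component and that the `k` features with the largest absolute correlation are kept. Both `n_lags` and `k` are hypothetical parameters; the section does not state the number of lags or how many features are retained.

```python
import math

def make_lag_features(series, n_lags):
    """Row t of X holds the n_lags values preceding the target y[t]."""
    X = [series[t - n_lags:t] for t in range(n_lags, len(series))]
    y = [series[t] for t in range(n_lags, len(series))]
    return X, y

def pearson(u, v):
    """Pearson correlation: mean product of centred values over the two stds."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / n
    su = math.sqrt(sum((a - mu) ** 2 for a in u) / n)
    sv = math.sqrt(sum((b - mv) ** 2 for b in v) / n)
    return cov / (su * sv)

def select_top_k(X, y, k):
    """Rank features by |Cor_i| and keep the k best (ASSUMPTION: absolute value, fixed k)."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores

# Toy stand-in for one IMF component
component = [math.sin(0.5 * t) for t in range(60)]
X, y = make_lag_features(component, n_lags=6)
selected, scores = select_top_k(X, y, k=3)
X_selected = [[row[j] for j in selected] for row in X]
```

The reduced matrix `X_selected` would then be passed to the regressor in place of the full lag matrix.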

Support vector regression

The support vector machine (SVM) is a learning method based on the structural risk minimization criterion, which can minimize the expected risk and obtain better generalization performance on unknown data. Support vector regression (SVR) is an extension of SVM to regression problems (Drucker et al., 1997). Due to the nonlinear and non-stationary nature of wind speed, SVR is widely used in short-term wind speed forecasting (Khosravi et al., 2018; Liu et al., 2019a; Santamaría-Bonfil, Reyes-Ballesteros & Gershenson, 2016). In this research, we use EMD to decompose wind speed into IMF components, and the high-frequency IMF component contains the nonlinear and non-stationary part of the wind speed. In order to obtain better generalization performance, we follow existing research and use SVR to predict it.

The main idea of SVR is to implement linear regression in the high-dimensional feature space obtained by mapping the original input through a predefined function $\phi(x)$, and to minimize the structural risk (Chen et al., 2018). Given a set of samples $\{x_{i}, y_{i}\}$, $i = 1, 2, \dots, N$, where $x_{i}$ is the input and $y_{i}$ is the output, the objective is: $\begin{matrix} f(x) = W^{T} \phi(x) + b \\ R[f] = \frac{1}{2} \|W\|^{2} + C \sum_{i = 1}^{N} L(x_{i}, y_{i}, f(x_{i})) \end{matrix}$ where $W$ and $b$ are the regression coefficient and bias, respectively, $C$ is the penalty coefficient, $L(x_{i}, y_{i}, f(x_{i}))$ represents the loss function, and $R[f]$ is the structural risk. The corresponding constrained optimization problem can be expressed as: $\begin{matrix} \min \frac{1}{2} \|W\|^{2} + C \sum_{i = 1}^{N} (\xi_{i} + \xi_{i}^{*}) \\ \text{s.t.} \quad y_{i} - W^{T} \phi(x_{i}) - b \leq \varepsilon + \xi_{i} \\ W^{T} \phi(x_{i}) + b - y_{i} \leq \varepsilon + \xi_{i}^{*} \\ \xi_{i}, \xi_{i}^{*} \geq 0, \quad i = 1, 2, \dots, N \end{matrix}$ where $\xi_{i}$ and $\xi_{i}^{*}$ are the slack variables. By introducing Lagrange multipliers, the regression function can be expressed as: $f(x) = \sum_{i = 1}^{N} (\alpha_{i} - \alpha_{i}^{*}) K(x_{i}, x) + b$ where $\alpha_{i}$ and $\alpha_{i}^{*}$ are the Lagrange multipliers that satisfy the conditions $\alpha_{i} \geq 0$, $\alpha_{i}^{*} \geq 0$ and $\sum_{i = 1}^{N} (\alpha_{i} - \alpha_{i}^{*}) = 0$, and $K(x_{i}, x)$ is a kernel function conforming to Mercer’s theorem.
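To make the last expression concrete, the following sketch evaluates the dual-form regression function and the loss implied by the constraints. It assumes the ε-insensitive loss (which the slack-variable constraints above correspond to) and a Gaussian RBF kernel; the section only requires a Mercer kernel, so the RBF choice, the `gamma` parameter and all numeric values are illustrative assumptions rather than fitted quantities.

```python
import math

def eps_insensitive_loss(y, f_x, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y - f_x) - eps)

def rbf_kernel(u, v, gamma):
    """K(u, v) = exp(-gamma * ||u - v||^2). ASSUMPTION: RBF kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """f(x) = sum_i (alpha_i - alpha_i*) K(x_i, x) + b,
    where dual_coefs[i] stores the difference alpha_i - alpha_i*."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + b

# Hypothetical values; the coefficients sum to zero as the dual constraint requires
support_vectors = [[0.0], [1.0]]
dual_coefs = [0.5, -0.5]
prediction = svr_predict([0.0], support_vectors, dual_coefs, b=1.0, gamma=1.0)
```

Only samples with a non-zero coefficient difference contribute to the sum, which is why the fitted model can be stored as a set of support vectors.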

Cross-validated lasso

The Lasso algorithm is a regression model that can perform feature selection and regularization at the same time. It was originally proposed by Robert Tibshirani of Stanford University to improve prediction accuracy and interpretability (Tibshirani, 1996). Normally, in regression, we want to find a coefficient vector $\beta = (\beta_{1}, \dots, \beta_{p})$ that satisfies the following: $Y = X \beta + \varepsilon, \quad E[\varepsilon \mid X] = 0$ where $Y$ is the dependent variable, $X = (X_{1}, \dots, X_{p})$ is the covariate, and $\varepsilon$ is the unobserved noise. Lasso tries to minimize the objective function while forcing the sum of the absolute values of the coefficients to be less than a fixed value $t$ (Hung, Yen & Li, 2016): $\min_{\beta_{0}, \beta} \left\{ \frac{1}{N} \sum_{i = 1}^{N} (y_{i} - \beta_{0} - x_{i}^{T} \beta)^{2} \right\}$ $\text{s.t.} \sum_{j = 1}^{p} |\beta_{j}| \leq t.$

Rewritten in the Lagrangian form: $\hat{\beta}_{\mathrm{lasso}} = \underset{\beta \in R^{p}}{\mathrm{argmin}} \left\{ \frac{1}{N} \|y - X \beta\|_{2}^{2} + \lambda \|\beta\|_{1} \right\}$

The L₁-norm is used instead of the L₂-norm in Lasso. Since the constraint region is diamond-shaped, the solution is more likely to lie at a corner of the region. As a result, the solution of the lasso is sparse, with some coefficients set exactly to zero; that is, Lasso performs a straightforward feature selection.

To estimate $\hat{\beta}_{\mathrm{lasso}}$, the value of the penalty parameter $\lambda$ is critically important. However, the optimal $\lambda$ is not given automatically. If $\lambda$ is chosen appropriately, Lasso achieves fast convergence under fairly general conditions; if it is chosen inappropriately, Lasso may be inconsistent or converge more slowly. In the paper, we adopt the cross-validated Lasso algorithm, in which the penalty parameter $\lambda$ is chosen by cross-validation, which is also the leading recommendation in the theoretical literature (Park & Casella, 2008).
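The selection of $\lambda$ by cross-validation can be sketched as follows. This is an illustrative, dependency-free version under several assumptions that the section does not specify: the Lagrangian objective is minimized by cyclic coordinate descent, the folds are contiguous blocks, and the candidate grid `lambdas` and fold count `n_folds` are hypothetical. It is not the routine used in the study.

```python
def soft_threshold(rho, thr):
    """Shrink rho towards zero by thr; values inside [-thr, thr] become exactly 0."""
    if rho > thr:
        return rho - thr
    if rho < -thr:
        return rho + thr
    return 0.0

def lasso_fit(X, y, lam, n_iter=200):
    """Minimise (1/N)||y - b0 - X beta||^2 + lam * ||beta||_1 by coordinate descent."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred features
    resid = [yi - y_mean for yi in y]                            # residual at beta = 0
    z = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue  # constant column carries no information
            # covariance of feature j with the residual that excludes feature j
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n + z[j] * beta[j]
            new_bj = soft_threshold(rho, lam / 2.0) / z[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new_bj
    b0 = y_mean - sum(m * b for m, b in zip(x_mean, beta))  # unpenalised intercept
    return b0, beta

def mse(X, y, b0, beta):
    return sum((yi - b0 - sum(a * b for a, b in zip(row, beta))) ** 2
               for row, yi in zip(X, y)) / len(y)

def lasso_cv(X, y, lambdas, n_folds=3):
    """Pick the lambda with the lowest summed validation MSE, then refit on all data."""
    n = len(X)
    fold = [i * n_folds // n for i in range(n)]  # ASSUMPTION: contiguous blocks
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        err = 0.0
        for f in range(n_folds):
            tr = [i for i in range(n) if fold[i] != f]
            va = [i for i in range(n) if fold[i] == f]
            b0, beta = lasso_fit([X[i] for i in tr], [y[i] for i in tr], lam)
            err += mse([X[i] for i in va], [y[i] for i in va], b0, beta)
        if err < best_err:
            best_lam, best_err = lam, err
    b0, beta = lasso_fit(X, y, best_lam)
    return best_lam, b0, beta

# Toy data: y depends on the first feature only
X = [[i / 10.0, ((7 * i) % 5) / 5.0] for i in range(30)]
y = [1.0 + 2.0 * row[0] for row in X]
best_lam, b0, beta = lasso_cv(X, y, lambdas=[0.001, 0.01, 0.1, 1.0])
```

On this noise-free toy data the irrelevant second feature receives a zero coefficient, which illustrates the sparsity discussed above; on real wind speed components the chosen $\lambda$ would depend on the noise level.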

Prediction performance criteria

In the study, the mean absolute percentage error (MAPE), mean absolute error (MAE) and RMSE are used as performance indicators to evaluate the proposed wind forecasting model, which are defined as follows:

$\mathrm{MAPE} = \frac{1}{N} \sum_{i = 1}^{N} \left| \frac{Y_{i} - \hat{Y}_{i}}{Y_{i}} \right|$ $\mathrm{MAE} = \frac{1}{N} \sum_{i = 1}^{N} |Y_{i} - \hat{Y}_{i}|$ $\mathrm{RMSE} = \sqrt{\frac{1}{N - 1} \sum_{i = 1}^{N} (Y_{i} - \hat{Y}_{i})^{2}}$ where $Y_{i}$ and $\hat{Y}_{i}$ refer to the observed and predicted wind speed of data point $i$, respectively. For MAPE, MAE and RMSE, the smaller the value, the better the performance.
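The three indicators translate directly into code. The sketch below follows the definitions as written, including the $N - 1$ divisor in RMSE; the sample values are made up for illustration. Note that MAPE is undefined when an observed value is zero, and this section does not say how such points are treated, so the sketch simply lets the division fail in that case.

```python
import math

def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (multiply by 100 for %).
    Raises ZeroDivisionError if any observed value is 0."""
    return sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error with the N - 1 divisor used in the definition above."""
    n = len(y_true)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / (n - 1))

# Hypothetical observed and predicted wind speeds (m/s)
observed = [2.0, 4.0, 5.0]
predicted = [1.0, 5.0, 5.0]
errors = (mape(observed, predicted), mae(observed, predicted), rmse(observed, predicted))
```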

Results

Wind speed data

The wind speed data used in the study were gathered from two wind stations in Michigan, USA from September 2019 to October 2019. The number of data points is 1,464. The initial 50 days, from September 1, 2019 to October 20, 2019, are used for model training, and the remaining days, i.e., from October 21, 2019 to October 31, 2019, are used for testing. Figure 2 shows these two wind speed time series, and the corresponding statistics are listed in Table 1.

Figure 2: Wind speed collected from wind stations #1 and #2.

DOI: 10.7717/peerjcs.732/fig-2

Table 1:

Wind speed statistics at wind stations #1 and #2.

Wind station	Dataset	Date	Statistical indicators
			Mean (m/s)	Max (m/s)	Min (m/s)	Std.	Skew.	Kurt.
Site #1	Training set	Sept. 1, 2019 ∼ Oct. 20, 2019 (∼83%)	3.2975	14.4	0	2.378	0.871	0.865
Site #1	Testing set	Oct. 21, 2019 ∼ Oct. 31, 2019 (∼17%)	3.1614	13.9	0	2.486	1.108	1.312
Site #2	Training set	Sept. 1, 2019 ∼ Oct. 20, 2019 (∼83%)	3.6919	11.3	0	2.183	0.807	0.353
Site #2	Testing set	Oct. 21, 2019 ∼ Oct. 31, 2019 (∼17%)	3.5667	9.3	0	2.118	0.500	−0.318

DOI: 10.7717/peerjcs.732/table-1

Experiments and result analysis

To verify the effectiveness of the proposed model, we compare it with five classic individual models: Persistence, ARIMA, ELM, SVR and ANN. The 1- to 3-step forecasting results of these models for time series #1 and #2 are displayed in Figs. 3–4, and the corresponding error estimates are listed in Tables 2–5. It is worth noting that, for a fair comparison, the parameters of the models involved are selected by cross-validation. Based on the experimental results, we can draw the following conclusions:

Figure 3: The prediction of the classic individual models at wind station #1.

DOI: 10.7717/peerjcs.732/fig-3

Figure 4: The prediction of the classic individual models at wind station #2.

DOI: 10.7717/peerjcs.732/fig-4

Table 2:

The error result of the classic individual models at wind station #1.

Models	1-step			2-step			3-step
	RMSE	MAE	MAPE (%)	RMSE	MAE	MAPE (%)	RMSE	MAE	MAPE (%)
Persistence	1.1892	0.8996	36.20	1.5892	1.2221	49.65	1.9008	1.4687	57.64
ARIMA	1.1724	0.9010	34.25	1.5182	1.1561	45.25	1.7647	1.3569	53.79
ELM	1.2705	0.9724	36.20	1.5500	1.1729	46.55	1.8109	1.3603	55.18
SVR	1.1739	0.9024	34.87	1.5676	1.1928	46.71	1.7832	1.3376	52.78
ANN	1.1984	0.9354	36.24	1.5338	1.1615	45.79	1.8427	1.3906	55.70
The proposed	0.5859	0.4426	21.11	0.7531	0.5848	24.78	0.8528	0.6798	27.55

DOI: 10.7717/peerjcs.732/table-2

Table 3:

The error result of the classic individual models at wind station #2.

Models	1-step			2-step			3-step
	RMSE	MAE	MAPE (%)	RMSE	MAE	MAPE (%)	RMSE	MAE	MAPE (%)
Persistence	1.2720	0.9739	35.98	1.4292	1.0947	41.02	1.6700	1.3073	47.99
ARIMA	1.1609	0.9302	38.71	1.3214	1.0430	45.43	1.5257	1.2188	53.05
ELM	1.2528	1.0188	44.81	1.3657	1.0915	51.40	1.5867	1.2849	60.12
SVR	1.1602	0.9218	36.51	1.3115	1.0360	43.63	1.5018	1.2008	49.91
ANN	1.1901	0.9460	40.62	1.3116	1.0330	43.72	1.6345	1.2798	52.18
The proposed	0.5593	0.4193	17.10	0.7540	0.5966	22.99	0.7911	0.6437	24.59

DOI: 10.7717/peerjcs.732/table-3

Table 4:

The improvement rate of the proposed model relative to the classic individual models at wind station #1.

Models		1-step	2-step	3-step
Persistence	P_RMSE (%)	102.98	111.04	122.89
	P_MAE (%)	103.24	108.97	116.05
	P_MAPE (%)	71.47	100.34	109.26
ARIMA	P_RMSE (%)	100.11	101.60	106.94
	P_MAE (%)	103.55	97.69	99.60
	P_MAPE (%)	62.20	82.58	95.26
ELM	P_RMSE (%)	116.85	105.83	112.35
	P_MAE (%)	119.68	100.57	100.11
	P_MAPE (%)	71.44	87.82	100.31
SVR	P_RMSE (%)	100.36	108.16	109.11
	P_MAE (%)	103.87	103.96	96.76
	P_MAPE (%)	65.15	88.48	91.62
ANN	P_RMSE (%)	104.54	103.68	116.09
	P_MAE (%)	111.33	98.62	104.56
	P_MAPE (%)	71.63	84.78	102.23

DOI: 10.7717/peerjcs.732/table-4

Table 5:

The improvement rate of the proposed model relative to the classic individual models at wind station #2.

Models		1-step	2-step	3-step
Persistence	P_RMSE (%)	127.42	89.54	111.11
	P_MAE (%)	132.24	83.49	103.08
	P_MAPE (%)	110.33	78.38	95.19
ARIMA	P_RMSE (%)	107.55	75.25	92.86
	P_MAE (%)	121.83	74.83	89.35
	P_MAPE (%)	126.31	97.56	115.77
ELM	P_RMSE (%)	123.99	81.12	100.59
	P_MAE (%)	142.95	82.96	99.60
	P_MAPE (%)	161.98	123.54	144.54
SVR	P_RMSE (%)	107.43	73.93	89.84
	P_MAE (%)	119.81	73.66	86.54
	P_MAPE (%)	113.43	89.76	103.00
ANN	P_RMSE (%)	112.78	73.95	106.62
	P_MAE (%)	125.58	73.16	98.82
	P_MAPE (%)	137.50	90.12	112.23

DOI: 10.7717/peerjcs.732/table-5

In the 1-step forecasting, for wind station #1, the proposed model obtains the best accuracy: RMSE, MAE, and MAPE are 0.5859, 0.4426, and 21.11%, respectively. Ranked from highest to lowest RMSE, the classic individual models are ELM, ANN, Persistence, SVR, and ARIMA, with MAPE values of 36.20%, 36.24%, 36.20%, 34.87%, and 34.25%, respectively. Likewise, at wind station #2, the proposed model still obtains the best performance compared with the classic individual models, with a MAPE of 17.10%.
In the 2-step forecasting, at wind station #1, the proposed model has the lowest error values, i.e., the RMSE, MAE, and MAPE are 0.7531, 0.5848, and 24.78%, respectively. In addition, at wind station #2, the proposed model still achieves the lowest error values. Taking MAPE as an example, its value is 22.99%, which is significantly lower than that of the other models.
In the 3-step forecasting, the proposed model is still the model with the highest prediction accuracy, and the MAPE values at wind stations #1 and #2 are 27.55% and 24.59%, respectively. Persistence has the worst RMSE among these models, with MAPE values of 57.64% and 47.99%, respectively.

In general, under 1- to 3-step forecasting, the proposed model can obtain the best prediction performance compared with the classic individual models.
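The improvement rates reported in Tables 4, 5, 8 and 9 are not defined in this section. The tabulated values are consistent, up to rounding of the inputs, with the error reduction expressed relative to the proposed model's error, which is why rates above 100% appear. The sketch below reproduces two entries under that assumption; the formula is inferred from the tables rather than quoted from the text.

```python
def improvement_rate(baseline_error, proposed_error):
    """ASSUMED definition: error reduction relative to the proposed model's error, in %."""
    return 100.0 * (baseline_error - proposed_error) / proposed_error

# Station #1, 1-step RMSE values taken from Tables 2 and 6
vs_persistence = improvement_rate(1.1892, 0.5859)  # Table 4 lists 102.98
vs_emd_elm = improvement_rate(0.6400, 0.5859)      # Table 8 lists 9.23
```

Under this definition, a rate of about 100% means that the baseline error is roughly twice the error of the proposed model.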

Compared with traditional EMD methods

As a nonlinear signal analysis method for processing nonlinear and non-stationary time series, EMD has been widely used in time series analysis. To further verify the effectiveness of our EMD model, we compare it with four widely used EMD models, namely EMD-ELM, EMD-SVR, EMD-SVR-SP, and EMD-ANN. It is worth noting that in this study, these methods follow the same procedure as our proposed model: EMD is used to decompose the wind speed, a single predictor is used to forecast each IMF component separately, and all the prediction results are added to get the final predicted wind speed. The prediction results and the error estimates of these four EMD-based methods and the proposed method are displayed in Figs. 5–6 and Tables 6–9. Based on Figs. 5–6 and Tables 6–9, it can be observed that:

Figure 5: The prediction of different combination models at wind station #1.

DOI: 10.7717/peerjcs.732/fig-5

Figure 6: The prediction of different combination models at wind station #2.

DOI: 10.7717/peerjcs.732/fig-6

Table 6:

The error result of different combination models at wind station #1.

Models	1-step			2-step			3-step
	RMSE	MAE	MAPE (%)	RMSE	MAE	MAPE (%)	RMSE	MAE	MAPE (%)
EMD-ELM	0.6400	0.5128	22.63	0.7854	0.6316	27.22	0.8746	0.6937	29.02
EMD-SVR	0.6379	0.5120	23.32	0.7768	0.6181	27.09	0.8583	0.6749	28.48
EMD-SVR-SP	0.6310	0.4867	23.03	0.7987	0.6141	26.30	0.8591	0.6762	28.66
EMD-ANN	0.6342	0.5055	23.55	0.7879	0.6221	27.67	0.8987	0.7040	29.31
The proposed	0.5859	0.4426	21.11	0.7531	0.5848	24.78	0.8528	0.6798	27.55

DOI: 10.7717/peerjcs.732/table-6

Table 7:

The error result of different combination models at wind station #2.

Models	1-step			2-step			3-step
	RMSE	MAE	MAPE (%)	RMSE	MAE	MAPE (%)	RMSE	MAE	MAPE (%)
EMD-ELM	0.6560	0.5283	21.59	0.8199	0.6669	27.49	0.8775	0.7096	27.65
EMD-SVR	0.6567	0.5233	24.88	0.8317	0.6736	29.85	0.8508	0.6986	30.52
EMD-SVR-SP	0.6437	0.4972	24.06	0.8211	0.6718	28.53	0.8894	0.7264	32.31
EMD-ANN	0.6397	0.5046	21.83	0.7927	0.6373	25.34	0.8520	0.6934	27.86
The proposed	0.5593	0.4193	17.10	0.7540	0.5966	22.99	0.7911	0.6437	24.59

DOI: 10.7717/peerjcs.732/table-7

Table 8:

The improvement rate of the proposed model relative to other combined models at wind station #1.

Models		1-step	2-step	3-step
EMD-ELM	P_RMSE (%)	9.23	4.30	2.56
	P_MAE (%)	15.85	8.00	2.05
	P_MAPE (%)	7.20	9.82	5.36
EMD-SVR	P_RMSE (%)	8.88	3.16	0.65
	P_MAE (%)	15.67	5.69	−0.72
	P_MAPE (%)	10.43	9.31	3.41
EMD-SVR-SP	P_RMSE (%)	7.70	6.06	0.74
	P_MAE (%)	9.95	5.01	−0.53
	P_MAPE (%)	9.09	6.10	4.05
EMD-ANN	P_RMSE (%)	8.25	4.63	5.39
	P_MAE (%)	14.21	6.38	3.56
	P_MAPE (%)	11.52	11.67	6.40

DOI: 10.7717/peerjcs.732/table-8

Table 9:

The improvement rate of the proposed model relative to other combined models at wind station #2.

Models		1-step	2-step	3-step
EMD-ELM	P_RMSE (%)	17.29	8.74	10.93
	P_MAE (%)	25.98	11.78	10.23
	P_MAPE (%)	26.20	19.55	12.46
EMD-SVR	P_RMSE (%)	17.41	10.30	7.56
	P_MAE (%)	24.80	12.91	8.52
	P_MAPE (%)	45.48	29.81	24.15
EMD-SVR-SP	P_RMSE (%)	15.09	8.90	12.43
	P_MAE (%)	18.58	12.61	12.84
	P_MAPE (%)	40.64	24.08	31.42
EMD-ANN	P_RMSE (%)	14.37	5.12	7.71
	P_MAE (%)	20.33	6.83	7.72
	P_MAPE (%)	27.62	10.22	13.29

DOI: 10.7717/peerjcs.732/table-9

Compared with the above-mentioned classic individual models, the performance of the EMD-based methods is significantly improved. Taking wind station #1 as an example, in the 1-step forecasting, the RMSE of the EMD-based methods is around 0.60, while that of the classic individual models is around 1.20. After the wind speed is decomposed by EMD, the RMSE is roughly halved.
For wind station #1, except for the MAE in the 3-step forecasting, the performance indicators obtained from the proposed model are significantly better than those of the other EMD-based combined models. For the 3-step forecasting, the MAE of EMD-SVR and EMD-SVR-SP is slightly better than that of the proposed combined model, but on the other evaluation indicators, the proposed combined model achieves a significantly better performance. Furthermore, EMD-ANN is always worse in MAPE than the other three combined models, with MAPE values of 23.55%, 27.67%, and 29.31% for 1- to 3-step forecasting.
For wind station #2, in 1- to 3-step wind speed forecasting, the proposed combined model obtains the best prediction results. The RMSE, MAE and MAPE in the 1-step forecasting are 0.5593, 0.4193, and 17.10%, respectively. In comparison, among the other four EMD-based combined models, the EMD-ELM and EMD-ANN models have similar prediction performance in 1- to 3-step forecasting, with MAPE values of 21.59%, 27.49%, 27.65% and 21.83%, 25.34%, 27.86%, respectively.

In total, the EMD-based method has obvious advantages over traditional methods, and the proposed method that using EMD, FS, SVR and LassoCV can achieve better performance.
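
The three indicators compared in this section can be computed in a few lines of Python. The sketch below assumes the standard definitions of RMSE, MAE and MAPE (with MAPE expressed in percent); the function names and the toy values are illustrative and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Toy wind speeds in m/s (hypothetical values, for illustration only).
observed = [4.0, 5.0, 8.0]
forecast = [5.0, 5.0, 6.0]

print(rmse(observed, forecast), mae(observed, forecast), mape(observed, forecast))
```

Because MAPE divides by the observed value, it is undefined when an observed wind speed is zero and it grows quickly for very low wind speeds, which is one reason the three indicators do not always rank the models in the same order.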

Discussion

Performance of SVR-SP and LassoCV on different IMFs

According to the EMD principle, the IMF components are ordered from high to low frequency. The non-linear and non-stationary information of the wind speed data is mainly concentrated in the high-frequency IMF, while the low-frequency IMFs resemble sine-like curves. Based on these characteristics, in this study we use SVR-SP and LassoCV to predict IMFs of different frequencies. To verify the effectiveness of this hybrid EMD model, in this section we take wind station #2 as an example and analyze the performance of the two methods on different IMF components. Table 10 lists the RMSE of SVR-SP and LassoCV on the different IMF components. It is worth mentioning that in multi-step prediction the accuracy of the first step is more important than that of the other steps, because it is of great significance for the accurate estimation of wind power. It can be seen from Table 10 that, in the 1-step forecasting, SVR-SP performs clearly better than LassoCV on the high-frequency component (IMF1, RMSE of 0.530 vs. 0.594), while LassoCV performs better on the low-frequency components (IMF2∼IMF7 and the trend), and its RMSE is already close to zero at IMF4. Moreover, SVR-SP risks overfitting when predicting the low-frequency components, resulting in poor performance. Overall, the proposed model, which matches the characteristics of the EMD components with the strengths of each algorithm, can achieve better performance than the traditional EMD model.

Table 10:

The RMSE of SVR-SP and LassoCV on different IMF components at wind station #2.

Steps	Models	IMF1	IMF2	IMF3	IMF4	IMF5	IMF6	IMF7	Trend
1-step	SVR-SP	0.530	0.256	0.061	0.047	0.042	0.040	0.326	0.100
1-step	LassoCV	0.594	0.178	0.033	0.002	0.001	0.000	0.000	0.000
2-step	SVR-SP	0.670	0.407	0.198	0.067	0.041	0.042	0.327	0.100
2-step	LassoCV	0.662	0.369	0.121	0.009	0.001	0.000	0.001	0.000
3-step	SVR-SP	0.668	0.422	0.354	0.086	0.046	0.045	0.327	0.100
3-step	LassoCV	0.663	0.401	0.262	0.023	0.002	0.001	0.001	0.000

DOI: 10.7717/peerj-cs.732/table-10
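
The division of labour between the two learners can be sketched with scikit-learn. The example below is only an illustration under stated assumptions: the components are synthetic stand-ins for IMFs rather than the output of a real EMD, lagged values are used as inputs, the feature-selection step is omitted, and the hyperparameters are library defaults rather than the settings used in the study. The helper names `make_lagged` and `hybrid_forecast` are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.svm import SVR


def make_lagged(series, n_lags):
    """Build a lag matrix: each row holds n_lags past values, the target is the next value."""
    n_rows = len(series) - n_lags
    X = np.column_stack([series[i:i + n_rows] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y


def hybrid_forecast(components, n_lags=8, n_test=50):
    """Forecast every component one step ahead and sum the forecasts.

    components[0] is treated as the high-frequency IMF and fitted with SVR;
    all remaining components are fitted with LassoCV.
    """
    total = np.zeros(n_test)
    for idx, comp in enumerate(components):
        X, y = make_lagged(comp, n_lags)
        X_train, y_train, X_test = X[:-n_test], y[:-n_test], X[-n_test:]
        model = SVR(kernel="rbf") if idx == 0 else LassoCV(cv=5, max_iter=10000)
        model.fit(X_train, y_train)
        total += model.predict(X_test)
    return total


# Synthetic stand-ins for IMFs (assumption): a noisy high-frequency part,
# two slow sine-like parts and a linear trend.
rng = np.random.default_rng(0)
t = np.arange(400)
components = [
    0.3 * rng.standard_normal(t.size),
    np.sin(2 * np.pi * t / 40),
    0.5 * np.sin(2 * np.pi * t / 150),
    5.0 + 0.005 * t,
]
signal = np.sum(components, axis=0)

n_test = 50
forecast = hybrid_forecast(components, n_lags=8, n_test=n_test)
rmse = float(np.sqrt(np.mean((signal[-n_test:] - forecast) ** 2)))
print(f"1-step RMSE of the recombined forecast: {rmse:.3f}")
```

The forecasts are summed because the decomposed components add up to the original series, so the recombined forecast is compared against the last `n_test` values of the undecomposed signal.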

Comparison of different signal decomposition techniques

Besides EMD, Variational Mode Decomposition (VMD) and Ensemble Empirical Mode Decomposition (EEMD) are also widely used in short-term wind speed forecasting. Here, we analyze the impact of different signal decomposition techniques on the performance of our proposed method. Table 11 shows the prediction performance of the three signal decomposition techniques at the two wind stations. For wind station #1, EMD obtains the best RMSE in the 1-step forecasting compared with VMD and EEMD. The performance of VMD in the 1-step and 2-step forecasting is relatively close, but it drops significantly in the 3-step forecasting. EEMD is derived from EMD and, like EMD, its performance decreases significantly as the forecasting step increases. For wind station #2, EMD also obtains the best performance in the 1-step forecasting. VMD behaves as it does at wind station #1, with relatively close performance in the 1-step and 2-step forecasting. It should be pointed out that in multi-step forecasting the 1-step forecast is usually used for wind energy estimation, while the other steps are used to assist decision-making, so more attention is paid to the performance of the 1-step forecasting.

Table 11:

The RMSE of VMD, EEMD and EMD at wind stations #1 and #2.

Wind station	Signal decomposition method	RMSE (1-step)	RMSE (2-step)	RMSE (3-step)
Site #1	VMD	0.6395	0.6782	0.7793
	EEMD	0.6358	0.7301	0.8277
	EMD (The proposed)	0.5859	0.7531	0.8528
Site #2	VMD	0.6664	0.6654	0.7111
	EEMD	0.5844	0.8404	0.8758
	EMD (The proposed)	0.5593	0.7540	0.7911

DOI: 10.7717/peerj-cs.732/table-11
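
The comparison in Table 11 can be made more concrete with a small calculation. The sketch below copies the RMSE values from Table 11 and reports, for each site, the method with the lowest RMSE at each step and the relative RMSE increase from 1-step to 3-step forecasting. The dictionary layout and the function names are illustrative choices, not part of the study.

```python
# RMSE values copied from Table 11, ordered as (1-step, 2-step, 3-step).
TABLE_11 = {
    ("Site #1", "VMD"): (0.6395, 0.6782, 0.7793),
    ("Site #1", "EEMD"): (0.6358, 0.7301, 0.8277),
    ("Site #1", "EMD"): (0.5859, 0.7531, 0.8528),
    ("Site #2", "VMD"): (0.6664, 0.6654, 0.7111),
    ("Site #2", "EEMD"): (0.5844, 0.8404, 0.8758),
    ("Site #2", "EMD"): (0.5593, 0.7540, 0.7911),
}


def best_method(site, step):
    """Return the decomposition method with the lowest RMSE at a site for step 1, 2 or 3."""
    candidates = {m: v[step - 1] for (s, m), v in TABLE_11.items() if s == site}
    return min(candidates, key=candidates.get)


def degradation(site, method):
    """Relative RMSE increase (%) from 1-step to 3-step forecasting."""
    one_step, _, three_step = TABLE_11[(site, method)]
    return 100.0 * (three_step - one_step) / one_step


for site in ("Site #1", "Site #2"):
    winners = [best_method(site, step) for step in (1, 2, 3)]
    print(site, "best per step:", winners)
    for method in ("VMD", "EEMD", "EMD"):
        print(f"  {method}: +{degradation(site, method):.1f}% from 1-step to 3-step")
```

Read this way, the table shows EMD leading at the first step at both sites, while VMD degrades least as the horizon grows, which matches the emphasis on 1-step accuracy in the text.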

The impact of the number of selected features on performance

Feature selection is used in this study to remove redundant features. However, the number of selected features affects the short-term wind speed forecasting to some extent. To ensure stability in a complicated industrial system, we analyzed the performance of our proposed method under different numbers of selected features. Figure 7 shows how the RMSE of the proposed method varies with the number of selected features. It should be pointed out that, based on the characteristics of the EMD decomposition, we use FS and SVR to predict the high-frequency component (i.e., IMF₁) and LassoCV to predict the low-frequency components, so feature selection is mainly used in the prediction of the IMF₁ component. From Fig. 7, it can be seen that feature selection can slightly improve the performance of 1-step forecasting, but has little effect on 1-step and 2-step forecasting. Overall, as the number of selected features decreases, the generalization performance of the method improves, but when too few features are selected, the performance drops sharply because useful features are deleted. To determine an appropriate number of features, following Bradley, Mangasarian & Street (1998) and Chizi, Rokach & Maimon (2009), this study selects it by cross-validation.

Figure 7: The RMSE of the proposed method for different numbers of selected features.

DOI: 10.7717/peerj-cs.732/fig-7
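
Choosing the number of features by cross-validation can be sketched as a grid search over a selection-plus-regression pipeline. This is a minimal illustration, not the procedure used in the study: the univariate F-test ranking (`SelectKBest` with `f_regression`) is a stand-in for the feature-selection criterion, the series is synthetic, lagged values serve as candidate features, and `TimeSeriesSplit` is an assumed choice that keeps the validation folds later in time than the training folds.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR


def make_lagged(series, n_lags):
    """Build a lag matrix: each row holds n_lags past values, the target is the next value."""
    n_rows = len(series) - n_lags
    X = np.column_stack([series[i:i + n_rows] for i in range(n_lags)])
    y = series[n_lags:]
    return X, y


def choose_feature_count(X, y, candidate_ks, n_splits=5):
    """Return the number of features with the lowest cross-validated RMSE, and that RMSE."""
    pipeline = Pipeline([
        ("select", SelectKBest(score_func=f_regression)),
        ("svr", SVR(kernel="rbf")),
    ])
    search = GridSearchCV(
        pipeline,
        param_grid={"select__k": list(candidate_ks)},
        cv=TimeSeriesSplit(n_splits=n_splits),
        scoring="neg_root_mean_squared_error",
    )
    search.fit(X, y)
    return search.best_params_["select__k"], -search.best_score_


# Synthetic stand-in for the high-frequency component (assumption).
rng = np.random.default_rng(0)
t = np.arange(300)
series = np.sin(2 * np.pi * t / 25) + 0.3 * rng.standard_normal(t.size)

X, y = make_lagged(series, n_lags=12)
candidate_ks = (2, 4, 6, 8, 10, 12)
best_k, cv_rmse = choose_feature_count(X, y, candidate_ks)
print(f"selected {best_k} of 12 lag features, CV RMSE = {cv_rmse:.3f}")
```

Placing the selector inside the pipeline means the features are re-ranked on each training fold, so the validation folds do not leak into the selection step.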

Performance under different signal-to-noise ratios

Wind speed measurements are often affected by the environment and by the anemometer itself, which introduces a certain amount of noise into the data. To verify the reliability of the method, we analyzed the prediction performance under different signal-to-noise ratios (SNRs). Figure 8 shows the 1-step to 3-step prediction performance of the method for SNRs from 30 to 60 dB. Taking wind station #1 as an example, it can be seen from Fig. 8 that the performance of the proposed method is relatively stable under different SNRs: the RMSE is about 0.6 for 1-step forecasting, about 0.75 for 2-step forecasting, and about 0.85 for 3-step forecasting. In general, the prediction performance of the proposed method improves as the SNR increases. Similar behavior is observed at wind station #2. These experimental results show that the proposed method can predict wind speed accurately in the presence of a certain amount of noise.

Figure 8: The RMSE of the proposed method at 30∼60 dB SNR.

DOI: 10.7717/peerj-cs.732/fig-8
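
A noise test of this kind can be set up as follows. The sketch assumes additive zero-mean white Gaussian noise and defines the SNR from the mean signal power; the text does not state how the noise was generated, so both choices, the synthetic series and the function names are assumptions.

```python
import math
import random


def add_noise(signal, snr_db, rng):
    """Return the signal plus zero-mean Gaussian noise scaled to the requested SNR in dB."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB from a clean series and its noisy copy."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Synthetic wind-speed-like series in m/s (hypothetical, for illustration only).
clean = [6.0 + 2.0 * math.sin(0.1 * t) for t in range(5000)]
rng = random.Random(0)

for target in (30, 40, 50, 60):
    noisy = add_noise(clean, target, rng)
    print(f"target {target} dB, measured {measured_snr_db(clean, noisy):.2f} dB")
```

Each 10 dB increase divides the noise power by ten, so at 60 dB the perturbation is very small relative to the signal, which is consistent with the nearly flat RMSE curves reported across this range.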

Conclusions

As a sustainable and renewable energy source, wind power has attracted widespread attention and developed rapidly in recent years. Reliable and accurate wind speed forecasting supports wind power planning and control. Due to the non-linearity and non-stationarity of wind, forecasting remains a difficult and challenging problem. In this paper, we developed a new wind speed forecasting model based on EMD, FS, SVR and LassoCV. EMD is employed to extract IMFs from the original non-stationary wind speed time series. FS and SVR are combined to predict the high-frequency IMF. LassoCV is adopted to predict the low-frequency IMFs and the trend. Tests on two wind speed datasets collected in Michigan, USA, show that in 1- to 3-step forecasting the proposed model achieves better prediction performance than the classic individual models and the traditional EMD-based combined models. Although the proposed model has achieved good performance, it still has some limitations: whenever new data arrive, the model needs to be retrained. In future research, we will try to integrate online learning into the proposed method.

References

[1] Ait Maatallah O, Achuthan A, Janoyan K, Marzocca P. 2015. Recursive wind speed forecasting based on Hammerstein Auto-Regressive model. Applied Energy 145:191-197

[2] Azeem A, Fatema N, Malik H. 2018. k-NN and ANN based deterministic and probabilistic wind speed forecasting intelligent approach. Journal of Intelligent & Fuzzy Systems 35(5):5021-5031

[3] Bokde N, Feijóo A, Villanueva D, Kulat K. 2019. A review on hybrid empirical mode decomposition models for wind speed and wind power prediction. Energies 12(2):254

[4] Botha N, van der Walt CM. 2017. Forecasting wind speed using support vector regression and feature selection. In: 2017 Pattern Recognition Association of South Africa and Robotics and Mechatronics (PRASA-RobMech). 181-186

[5] Bradley PS, Mangasarian OL, Street WN. 1998. Feature Selection via Mathematical Programming. INFORMS Journal on Computing 10:209-217

[6] Chandrashekar G, Sahin F. 2014. A survey on feature selection methods. Computers & Electrical Engineering 40:16-28

[7] Chen J, Zeng G-Q, Zhou W, Du W, Lu K-D. 2018. Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization. Energy Conversion and Management 165:681-695

[8] Chizi B, Rokach L, Maimon O. 2009. A survey of feature selection techniques. In: Encyclopedia of Data Warehousing and Mining (Second Edition). IGI Global. 1888-1895

[9] Drucker H, Burges CJC, Kaufman L, Smola AJ, Vapnik V. 1997. Support vector regression machines. In: Advances in Neural Information Processing Systems (Vol. 9). Cambridge MA: MIT Press.

[10] Fu Y, Gao Z, Liu Y, Zhang A, Yin X. 2020. Actuator and sensor fault classification for wind turbine systems based on fast fourier transform and uncorrelated multi-linear principal component analysis techniques. Processes 8(9):1066

[11] Gao Z, Liu X. 2021. An overview on fault diagnosis, prognosis and resilient control for wind turbine systems. Processes 9

[12] Heng J, Wang C, Zhao X, Xiao L. 2016. Research and application based on adaptive boosting strategy and modified CGFPA algorithm: a case study for wind speed forecasting. Sustainability 8(3):235

[13] Hong Y-Y, Satriani TRA. 2020. Day-ahead spatiotemporal wind speed forecasting using robust design-based deep learning neural network. Energy 209:118441

[14] Hu J, Wang J. 2015. Short-term wind speed prediction using empirical wavelet transform and Gaussian process regression. Energy 93:1456-1466

[15] Hu Y-L, Chen L. 2018. A nonlinear hybrid wind speed forecasting model using LSTM network, hysteretic ELM and Differential Evolution algorithm. Energy Conversion and Management 173:123-142

[16] Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, Yen N-C, Tung CC, Liu HH. 1998. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proceedings of the Royal Society of London. Series A: Mathematical, Physical and Engineering Sciences 454(1971):903-995

[17] Hung JC, Yen NY, Li K-C. 2016. Frontier computing: theory, technologies and applications. Springer.

[18] Kavasseri RG, Seetharaman K. 2009. Day-ahead wind speed forecasting using f-ARIMA models. Renewable Energy 34:1388-1393

[19] Khodayar M, Wang J. 2019. Spatio-temporal graph deep neural network for short-term wind speed forecasting. IEEE Transactions on Sustainable Energy 10:670-681

[20] Khosravi A, Koury R, Machado L, Pabon J. 2018. Prediction of wind speed and wind direction using artificial neural network, support vector regression and adaptive neuro-fuzzy inference system. Sustainable Energy Technologies and Assessments 25:146-160

[21] Kong X, Liu X, Shi R, Lee KY. 2015. Wind speed prediction using reduced support vector machines with feature selection. Neurocomputing 169:449-456

[22] Li C, Xiao Z, Xia X, Zou W, Zhang C. 2018a. A hybrid model based on synchronous optimisation for multi-step short-term wind speed forecasting. Applied Energy 215:131-144

[23] Li H, Wang J, Lu H, Guo Z. 2018b. Research and application of a combined model based on variable weight for short term wind speed forecasting. Renewable Energy 116:669-684

[24] Liu H, Mi X, Li Y, Duan Z, Xu Y. 2019a. Smart wind speed deep learning based multi-step forecasting model using singular spectrum analysis, convolutional gated recurrent unit network and support vector regression. Renewable Energy 143:842-854

[25] Liu H, Mi X, Li Y. 2018a. Smart deep learning based wind speed prediction model using wavelet packet decomposition, convolutional neural network and convolutional long short term memory network. Energy Conversion and Management 166:120-131

[26] Liu H, Mi X, Li Y. 2018b. Smart multi-step deep learning model for wind speed forecasting based on variational mode decomposition, singular spectrum analysis, LSTM network and ELM. Energy Conversion and Management 159:54-64

[27] Liu M, Cao Z, Zhang J, Wang L, Huang C, Luo X. 2020. Short-term wind speed forecasting based on the Jaya-SVM model. International Journal of Electrical Power & Energy Systems 121:106056

[28] Liu Y, Shi H, Huang S, Chen X, Zhou H, Chang H, Xia Y, Wang G, Yang X. 2019b. Early prediction of acute xerostomia during radiation therapy for nasopharyngeal cancer based on delta radiomics from CT images. Quantitative Imaging in Medicine and Surgery 9(7):1288

[29] Lydia M, Kumar SS, Selvakumar AI, Kumar GEP. 2016a. Linear and non-linear autoregressive models for short-term wind speed forecasting. Energy Conversion and Management 112:115-124

[30] Lydia M, Kumar SS, Selvakumar AI, Kumar GEP. 2016b. Linear and non-linear autoregressive models for short-term wind speed forecasting. Energy Conversion and Management 112:115-124

[31] Nair KR, Vanitha V, Jisma M. 2017. Forecasting of wind speed using ANN, ARIMA and hybrid models. In: 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT). Piscataway: IEEE. 170-175

[32] Niu T, Wang J, Zhang K, Du P. 2018. Multi-step-ahead wind speed forecasting based on optimal feature selection and a modified bat algorithm with the cognition strategy. Renewable Energy 118:213-229

[33] Paramasivan SK, Lopez D. 2016. Forecasting of wind speed using feature selection and neural networks. International Journal of Renewable Energy Research (IJRER) 6(3):833-837

[34] Park T, Casella G. 2008. The Bayesian lasso. Journal of the American Statistical Association 103:681-686

[35] Poggi P, Muselli M, Notton G, Cristofari C, Louche A. 2003. Forecasting and simulating wind speed in Corsica by using an autoregressive model. Energy Conversion and Management 44:3177-3196

[36] Tibshirani R. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58(1):267-288

[37] Santamaría-Bonfil G, Reyes-Ballesteros A, Gershenson C. 2016. Wind speed forecasting for wind farms: a method based on support vector regression. Renewable Energy 85:790-809

[38] Sun W, Wang Y. 2018. Short-term wind speed forecasting based on fast ensemble empirical mode decomposition, phase space reconstruction, sample entropy and improved back-propagation neural network. Energy Conversion and Management 157:1-12

[39] Tascikaraoglu A, Sanandaji BM, Poolla K, Varaiya P. 2016. Exploiting sparsity of interconnections in spatio-temporal wind speed forecasting using Wavelet Transform. Applied Energy 165:735-747

[40] Torres JL, Garcia A, De Blas M, De Francisco A. 2005. Forecast of hourly average wind speed with ARMA models in Navarre (Spain). Solar Energy 79(1):65-77

[41] Wang HZ, Wang GB, Li GQ, Peng JC, Liu YT. 2016a. Deep belief network based deterministic and probabilistic wind speed forecasting approach. Applied Energy 182:80-93

[42] Wang J, Hu J. 2015. A robust combination approach for short-term wind speed forecasting and analysis–Combination of the ARIMA (Autoregressive Integrated Moving Average), ELM (Extreme Learning Machine), SVM (Support Vector Machine) and LSSVM (Least Square SVM) forecasts using a GPR (Gaussian Process Regression) model. Energy 93:41-56

[43] Wang L, Li X, Bai Y. 2018. Short-term wind speed prediction using an extreme learning machine model with error correction. Energy Conversion and Management 162:239-250

[44] Wang S, Zhang N, Wu L, Wang Y. 2016b. Wind speed forecasting based on the hybrid ensemble empirical mode decomposition and GA-BP neural network method. Renewable Energy 94:629-636

[45] Wang X. 2017. Forecasting short-term wind speed using support vector machine with particle swarm optimization. In: 2017 International Conference on Sensing, Diagnostics, Prognostics, and Control (SDPC). Piscataway: IEEE. 241-245