The reconstruction of equivalent underlying model based on direct causality for multivariate time series
Abstract
This article presents a novel approach for reconstructing an equivalent underlying model and deriving a precise equivalent expression through the use of direct causality topology. Central to this methodology is the transfer entropy method, which is instrumental in revealing the causality topology. The polynomial fitting method is then applied to determine the coefficients and intrinsic order of the causality structure, leveraging the foundational elements extracted from the direct causality topology. Notably, this approach efficiently discovers the core topology from the data, reducing redundancy without requiring prior domain-specific knowledge. Furthermore, it yields a precise equivalent model expression, offering a robust foundation for further analysis and exploration in various fields. Additionally, the proposed model for reconstructing an equivalent underlying framework demonstrates strong forecasting capabilities in multivariate time series scenarios.
Cite this as
2024. The reconstruction of equivalent underlying model based on direct causality for multivariate time series. PeerJ Computer Science 10:e1922 https://doi.org/10.7717/peerj-cs.1922
Introduction
In the contemporary industrial landscape, multivariate time series data is abundant, offering a wealth of information surpassing that of univariate counterparts. This intricate data landscape faithfully reflects the complexity inherent in various systems. Leveraging time series analysis facilitates the translation of historical and current system-generated data into predictive insights for the future. Crucially, this analytical approach involves the discernment of relationships within multivariate time series data. In fields like control and automation, it holds vast potential across diverse domains, encompassing design, fault diagnosis, risk assessment, root cause analysis, and the identification of corresponding alarm sequences, all of which harness the underlying model through an exploration of connectivity and causality (Khandekar & Muralidhar, 2014; Yan et al., 2024). Moreover, the act of capturing causality among variables within time series data holds profound significance across numerous scientific domains, especially in situations where the underlying model remains insufficiently understood. When delving into the intricacies of the climate system, the revelation of causality unveils a sophisticated internal framework within the intricate networks governing the complexities of climate (Yang et al., 2022; Gupta & Jain, 2021). Furthermore, the identification of causal relationships within economic data and the inference of functional connectivity in neuroscience are of paramount importance (Antonietti & Franco, 2021; Celli, 2022; Weichwald & Peters, 2021; Varsehi & Firoozabadi, 2021).
Numerous scholars have diligently pursued the task of inferring causality within the scope of analyzing industrial processes distinguished by clearly defined physical foundations. A range of data-driven methodologies has been deployed to discern causality (Wang et al., 2023b; Kong et al., 2024; Barfoot, 2024), including notable techniques such as Granger causality (Fuchs et al., 2023), transfer entropy (Schreiber, 2000), Bayesian networks (Chen et al., 2021b), and Patel’s pairwise conditional probability approach (Patel, Bowman & Rilling, 2006). These methodologies excel in achieving process topology, yet they differ from the objective of reconstructing an equivalent underlying model.
Nonetheless, reconstructing an equivalent underlying model remains crucial for comprehensively characterizing multivariate time series. This reconstructed underlying model serves as a means to restore the inherent dynamics of the original system, encapsulating a substantial portion of its informational content. This equivalent underlying model, aligned with the intricacies of the initial dynamical system, forms the foundational bedrock for the analysis of scalar time series.
Several methods have been used to reconstruct state space, including independent component analysis (Heinz et al., 2021), the modified false nearest neighbors approach (Boccaletti et al., 2002), the uniform embedding method, and non-uniform multivariate embedding (Gu & Chou, 2022). Uniform embedding methods are a common tool, with Takens' embedding theorem providing the most significant and practically useful technique for multivariate time series. However, issues arise when handling multiple periodicities (Han et al., 2018). Consequently, a growing contingent of scholars is embracing the non-uniform embedding technique to dissect multivariate time series (Faes, Nollo & Porta, 2012). This approach holds significant value in reconstructing an underlying model capable of accommodating the varying time lags inherent in time series data. However, the selection of embedding variables is paramount; they must be able to capture the intricacies of complex systems faithfully. Furthermore, when dealing with multivariate time series, the chosen variables must maintain their independence from one another (Han et al., 2018; Vlachos & Kugiumtzis, 2010). In addition, joint mutual information and conditional entropy have been applied to multivariate embedding (Han et al., 2018; Jia et al., 2020). Given the intricacies of complex systems, the influence of observational data, and the interplay of interactions, characterizing an underlying dynamic system from observed time series stands as an essential and formidable challenge in natural science (Shao et al., 2014). This work presents substantial potential across various fields, notably in predictive analytics for complex systems (Gu & Chou, 2022), healthcare and biomedical engineering (Wang et al., 2020), and industrial and mechanical systems (Chen et al., 2021a).
Its applicability also extends to financial modeling (Chan & Strachan, 2023), the development of autonomous systems and AI (Ma, Wang & Peng, 2023), as well as in climate modeling (Ren et al., 2023), and the energy sector (Gunjal et al., 2023). Furthermore, there is a pressing need for future research to refine and improve methods used in detecting direct causality within complex and noisy datasets, enhancing overall accuracy and reliability.
This article introduces a novel approach for reconstructing an equivalent underlying model and yielding a precise equivalent expression, utilizing direct causality topology. The major contributions are summarized as follows:
-
In the pursuit of underlying model identification, the transfer entropy method emerges as a pivotal tool for unveiling the topology of causality. Moreover, the conditional transfer entropy or conditional mutual information approach is harnessed to delineate the direct causality structure precisely. This approach can be further tailored to deduce a collection of fundamental elements, facilitating the reconstruction of an equivalent underlying model grounded in multivariate time series data.
-
The polynomial fitting method determines the coefficients and intrinsic order of the causality structure based on the foundational elements derived from the direct causality topology. This method serves as a means to accomplish the task of underlying model identification, all from the vantage point of multivariate time series, without any reliance on prior knowledge. The culmination of this process leads to the discernment of the expression characterizing the equivalent underlying model.
The remaining sections of this article are organized as follows. 'Related Backgrounds' provides an overview of the background related to previous works. 'Algorithms' comprehensively explains our proposed method, outlining its structure based on information theory and polynomial fitting techniques. 'Simulation Study' presents results from two distinct cases. Finally, we conclude our discussion in 'Conclusions'.
Algorithms
This section elucidates the formulation and computation of various information-theoretic measures, including joint Shannon entropy, mutual information (MI), conditional mutual information (CMI), transfer entropy (TE), and conditional transfer entropy (CTE). Subsequently, we establish a direct topological representation of the underlying model and obtain an expression for the equivalent underlying model by means of a polynomial fitting algorithm. Finally, the framework of our proposed method, designed for the reconstruction of an equivalent underlying model from multivariate time series data, is introduced.
Entropy
Within the domain of information theory, Shannon entropy characterizes the average degree of "information," "surprise," or "uncertainty" associated with the potential outcomes of a random variable (Shannon, 1948). Consider a discrete random variable X that assumes values from the alphabet χ and is distributed according to p: χ → [0, 1], with probability density function (PDF) p(x) := ℙ[X = x]:

(1) H(X) = E{−log p(X)} = −∑_{x∈χ} p(x) log p(x)

where E is the expected value operator. In the multivariate (q-dimensional) case, the kernel estimate p̂(x_1, x_2, …, x_q), commonly used to approximate p(x_1, x_2, …, x_q), is:

(2) p̂(x_1, x_2, …, x_q) = (1 / (N h_1 ⋯ h_q)) ∑_{i=1}^{N} K((x_1 − x_1^i) / h_1) ⋯ K((x_q − x_q^i) / h_q)

where N is the number of samples, K is a kernel function, and the bandwidths follow h_s = c · σ({x_s^i}_{i=1}^{N}) · N^{−1/(4+q)} for s = 1, …, q. Conditional entropy can be defined with respect to two variables X and Y, which assume values from sets 𝒳 and 𝒴, respectively, as:

(3) H(X|Y) = −∑_{(x,y)∈𝒳×𝒴} p_{X,Y}(x, y) log (p_{X,Y}(x, y) / p_Y(y))

where p_{X,Y}(x, y) is the joint PDF of the random variables X and Y, and p_Y(y) is the marginal PDF of Y.
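As an illustrative sketch (using a simple histogram plug-in estimator rather than the kernel estimate of Eq. (2)), the entropy of Eq. (1) can be computed as follows; the bin count and sample sizes are arbitrary assumptions:

```python
import numpy as np

def shannon_entropy(samples, bins=16):
    """Plug-in estimate of H(X) = -sum p(x) log p(x) from a histogram."""
    counts, _ = np.histogram(samples, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))

rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)   # near-uniform: entropy approaches log(bins)
c = np.zeros(1_000)             # constant: entropy is exactly zero
print(shannon_entropy(u), shannon_entropy(c))
```

A uniform variable attains the maximum log(bins) while a constant variable carries no information, which is the intuition behind the "uncertainty" reading of Eq. (1).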
Mutual information
Mutual information quantifies the interdependence shared between two variables (Cover, 1999), as shown in Fig. 1. In the case of two jointly discrete random variables, X and Y, the mutual information (MI) is defined as:

(4) I(X;Y) = H(X) − H(X|Y).
Figure 1: Venn diagram (Cover, 1999) shows additive and subtractive associations among diverse information metrics pertaining to correlated variables X and Y.
The joint entropy H(X,Y) corresponds to the total area covered by either circle. The left circle symbolizes the individual entropy H(X), where the crescent-shaped region denotes the conditional entropy H(X|Y). The right circle signifies H(Y), with the crescent-shaped region representing H(Y|X). The intersection of these circles denotes the mutual information I(X;Y). Substituting Eqs. (1) and (3) into Eq. (4) yields:

(5) I(X;Y) = ∑_{y∈𝒴} ∑_{x∈𝒳} p_{X,Y}(x,y) log (p_{X,Y}(x,y) / (p_X(x) p_Y(y)))

where p_{X,Y}(x,y) is the joint PDF of X and Y, and p_X(x) and p_Y(y) are the marginal PDFs of X and Y, respectively.
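Eq. (5) can likewise be estimated from a joint histogram. The following sketch is a hedged illustration only; the bin count and the toy variables are assumptions chosen for demonstration:

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Plug-in estimate of I(X;Y) from a joint histogram (Eq. 5)."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)   # marginal p(y)
    mask = pxy > 0
    return np.sum(pxy[mask] * np.log(pxy[mask] / (px @ py)[mask]))

rng = np.random.default_rng(1)
x = rng.normal(size=50_000)
noise = rng.normal(size=50_000)
print(mutual_information(x, x + 0.1 * noise))  # strongly dependent: large MI
print(mutual_information(x, noise))            # independent: MI near zero
```

The independent pair yields a value close to zero (up to a small positive plug-in bias), while the dependent pair yields a large value.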
Conditional mutual information
Conditional mutual information is the mutual information between two random variables when conditioned on a third, as illustrated in Fig. 2. For jointly discrete random variables, it can be expressed as follows:

(6) I(X;Y|Z) = H(X,Z) + H(Y,Z) − H(X,Y,Z) − H(Z)
            = H(X|Z) − H(X|Y,Z)
            = H(X|Z) + H(Y|Z) − H(X,Y|Z)
            = I(X;Y,Z) − I(X;Z).
Figure 2: Venn diagram of information-theoretic measures for three variables X, Y, and Z, represented by the lower left, lower right, and upper circles, respectively.
The conditional mutual information I(X;Z|Y), I(Y;Z|X), and I(X;Y|Z) are represented by the red dotted line, blue dotted line, and yellow dotted line, respectively. Substituting Eq. (5) into Eq. (6) yields:

(7) I(X;Y|Z) = ∑_{z∈𝒵} ∑_{y∈𝒴} ∑_{x∈𝒳} p_{X,Y,Z}(x,y,z) log (p_Z(z) p_{X,Y,Z}(x,y,z) / (p_{X,Z}(x,z) p_{Y,Z}(y,z))).
In the context of discrete variables, Eq. (7) has been employed as a fundamental building block for establishing various equalities within the field of information theory.
Transfer entropy
Transfer entropy serves as a non-parametric statistical metric designed to quantify the extent of directed (time-asymmetric) information transfer between two stochastic processes (Schreiber, 2000; Seth, 2007). TE provides an information-theoretic methodology for evaluating causality by quantifying uncertainty reduction. According to Schreiber (2000), the TE from a stationary time series X to Y is defined as:

(8) TE_{X→Y} = I(Y; X⁻ | Y⁻)

where X⁻ and Y⁻ denote the historical information of X and Y, respectively. Eq. (8) is thus a special case of the CMI (Wyner, 1978; Dobrushin, 1963) in Eq. (6), with the history of the influenced variable Y in the condition, and it is widely used in practice.
Substituting Eq. (7) into Eq. (8) yields:

(9) TE_{X→Y} = ∑ p(y_{i+h}, y_i^{(k)}, x_j^{(d)}) · log (p(y_{i+h} | y_i^{(k)}, x_j^{(d)}) / p(y_{i+h} | y_i^{(k)}))

where p denotes the PDFs, x_j^{(d)} = {x_j, x_{j−τ}, ⋯, x_{j−(d−1)τ}}, y_i^{(k)} = {y_i, y_{i−τ}, ⋯, y_{i−(k−1)τ}}, τ represents the sampling period, and h denotes the prediction horizon. k signifies the embedding dimension for variable y_i and d represents the embedding dimension for variable x_j. TE_{X→Y} can be conceptualized as the improvement gained by using the historical data of both X and Y to forecast the future of Y, as opposed to relying solely on the historical data of Y. Essentially, it measures the information about an upcoming observation of variable Y obtained from the past values of both X and Y, minus the information about Y's future derived solely from Y's own past values.
In a more detailed context, given two random processes whose information content is gauged by Shannon entropy (Shannon, 1948), TE_{X→Y} can be expressed as:

(10) TE_{X→Y} = H(y_{i+h} | y_i^{(k)}) − H(y_{i+h} | y_i^{(k)}, x_j^{(d)})

where H(·|·) denotes the conditional Shannon entropy.
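A minimal binned estimate of Eq. (9), with k = d = h = 1 and equal-width bins (both assumptions made for illustration, not part of the article's estimator), can be sketched as:

```python
import numpy as np

def transfer_entropy(x, y, bins=8):
    """Binned estimate of TE_{X->Y} with k = d = h = 1 (Eq. 9),
    i.e. the conditional mutual information I(y_{t+1}; x_t | y_t)."""
    def digitize(s):
        edges = np.linspace(s.min(), s.max(), bins + 1)
        return np.clip(np.digitize(s, edges) - 1, 0, bins - 1)
    xt, yt, y1 = digitize(x[:-1]), digitize(y[:-1]), digitize(y[1:])
    p = np.zeros((bins, bins, bins))       # joint counts over (y_{t+1}, y_t, x_t)
    np.add.at(p, (y1, yt, xt), 1.0)
    p /= p.sum()
    p_yy = p.sum(axis=2)                   # p(y_{t+1}, y_t)
    p_yx = p.sum(axis=0)                   # p(y_t, x_t)
    p_y = p.sum(axis=(0, 2))               # p(y_t)
    te = 0.0
    for i, j, k in zip(*np.nonzero(p)):
        te += p[i, j, k] * np.log(p[i, j, k] * p_y[j] / (p_yy[i, j] * p_yx[j, k]))
    return te

rng = np.random.default_rng(2)
x = rng.normal(size=20_000)
y = np.empty_like(x)
y[0] = 0.0
y[1:] = 0.8 * x[:-1] + 0.1 * rng.normal(size=len(x) - 1)  # y is driven by past x
print(transfer_entropy(x, y), transfer_entropy(y, x))
```

The coupling direction shows up as a strong asymmetry: TE in the driving direction is large, while in the reverse direction it is near zero (up to estimation bias).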
Derived from Eqs. (9) or (10), an indirect causality topology based on multivariate time series can be obtained, as shown in Fig. 3.
Figure 3: Two different patterns of causality topology between variables A, B, C, and D.
Consider four processes denoted as A, B, C, and D, as illustrated in Fig. 3. Suppose the TE method uncovers a causal influence from A to D. Within this context, two distinct causal patterns exist from A to D. However, it is essential to note that the TE, as described in Eqs. (9) or (10), is unable to distinguish whether this impact is direct or if it is transmitted through B. In other words, the TE lacks the capability to differentiate between a direct causality relationship and an indirect one. Additional methods must be incorporated to address the need for an equivalent underlying model to capture direct causality.
Conditional transfer entropy
In a complex system, it is not always guaranteed that TE exclusively represents the directed causal influence from X to Y. Information shared between X and Y may be conveyed through an intermediary third process, denoted as Z. In such scenarios, the presence of this shared information can lead to an augmentation in TE, potentially confounding the causal interpretation. One possible approach is conditional transfer entropy, which reduces the influence of shared information conveyed through alternative processes. The CTE from a single source X to the target Y, while excluding information from Z, is then defined as follows:

(11) CTE_{X→Y|Z} = I(y_{i+h}; x_j^{(d)} | y_i^{(k)}, z_l^{(n)})

where n represents the embedding dimension for variable z_l.
In the case mentioned in Fig. 3, the direct causality relationship can be distinguished from the indirect one via the CTE method to capture the direct causal topology. When the causal impact from A to D is entirely conveyed through B, CTE_{A→D|B} is theoretically zero, meaning that adding past measurements of A, conditional on B, will not further enhance the prediction of D. Conversely, when there exists a direct influence from A to D, incorporating past measurements of A alongside those of B and D leads to enhanced predictions of D, resulting in a causal effect CTE_{A→D|B} > 0.
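The mediated-versus-direct distinction can be illustrated on a toy chain in which A drives B and B alone drives D (the situation of Fig. 3); the binned estimator, bin count, and coupling strengths below are assumptions for demonstration, not the article's exact estimator:

```python
import numpy as np

def binned(s, bins):
    """Discretize a series into equal-width bins."""
    edges = np.linspace(s.min(), s.max(), bins + 1)
    return np.clip(np.digitize(s, edges) - 1, 0, bins - 1)

def cte(src, tgt, cond, bins=6):
    """Binned estimate of CTE_{src->tgt|cond} = I(tgt_{t+1}; src_t | tgt_t, cond_t),
    i.e. Eq. (11) with all embedding dimensions and the horizon set to 1."""
    st, ct, tt, t1 = (binned(s, bins) for s in (src[:-1], cond[:-1], tgt[:-1], tgt[1:]))
    p = np.zeros((bins,) * 4)                 # joint counts over (t1, st, tt, ct)
    np.add.at(p, (t1, st, tt, ct), 1.0)
    p /= p.sum()
    p_tc = p.sum(axis=1)                      # p(t1, tt, ct)
    p_sc = p.sum(axis=0)                      # p(st, tt, ct)
    p_c = p.sum(axis=(0, 1))                  # p(tt, ct)
    val = 0.0
    for i, j, k, l in zip(*np.nonzero(p)):
        val += p[i, j, k, l] * np.log(
            p[i, j, k, l] * p_c[k, l] / (p_tc[i, k, l] * p_sc[j, k, l]))
    return val

rng = np.random.default_rng(3)
n = 100_000
a = rng.normal(size=n)
b = 0.5 * a + rng.normal(size=n)               # A -> B (with independent noise)
d = np.zeros(n)
d[1:] = b[:-1] + 0.1 * rng.normal(size=n - 1)  # B -> D; A reaches D only via B
print(cte(a, d, b))   # fully mediated: theoretically zero
print(cte(b, d, a))   # direct influence: clearly positive
```

Conditioning on the mediator B suppresses the apparent A → D transfer, while the genuine B → D link survives conditioning on A.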
Utilizing MI in Eq. (5), CMI in Eq. (7), TE in Eq. (9), and CTE in Eq. (11), a set of foundational elements can be derived from the direct causality topology. Given four variables A, B, C, and D, suppose that the direct causal inference can be obtained as shown in Fig. 4.
Figure 4: Direct causality topology between variables A, B, C, and D.
Derived from the direct causality topology depicted in Fig. 4, we can ascertain the sets of fundamental elements associated with each variable. The equivalent foundational model can be articulated as follows:

(12) A(t) = f_A(A_A) + ϵ_A
     B(t) = f_B(A_B, B_B) + ϵ_B
     C(t) = ϵ_C
     D(t) = f_D(B_D, C_D) + ϵ_D

where each f_i(x) represents the equivalent underlying model of variable i, i = A, B, C, D, and x signifies a collection of fundamental elements corresponding to variable i. For instance, consider the case where C_D = {C(t−1), C(t−2)}. Here, C_D represents the measurements of C at time stamps t−1 and t−2 that contribute directly to the causal relationship with D. It is worth noting that x can be determined through the MI, TE, CMI, and CTE methods. In the subsequent section, the parameters and order of the functions f_i will be determined using a polynomial fitting approach.
Significance test
By conducting N-trial Monte Carlo simulations (Kantz & Schreiber, 2004), we compute the MIs, TEs, CMIs, and CTEs for various surrogate time series, following Eqs. (5), (9), (7), and (11). These surrogate time series are randomly generated, and any potential causality between the time series is intentionally removed. In Fig. 5, the TEs for 10,000 pairs of uncorrelated random series are presented (partial results are presented here, with all results displayed in Supplemental Information). The results suggest that the MIs, TEs, CMIs, and CTEs derived from the surrogate data follow a Gaussian distribution, N(μ, σ²). If we denote the MIs, TEs, CMIs, and CTEs as a random variable X, the PDF can generally be expressed as follows:

(13) f(x) = (1 / (σ√(2π))) e^{−(1/2)((x−μ)/σ)²}

where the parameter μ represents the expected value of the distribution, the parameter σ denotes its standard deviation, and σ² is the variance. Letting Z = (X − μ)/σ ∼ N(0,1), the PDF of Z can be written as:

(14) φ(z) = (1 / √(2π)) e^{−z²/2}.
Figure 5: TEs from white Gaussian noise to a specific variable with lag, derived from N-trial Monte Carlo simulations, in case 1.
The cumulative distribution function (CDF) of the standard normal distribution, typically denoted by the capital Φ, is defined as the integral:

(15) Φ(z) = (1 / √(2π)) ∫_{−∞}^{z} e^{−t²/2} dt.
For α > 0, the quantile z_α satisfies:

(16) P(z > z_α) = 1 − Φ(z_α) = α.

The quantile Φ^{−1}(1 − α) of the standard normal distribution is frequently symbolized as z_α. A standard normal random variable lies beyond the confidence interval (−∞, z_α] with probability α, a small positive number, often 0.05, in which case the quantile is denoted z_{0.05}. This means that 95% of the MIs, TEs, CMIs, and CTEs derived from uncorrelated random series should fall within (−∞, z_{0.05}]. Equivalently, with 95% confidence, we reject the null hypothesis that the MIs, TEs, CMIs, and CTEs are insignificant. Therefore, if I(X;Y) > z_{0.05}, or P((I(X;Y) − μ)/σ > z_{0.05}) < α, there is a noteworthy connection between variables X and Y. Moreover, if TE_{X→Y} > z_{0.05}, or P((TE_{X→Y} − μ)/σ > z_{0.05}) < α, there is a noteworthy causal connection from variable X to Y. Likewise, if CTE_{X→Y|Z} > z_{0.05}, or P((CTE_{X→Y|Z} − μ)/σ > z_{0.05}) < α, there is a significant causality from variable X to Y excluding information from Z. Additionally, if I(X;Y|Z) > z_{0.05}, or P((I(X;Y|Z) − μ)/σ > z_{0.05}) < α, there is a significant connection between variables X and Y excluding information from Z. It should be noted that the value of z_{0.05} depends on the specific null distribution.
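A sketch of this surrogate-based significance test follows, using absolute correlation as a stand-in statistic (in the article the statistic would be an MI, TE, CMI, or CTE estimate) and assuming white-noise surrogates:

```python
import numpy as np
from scipy.stats import norm

def abs_corr(x, y):
    """Stand-in dependence statistic; any of MI/TE/CMI/CTE could be used."""
    return abs(np.corrcoef(x, y)[0, 1])

def significance_threshold(stat, n_obs, n_trials=1000, alpha=0.05, seed=0):
    """z_alpha from an N-trial Monte Carlo null: the statistic is evaluated on
    independent surrogate pairs, the null is treated as Gaussian N(mu, sigma^2),
    and the (1 - alpha) quantile mu + z_alpha * sigma is returned."""
    rng = np.random.default_rng(seed)
    null = [stat(rng.normal(size=n_obs), rng.normal(size=n_obs))
            for _ in range(n_trials)]
    mu, sigma = np.mean(null), np.std(null)
    return mu + norm.ppf(1 - alpha) * sigma

z = significance_threshold(abs_corr, n_obs=500)
rng = np.random.default_rng(42)
x = rng.normal(size=500)
y = x + 0.5 * rng.normal(size=500)
print(abs_corr(x, y) > z)   # dependent pair exceeds the threshold
```

A measured statistic above the threshold rejects the null hypothesis of no connection at the 95% level, exactly as described for z_{0.05} above.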
Polynomial fitting
The polynomial fitting method, employing a maximum exponent of k for each input feature, utilizes X = {X_1, X_2, …, X_n} to create a polynomial feature matrix A. This matrix comprises C_{n+k}^{n} features, leading to the linear system:

(17) Y(t) = AB,  A = {1, X_1, X_2, …, X_n, X_1^2, X_2^2, ⋯, X_n^k}

where B is the vector of coefficients and the intercept. Eq. (17) can be solved effectively using the "preprocessing.PolynomialFeatures" and "linear_model.LinearRegression" classes within the Python package "sklearn".
Simulation Study
In this section, two cases are given to demonstrate the usefulness of the proposed method. The first case is a basic discrete-time dynamical system given by a mathematical model. The second is the Henon map, a dynamical system exhibiting chaotic behavior.
Case 1: simulation case
In the first instance, we have a discrete-time dynamical system. The expression is as follows:

(18) X(t) = [σY(t−1) − ρZ(t−1)][Y(t−1) − βZ(t−1)] + 0.01·ϵ_1
     Y(t) = 0.5·ϵ_2
     Z(t) = 0.5·ϵ_3

where the parameters are σ = 0.9, ρ = 0.7, and β = 4/3. The initial conditions are X(0) = 0.2, Y(0) = 0.4, and Z(0) = 0.3. The ϵ_i ∼ N(0,1), i = 1, 2, 3, represent white Gaussian noise. The trend of the time series generated by Eq. (18) is shown in Fig. 6.
Figure 6: Trend of case 1.
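The system of Eq. (18) can be simulated directly; the total sample count below is an assumption covering the 700 training and 300 testing samples used later:

```python
import numpy as np

# Simulate the Case 1 system of Eq. (18): X is a quadratic function of the
# previous Y and Z, which are themselves pure white-noise processes.
def simulate_case1(n, sigma=0.9, rho=0.7, beta=4/3, seed=0):
    rng = np.random.default_rng(seed)
    X, Y, Z = np.empty(n), np.empty(n), np.empty(n)
    X[0], Y[0], Z[0] = 0.2, 0.4, 0.3
    for t in range(1, n):
        X[t] = (sigma * Y[t-1] - rho * Z[t-1]) * (Y[t-1] - beta * Z[t-1]) \
               + 0.01 * rng.normal()
        Y[t] = 0.5 * rng.normal()
        Z[t] = 0.5 * rng.normal()
    return X, Y, Z

X, Y, Z = simulate_case1(1000)   # 700 training + 300 testing samples
```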
Table 1 presents a comprehensive overview of TEs between each pair of variables with their respective time lags. In this table, p denotes the probability and the z0.05, upper α quantile (α = 0.05), serves as the upper boundary of the confidence interval.
Cause | Effect | TE (lag = 1) | z0.05 | p (α = 0.05) | TE (lag = 2) | z0.05 | p (α = 0.05) |
---|---|---|---|---|---|---|---|
X | Y | 0.5566 | 0.8873 | 1 | 0.5474 | 0.8891 | 1 |
X | Z | 0.5712 | 0.9245 | 1 | 0.5695 | 0.9253 | 1 |
Y | X | 0.5033 | 0.4694 | 0.0009 | 0.4189 | 0.4701 | 0.7172 |
Y | Z | 0.8402 | 0.9254 | 0.6417 | 0.8345 | 0.9257 | 0.6987 |
Z | X | 0.5247 | 0.4697 | 0 | 0.4385 | 0.4692 | 0.3841 |
Z | Y | 0.8292 | 0.8879 | 0.4132 | 0.8128 | 0.8891 | 0.5876 |
The causality topology, derived from Tables 1 and 2 through TE and MI analysis, is visually depicted in Fig. 7. Notably, it becomes evident that there is only one unique pattern of causality between each pair of variables with their respective time lags. In other words, there is no indirect causal influence present. Consequently, the equivalent underlying model can be succinctly expressed as follows:

(19) X(t) = f_X(Y(t−1), Z(t−1)) + ϵ_X
     Y(t) = ϵ_Y
     Z(t) = ϵ_Z.
Cause | Effect | I(cause;effect) (lag = 1) | z0.05 | p (α = 0.05) | I(cause;effect) (lag = 2) | z0.05 | p (α = 0.05) |
---|---|---|---|---|---|---|---|
X | X | 0.0919 | 0.1341 | 0.9947 | 0.0975 | 0.1340 | 0.9756 |
Y | Y | 0.1517 | 0.1644 | 0.2670 | 0.1576 | 0.1645 | 0.1385 |
Z | Z | 0.1309 | 0.1633 | 0.8454 | 0.1447 | 0.1638 | 0.4595 |
Figure 7: Causality topology between each pair of variables with their respective time lags.
In the subsequent step, we employ polynomial approximation to estimate the genuine underlying model. The parameters and order of the equivalent f_X are determined through the polynomial fitting approach outlined in Eq. (17). In this process, we opt for a higher order k for f_X:

(20) X(t) = f_X(Y(t−1), Z(t−1)) + ϵ_X = AB

where A = [1, Y(t−1), Z(t−1), Y(t−1)², Y(t−1)Z(t−1), Z(t−1)², …, Z(t−1)^k] and B = [b_{0,0}, b_{1,0}, b_{0,1}, b_{2,0}, b_{1,1}, b_{0,2}, ⋯, b_{1,k−1}, b_{0,k}]^T. When selecting k = 4, the parameters obtained by fitting are b_{0,4} = 3.2×10⁻³, b_{1,3} = 3.2×10⁻³, b_{2,2} = −3.2×10⁻³, b_{3,1} = −4.8×10⁻³, b_{4,0} = 2.1×10⁻³, b_{0,3} = −3.4×10⁻³, b_{1,2} = −6.8×10⁻³, b_{2,1} = −2.9×10⁻³, b_{3,0} = 7.7×10⁻⁴, b_{0,2} = 0.93, b_{1,1} = −1.89, b_{2,0} = 0.9, b_{0,1} = 2.4×10⁻³, b_{1,0} = 1.5×10⁻³, b_{0,0} = 1.9×10⁻⁴. Disregarding coefficients that are insignificantly close to zero, the expression in Eq. (20) can be redefined as follows:

(21) X(t) ≈ b_{0,2}Z(t−1)² + b_{1,1}Y(t−1)Z(t−1) + b_{2,0}Y(t−1)² + ϵ_X
         = 0.93·Z(t−1)² − 1.89·Y(t−1)Z(t−1) + 0.9·Y(t−1)² + ϵ_X
It is evident that the underlying model in Eq. (18) can be concisely represented by Eq. (21). The training dataset consists of 700 samples in our experimental configuration, while the testing dataset comprises 300 samples. Figure 8 elegantly showcases the fitting curves of the time series achieved through the proposed method. Remarkably, when supervised, the original series and the polynomial fit data (derived from the equivalent underlying model) exhibit an exceptionally close alignment. In contrast, without supervision, the performance is significantly reduced. This reduction in performance is likely attributed to Y(t) and Z(t) relying solely on white Gaussian noise, which exhibits a considerable degree of randomness.
Figure 8: Fitting curves of time series based on the proposed method.
The “supervised fit” signifies that each input is derived from the Ground Truth to compute the current output. In contrast, “unsupervised fit” denotes that an initial input is provided, and all outputs are calculated through iterative processes without relying on Ground Truth.
Case 2: Henon map chaotic time series
The Henon map constitutes a discrete-time dynamical system characterized by chaotic behavior. The expression of the Henon map system is:

(22) X(t+1) = 1 − αX²(t) + Y(t) + 0.0001·ϵ_X
     Y(t+1) = βX(t) + 0.0001·ϵ_Y

where the parameters are α = 1.4 and β = 0.3, while the initial states are X(0) = 0 and Y(0) = 0. The trend of the time series generated by Eq. (22) is shown in Fig. 9.
Figure 9: Trend of case 2.
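The Henon map of Eq. (22) can be generated as follows (a minimal sketch; the total sample count is an assumption covering the training and testing splits used later):

```python
import numpy as np

# Simulate the noise-perturbed Henon map of Eq. (22).
def henon(n, alpha=1.4, beta=0.3, noise=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    X, Y = np.empty(n), np.empty(n)
    X[0], Y[0] = 0.0, 0.0
    for t in range(n - 1):
        X[t+1] = 1 - alpha * X[t]**2 + Y[t] + noise * rng.normal()
        Y[t+1] = beta * X[t] + noise * rng.normal()
    return X, Y

X, Y = henon(1000)   # 700 training + 300 testing samples
```

With these classic parameter values, the orbit stays on the bounded Henon attractor despite the small additive noise.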
Table 3 presents a comprehensive overview of TEs between each pair of variables with their respective time lags. Notably, the TE method has detected the causal influences from Y(t−1) to X(t), Y(t−2) to X(t), X(t−1) to Y(t).
Cause | Effect | TE (lag = 1) | z0.05 | p (α = 0.05) | TE (lag = 2) | z0.05 | p (α = 0.05) |
---|---|---|---|---|---|---|---|
X | Y | 1.3323 | 0.4866 | 0 | 0.0093 | 0.4842 | 1 |
Y | X | 0.7191 | 0.4823 | 0 | 0.7068 | 0.4804 | 0 |
The causality topology, derived from Tables 3 and 4 through both TE and MI analyses, is visually represented in Fig. 10. Notably, two types of causal influence, direct and indirect, are possible for the following three pairs of variables: X(t−2) to X(t), Y(t−2) to X(t), and Y(t−2) to Y(t). To investigate these causal relationships further and determine whether each influence is direct or mediated by a third variable, we employ CMI or CTE to capture direct causality.
Cause | Effect | I(cause;effect) (lag = 1) | z0.05 | p (α = 0.05) | I(cause;effect) (lag = 2) | z0.05 | p (α = 0.05) |
---|---|---|---|---|---|---|---|
X | X | 1.5725 | 0.1938 | 0 | 1.2878 | 0.1942 | 0 |
Y | Y | 1.569 | 0.1939 | 0 | 1.2801 | 0.1942 | 0 |
Figure 10: Causality topology between each pair of variables with their respective time lags.
Each of these three undetermined causal relationships involves its own historical measurements. Consequently, the CMI is more suitable for distinguishing whether a third variable's influence is direct or mediated. The outcomes of the CMI analysis are outlined in Table 5. The results are insignificant at the 95% level, indicating that these lag-2 causal relationships are indirect rather than direct. This implies that the red arrows should be removed, and the rest of the diagram reflects the direct causal topology (shown in Fig. 10).
Cause | Effect | Condition | I(cause;effect|condition) | z0.05 | p(α = 0.05) |
---|---|---|---|---|---|
X(t−2) | X(t) | X(t−1) | 0.7160 | 1.7472 | 1.0 |
Y(t−2) | X(t) | Y(t−1) | 0.7765 | 1.6557 | 1.0 |
Y(t−2) | Y(t) | Y(t−1) | 0.7165 | 1.7454 | 1.0 |
Consequently, the equivalent underlying model can be succinctly expressed as follows:

(23) X(t) = f_X(X(t−1), Y(t−1)) + ϵ_X
     Y(t) = f_Y(X(t−1), Y(t−1)) + ϵ_Y
In the subsequent step, we employ polynomial approximation to estimate the genuine underlying model. When choosing k = 4, the parameters of f_X can be obtained by fitting: b_{0,0} = 1, b_{0,1} = 1, b_{2,0} = −1.4, and the rest of the coefficients are insignificantly close to zero, so the first expression in Eq. (23) can be redefined as follows:

(24) X(t) ≈ 1 + Y(t−1) − 1.4·X(t−1)² + ϵ_X.

Similarly, the other expression in Eq. (23) can be redefined as follows:

(25) Y(t) ≈ 0.3·X(t−1) + ϵ_Y.
The underlying model of the Henon map can be concisely represented by Eqs. (24) and (25). Within our experimental configuration, the training dataset consists of 700 samples, while the testing dataset comprises 300 samples. Figure 11 elegantly portrays the fitting curves of the time series achieved through the proposed method. Both supervised and unsupervised cases exhibit an exceptionally close alignment between the original series and the polynomial fit data derived from the equivalent underlying model. This outcome is likely due to X(t) and Y(t) not relying on White Gaussian Noise, thereby reducing the influence of randomness.
Figure 11: Fitting curves of time series based on the proposed method.
The “supervised fit” signifies that each input is derived from the Ground Truth to compute the current output. In contrast, “unsupervised fit” denotes that an initial input is provided, and all outputs are calculated through iterative processes without relying on Ground Truth.
Conclusions
This article introduces an innovative and effective method for reconstructing an equivalent underlying model, built upon the direct causality topology derived from multivariate time series data. MI and TE are utilized to map out the causality structure, whereas CMI or CTE is employed to distinguish between direct causal influences and indirect ones. By employing the polynomial fitting method, we are able to derive the expression of the equivalent underlying model solely based on the direct topology without the need for prior knowledge. This method provides a systematic and data-driven approach for uncovering the inherent causal structures within multivariate time series data. It allows for accurately identifying and reconstructing an equivalent underlying model, enhancing our understanding and interpretation of complex systems represented in the data. This study focuses on reconstructing an equivalent underlying model that primarily identifies dynamic systems. This approach can be seamlessly adapted for other applications, including correlation and causality analysis. Moving forward, our objective is to refine these reconstructed models further. This work includes integrating AI and machine learning techniques to enhance the accuracy and efficiency of the models and expand their applicability across diverse domains.
Supplemental Information
Equivalent underlying model data
Case 1 is a discrete-time dynamical system. Case 2 is a Henon map chaotic time series. For each case, the training dataset consists of 700 samples, while the testing dataset comprises 300 samples.
Additional Information and Declarations
Competing Interests
The authors declare there are no competing interests.
Author Contributions
Liyang Xu conceived and designed the experiments, performed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.
Dezheng Wang conceived and designed the experiments, performed the experiments, analyzed the data, performed the computation work, prepared figures and/or tables, and approved the final draft.
Data Availability
The following information was supplied regarding data availability:
The simulated cases are available in the GitHub and Zenodo:
- https://github.com/Xu-Liyang/case-reconstruction.git.
- Wang, D. (2024). The reconstruction of equivalent underlying model based on direct causality for multivariate time series. Zenodo. https://doi.org/10.5281/zenodo.10668929.
Funding
This research was funded by the Science and Technology Research Program of Chongqing Municipal Education Commission of China (Grant Nos. KJZD-K202201901, KJQN202201109, and KJQN202101904). This work was also supported by the Innovation Research Group of Universities in Chongqing (Grant No. CXQT21035), and the Scientific Research Foundation of Chongqing Institute of Engineering (Grant No. 2020xzky05). Additionally, this work was supported by Natural Science Foundation of Chongqing (Grant No. CSTB2022NSCQ-MSX1419, cstc2021jcyj-msxmX0525) and the Scientific and Technological Research Key Program of Chongqing Municipal Education Commission (Grant No. KJZD-M202201901). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.