Electrophysiological research with event-related brain potentials (ERPs) is increasingly moving from simple, strictly orthogonal stimulation paradigms towards more complex, quasi-experimental designs and naturalistic situations that involve fast, multisensory stimulation and complex motor behavior. As a result, electrophysiological responses from subsequent events often overlap with each other. In addition, the recorded neural activity is typically modulated by numerous covariates, which influence the measured responses in a linear or non-linear fashion. Examples of paradigms where systematic temporal overlap variations and low-level confounds between conditions cannot be avoided include combined electroencephalogram (EEG)/eye-tracking experiments during natural vision, fast multisensory stimulation experiments, and mobile brain/body imaging studies. However, even “traditional,” highly controlled ERP datasets often contain a hidden mix of overlapping activity (e.g., from stimulus onsets, involuntary microsaccades, or button presses) and it is helpful or even necessary to disentangle these components for a correct interpretation of the results. In this paper, we introduce the unfold toolbox, an open-source MATLAB toolbox that addresses these challenges.

Event-related brain responses in the electroencephalogram (EEG) are traditionally studied in strongly simplified and strictly orthogonal stimulus-response paradigms. In many cases, each experimental trial involves only a single, tightly controlled stimulation and a single manual response. In recent years, however, there has been a rising interest in recording brain-electric activity also in more complex paradigms and naturalistic situations. Examples include laboratory studies with fast and concurrent streams of visual, auditory, and tactile stimuli.

Appropriate analysis of such datasets requires a paradigm shift away from simple averaging techniques towards more sophisticated, regression-based approaches.

In the present paper, we introduce the unfold toolbox, which implements these regression-based analyses, including overlap correction and non-linear modeling, within a single framework.

In the following, we first briefly summarize some key concepts of regression-based EEG analysis, with an emphasis on linear deconvolution, spline regression, and temporal basis functions. We then describe the unfold toolbox itself and its typical analysis workflow.

Before we introduce a real dataset, let us first consider a simulated simple EEG/ERP study to illustrate the possibilities of the deconvolution approach. For this, imagine a typical stimulus-discrimination task with two conditions (faces vs. houses).

(A) A hypothetical simple ERP experiment with overlapping responses and a non-linear covariate. Data in this figure was simulated and then modeled with the unfold toolbox.

Human faces are also complex, high-dimensional stimuli with numerous properties that are difficult to perfectly control and orthogonalize in any given study. For simplicity, we assume here that the average luminance of the stimuli was not perfectly matched between conditions and is slightly, but systematically, higher for faces than for houses.

Deconvolution methods for EEG have existed for some time.

In recent years, an alternative deconvolution method based on the linear model has been proposed and successfully applied to the overlap problem.

With deconvolution techniques, overlapping EEG activity is understood as the linear convolution of experimental event latencies with isolated neural responses.

The benefits of this approach are numerous: The experimental design is not restricted to special stimulus sequences, multiple regression allows modeling of an arbitrary number of different events, and the mathematical properties of the linear model are well understood.

The classic massive univariate linear model is fit separately for every channel and every time point τ relative to event onset: μ_{i,τ} = x_{i,1}·β_{1,τ} + x_{i,2}·β_{2,τ} + … + x_{i,p}·β_{p,τ}.

Here, i indexes the events (trials), x_{i,c} is the value of predictor c for event i (a single entry of the design matrix X), and β_{c,τ} are the regression coefficients to be estimated.

Furthermore, let τ be the “local time” relative to the onset of the event (e.g., −100 to +500 sampling points). μ_{i,τ} is the expected (average) EEG signal measured at time τ after event i.

In contrast, with linear deconvolution we enter the continuous EEG data into the model. We then make use of the knowledge that each observed sample of the continuous EEG can be described as the linear sum of (possibly) several overlapping event-related EEG responses. Depending on the latencies of the neighboring events, these overlapping responses occur at different times τ relative to the current event.

Linear deconvolution explains the continuous (toy) EEG signal within a single regression model. Specifically, we want to estimate the response (betas) evoked by each event so that together, they best explain the observed EEG. For this purpose, we create a time-expanded version of the design matrix (X_{dc}) in which a number of time points around each event (here: only 5 points) are added as predictors. We then solve the model for b, the betas. For instance, in the model above, sample number 25 of the continuous EEG recording can be explained by the sum of responses to three different experimental events: the response to a first event of type “A” (at time point 5 after that event), the response to an event of type “B” (at time point 4 after that event), and the response to a second occurrence of an event of type “A” (at time point 1 after that event). Because the sequences of events and their temporal distances vary throughout the experiment, it is possible to find a unique solution for the betas that best explains the measured EEG signal. These betas, or “regression-ERPs” (rERPs), can then be plotted and analyzed like conventional ERP waveforms.
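The time expansion described here can be sketched in a few lines of Python. This is an illustrative stand-in for the toolbox's MATLAB implementation, not its actual code; all names and numbers are invented. Two overlapping event types ("A" and "B") are simulated without noise, and their responses are recovered by solving the time-expanded model:

```python
import numpy as np

def stick_design_matrix(latencies, n_samples, n_lags):
    """Time-expanded ("stick function") design matrix for one event type:
    column tau of row t is 1 if an event of this type began at sample t - tau."""
    X = np.zeros((n_samples, n_lags))
    taus = np.arange(n_lags)
    for t0 in latencies:
        X[t0 + taus, taus] += 1.0
    return X

rng = np.random.default_rng(1)
n, n_lags = 2000, 30
kernel_a = np.sin(np.linspace(0, np.pi, n_lags))          # true response to "A"
kernel_b = -0.5 * np.sin(np.linspace(0, np.pi, n_lags))   # true response to "B"
lat_a = np.sort(rng.choice(np.arange(50, n - 50), size=40, replace=False))
lat_b = np.sort(rng.choice(np.arange(50, n - 50), size=40, replace=False))

# overlapped continuous signal: sum of both responses at their event latencies
eeg = np.zeros(n)
for t0 in lat_a:
    eeg[t0:t0 + n_lags] += kernel_a
for t0 in lat_b:
    eeg[t0:t0 + n_lags] += kernel_b

# one block of columns per event type, concatenated -> X_dc; solve for the betas
X_dc = np.hstack([stick_design_matrix(lat_a, n, n_lags),
                  stick_design_matrix(lat_b, n, n_lags)])
betas, *_ = np.linalg.lstsq(X_dc, eeg, rcond=None)
rerp_a, rerp_b = betas[:n_lags], betas[n_lags:]
```

Because the inter-event intervals vary, the columns of X_dc are linearly independent and the noiseless responses are recovered essentially exactly.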

In the example above, only a few local time points were modeled around each event; in practice, the estimation window typically spans several hundred milliseconds.

The necessary design matrix to implement this model, X_{dc}, will span the duration of the entire EEG recording. It can be generated from any design matrix X of the classic massive univariate model.

The process to create the time-expanded design matrix X_{dc} was illustrated above; in the following, we define X_{dc} more formally.

Let t_{i} denote the onset time (in samples of the continuous recording) of the i-th event.

X_{dc} can be constructed from multiple concatenated, time-expanded square diagonal matrices.

The matrix for a single event i and predictor c is such a square diagonal matrix.

It is a scaling of an identity matrix by the scalar x_{i,c}, which is a single entry of the design matrix X; for a continuous predictor, x_{i,c} contains the continuous predictor value.

In the case of multiple predictors, we construct one such scaled identity matrix per predictor x_{i,c} and concatenate them along the columns, for predictors 1 … c.

In the case of multiple instances of the same event type, we obtain one such block per instance. We form the full matrix X_{dc} (where dc stands for deconvolution) by inserting these blocks around the time points (in continuous EEG time) at which the respective events occurred; rows belonging to overlapping events simply sum. By solving y = X_{dc}·β for β, we effectively deconvolve the original signal.

Overview of different temporal basis functions. In each case, the expanded design matrix X_{dc} is plotted together with the corresponding temporal basis functions.

We usually have multiple different types of events e_{1}, e_{2}, …. For each of these event types, we create one time-expanded design matrix; to obtain the full X_{dc}, we simply concatenate them along the columns before the model inversion.

Similarly, if we wanted to include a continuous covariate spanning the whole duration of the continuous EEG signal (see Discussion), it could be added as one additional column of X_{dc}.

The formula for the deconvolution model is then: μ = X_{dc}·β.

Here, μ_{t} is the expected value of the continuous EEG signal y_{t} at time point t, and β contains the unknown coefficients of all events and predictors.

Spline regression is a method to estimate non-linear relationships between a continuous predictor and an outcome. In the simple case of a single predictor it can be understood as a type of local smoother function. For a detailed introduction to spline regression, we refer the reader to the literature on generalized additive models (GAMs).

The sum of these basis functions, each weighted by its estimated beta, then yields the smooth, non-linear fit.

However, due to several suboptimalities of the polynomial basis (such as its global support, which makes the fit at one point sensitive to data far away), a B-spline basis is commonly used instead.

This basis set is constructed using the De Casteljau algorithm.

Multiple terms can be concatenated, resulting in a GAM: μ = β_{0} + f_{1}(x_{1}) + f_{2}(x_{2}) + …, where each smooth function f is represented by a weighted sum of spline basis functions.

If interactions between two non-linear spline predictors are expected, we can also make use of two-dimensional splines: f(x_{1}, x_{2}) = Σ_{j} Σ_{k} β_{j,k} · B_{1,j}(x_{1}) · B_{2,k}(x_{2}).

In this tensor-product construction, each 2D basis function is the product of two one-dimensional basis functions B_{1,j} and B_{2,k}. Thus, a 2D spline between two spline-predictors with 10 spline functions each would result in 100 parameters to be estimated.
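As an illustration, a 1D B-spline basis and its tensor-product 2D extension can be built as follows. This is a generic construction in Python with SciPy, not the toolbox's exact basis; with 10 basis functions per dimension, the 2D spline indeed yields 100 predictors:

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_basis(x, n_splines, degree=3):
    """Evaluate a cubic B-spline basis on predictor values x. Knots are spread
    evenly over the range of x, with repeated boundary knots so that the basis
    functions sum to 1 inside the range (partition of unity)."""
    n_interior = n_splines - degree - 1
    interior = np.linspace(x.min(), x.max(), n_interior + 2)[1:-1]
    knots = np.r_[[x.min()] * (degree + 1), interior, [x.max()] * (degree + 1)]
    B = np.zeros((len(x), n_splines))
    for j in range(n_splines):
        coefs = np.zeros(n_splines)
        coefs[j] = 1.0
        B[:, j] = BSpline(knots, coefs, degree, extrapolate=False)(x)
    return np.nan_to_num(B)   # treat out-of-range evaluations as zero

x1 = np.linspace(0, 1, 200)   # e.g., stimulus luminance
x2 = np.linspace(0, 1, 200)   # a second continuous covariate
B1 = bspline_basis(x1, 10)
B2 = bspline_basis(x2, 10)

# tensor-product (2D) spline: one column per pair of 1D basis functions
B2d = np.einsum('ij,ik->ijk', B1, B2).reshape(len(x1), -1)
```

Each row of B1 contains the 10 basis-function values for one observation; the non-linear fit is the weighted sum of these columns, with the weights estimated like any other betas.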

The number of basis functions to use for each non-linear spline term is usually determined by either regularization or cross-validation. Cross-validation is difficult in the case of large EEG datasets (see also Discussion) and we therefore recommend fixing the number of splines (and thus the flexibility of the non-linear predictors) prior to the analysis.

In the previous section, we made use of time-expansion to model the overlap. For this, we used the so-called stick function approach (also referred to as FIR or dummy-coding). That is, each time-point relative to an event (local time) was coded using a 1 or 0 (in the case of dummy-coded variables), resulting in staircase patterns in the design matrix.

We will discuss two examples here: the time-Fourier set and the time-spline set.
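To make the idea concrete, here is a small Python sketch (illustrative only, not toolbox code) of a truncated Fourier set over local time. In the time expansion, the event train is then convolved with each basis function instead of with individual sticks, so far fewer parameters describe the same estimation window:

```python
import numpy as np

def fourier_time_basis(n_lags, n_freqs):
    """Truncated Fourier set over local time: a DC column plus low-frequency
    cosine/sine pairs. Acts as a low-pass smoother on the estimated rERP."""
    tau = np.arange(n_lags)
    cols = [np.ones(n_lags)]
    for f in range(1, n_freqs + 1):
        cols.append(np.cos(2 * np.pi * f * tau / n_lags))
        cols.append(np.sin(2 * np.pi * f * tau / n_lags))
    return np.column_stack(cols)              # shape: (n_lags, 2*n_freqs + 1)

def expand_with_basis(latencies, n_samples, basis):
    """Time expansion with a temporal basis: the event train is convolved
    with each basis function, giving one design-matrix column per function."""
    train = np.zeros(n_samples)
    train[latencies] = 1.0
    cols = [np.convolve(train, basis[:, j])[:n_samples]
            for j in range(basis.shape[1])]
    return np.column_stack(cols)

basis = fourier_time_basis(n_lags=60, n_freqs=10)   # 21 parameters instead of 60
rng = np.random.default_rng(0)
lat = np.sort(rng.choice(np.arange(0, 1900), size=30, replace=False))
X = expand_with_basis(lat, 2000, basis)
# the estimated rERP over local time is then basis @ beta (one beta per column)
```

The columns of this Fourier set are mutually orthogonal over the estimation window, which keeps the reduced model well-conditioned.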

To our knowledge, no existing toolbox supports non-linear, spline-based generalized additive modeling of EEG activity. We were also missing a toolbox that focuses solely on deconvolving signals and allows for a simple specification of the model to be fitted, for example using the common Wilkinson notation (as used, e.g., in R and MATLAB).

A few other existing EEG toolboxes allow for deconvolution analyses, but we found that each has their limitations. Plain linear modeling (including second-level analyses) can be performed using the LIMO toolbox.

In the following, we describe basic and advanced features available in the unfold toolbox.

The first step is to load a continuous EEG dataset into EEGLAB. This dataset should already contain event markers (e.g., for stimulus onsets, button presses, etc.). Afterwards, there are four main analysis steps that can be executed with a few lines of code (see also Box 1). These steps, highlighted in blue, are: (1) define the model formula and let the toolbox generate the corresponding design matrix; (2) time-expand this design matrix; (3) fit the model to the continuous data to obtain the betas; and (4) convert the betas into regression-ERP waveforms for plotting and statistics.

As a start, we need a data structure in EEGLAB format (an EEG structure) containing the continuous data and the event markers.

We begin the modeling process by specifying the model formula and by generating the corresponding design matrix X.

More generally, we can also specify more complex formulas, such as: y ~ 1 + cat(is_face) + spl(luminance, 10).

Here, cat() marks a categorical (factor) predictor, and spl() requests a non-linear spline predictor with the given number of basis functions.

Once the formula is defined, the design matrix is time-expanded to X_{dc} and now spans the duration of the entire EEG recording. Subsequently, for each channel, the equation EEG = X_{dc}·β is solved for the betas β.

In the same linear model, we can simultaneously model brain responses evoked by other experimental events, such as button presses. Each of these other event types can be modeled by its own formula. In our face/house example, we would want to model the response-related ERP that is elicited by the button press at the end of the trial, because this ERP will otherwise overlap to a varying degree with the stimulus-ERP. We do this by defining an additional simple intercept model for all button press events. In this way, the ERP evoked by button presses will be removed from the estimation of the stimulus ERPs. The complete model would then consist of two formulas: the stimulus model described above, plus an intercept-only model (y ~ 1) for the button-press events.
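This two-formula model can be mimicked in a small Python simulation (illustrative only; names, shapes, and latencies are invented, and the actual toolbox is a MATLAB package). A stimulus response with an additive face effect overlaps with a button-press response at a jittered reaction-time latency, and one joint least-squares fit separates all three waveforms:

```python
import numpy as np

rng = np.random.default_rng(7)
n, w = 4000, 40                       # samples; length of each response window
erp_base  = np.hanning(w)             # intercept: response shared by both conditions
erp_face  = 0.5 * np.hanning(w) ** 2  # additional effect for faces (dummy coding)
erp_press = -np.hanning(w)            # button-press response

stim = np.sort(rng.choice(np.arange(100, n - 200), size=50, replace=False))
is_face = rng.integers(0, 2, size=stim.size)
press = stim + rng.integers(30, 80, size=stim.size)   # jittered RTs -> varying overlap

eeg = np.zeros(n)
for t0, f in zip(stim, is_face):
    eeg[t0:t0 + w] += erp_base + f * erp_face
for t0 in press:
    eeg[t0:t0 + w] += erp_press

def expand(latencies, values, n_samples, w):
    """Stick-function time expansion; `values` scales the sticks
    (all ones for an intercept, 0/1 for a dummy-coded condition)."""
    X = np.zeros((n_samples, w))
    taus = np.arange(w)
    for t0, v in zip(latencies, values):
        X[t0 + taus, taus] += v
    return X

# model: stimulus ~ 1 + is_face,  buttonpress ~ 1
X_dc = np.hstack([
    expand(stim, np.ones_like(stim), n, w),    # stimulus intercept
    expand(stim, is_face, n, w),               # face effect
    expand(press, np.ones_like(press), n, w),  # button-press intercept
])
b, *_ = np.linalg.lstsq(X_dc, eeg, rcond=None)
rerp_stim, rerp_face, rerp_press = b[:w], b[w:2 * w], b[2 * w:]
```

Because the reaction times vary from trial to trial, the stimulus- and response-related columns are not collinear, and the button-press activity no longer contaminates the stimulus estimates.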

As explained earlier, many influences on the EEG are not strictly linear. In addition to linear terms, one can therefore use cubic B-splines to perform spline regression, an approach commonly summarized under the name GAM. An illustration of this approach is provided in

(A) Example of a non-linear relationship between a predictor (e.g., stimulus luminance) and a dependent variable (e.g., EEG amplitude). A linear function (black line) does not fit the data well. We will follow one luminance value (dashed line) at which the linear function is evaluated (red dot). (B) Instead of a linear fit, we define a set of overlapping spline functions which are distributed across the range of the predictor. In this example, we are using a set of six B-splines. For our luminance value, we obtain six new predictor values; only three of them are non-zero. (C) We weight each spline with its respective estimated beta value. To predict the dependent variable (EEG amplitude) at our luminance value (dashed line), we sum up the weighted spline functions (red dots). Because the splines are overlapping, this produces a smooth, non-linear fit to the observed data.

With this formula, the effect “A” would be modeled by a basis set consisting of five splines. We would also fit a 2D spline between continuous variables B and C with five splines each. In addition, we would fit a circular spline based on covariate D using five splines, with 0° and 360° as the wrapping points of the circular spline.

A 2D spline between two predictors with n basis functions each results in n^{2} final predictors. For cyclical predictors such as the angle of a saccadic eye movement (which ranges, e.g., from 0 to 2π), it is possible to use cyclical B-splines, as explained above. These are implemented based on third-party open-source code.

In our default B-spline implementation, the so-called knots, which determine the placement of the basis functions, are placed at quantiles of the predictor.

Temporal basis functions were introduced earlier. The stick-function approach, as also illustrated above, estimates one parameter for each local time point relative to the event.

Effect of using different temporal basis functions on the recovery of the original signal using deconvolution. (A–C) show three different example signals without deconvolution (in black) and after deconvolution using different methods for the time expansion (stick, Fourier, spline). We zero-padded the original signal to be able to show boundary artifacts. For the analysis we used 45 time-splines and, in order to keep the number of parameters equivalent, the first 22 cosine and sine functions of the Fourier set. The smoothing effect of using a temporal basis set is best seen in the difference between the blue curve and the orange/red curves in (D). Artifacts introduced by the temporal basis set are highlighted with arrows and are best seen in (E) and (F). Note that in the case of realistic EEG data, the signal is typically smooth, so ripples as in (E) rarely occur. (G) The impulse-response spectrum of the different smoothers. Clearly, the Fourier set filters better than the splines, but splines allow for a sparser description of the data, which can be advantageous during model fitting.

If a predictor has a missing value in massive univariate regression models, it is typically necessary to remove the whole trial. One workaround for this practical problem is to impute (i.e., interpolate) the value of missing predictors. In the deconvolution case, imputation is even more important for a reliable model fit, because if a whole event is removed, then overlapping activity from this event with that of the neighboring events would not be accounted for. The unfold toolbox therefore supports imputation of missing predictor values.
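As a minimal illustration of the idea (plain Python, not toolbox code), marginal-mean imputation keeps the event in the model while assigning it a neutral predictor value:

```python
import numpy as np

def impute_marginal_mean(values):
    """Replace missing (NaN) predictor values by the mean of the observed ones,
    so the event itself stays in the deconvolution model and its overlap with
    neighboring events is still accounted for."""
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    filled[np.isnan(values)] = np.nanmean(values)
    return filled

# hypothetical luminance values with two missing entries
lum = np.array([0.2, np.nan, 0.4, 0.6, np.nan])
lum_imputed = impute_marginal_mean(lum)   # NaNs -> 0.4 (mean of observed values)
```

Imputing by the marginal mean leaves the intercept-related part of the response unchanged while avoiding the removal of the whole event.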

Linear deconvolution needs to be performed on continuous, rather than epoched data. This creates challenges with regard to the treatment of intervals that contain EEG artifacts. The way to handle artifacts in a linear deconvolution model is therefore to detect—but not to remove—the intervals containing artifacts in the continuous dataset. For these contaminated intervals, the time-expanded design matrix (X_{dc}) is then blanked out, that is, filled with zeros, so that the artifacts do not affect the model estimation.

Of course, this requires the researcher to use methods for artifact correction that can be applied to continuous rather than segmented data (such as ICA). Similarly, we need methods that can detect residual artifacts in the continuous rather than epoched EEG. One example would be a peak-to-peak voltage threshold that is applied within a moving time window (shifted step-by-step across the entire recording). Whenever the peak-to-peak voltage within the window exceeds a given threshold, the corresponding interval would then be blanked out in the design matrix. Detecting artifacts in the continuous rather than segmented EEG also has a small additional benefit: if the data of a trial is only partially contaminated, the clean parts can still enter the model estimation.
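The moving-window detector and design-matrix blanking described here can be sketched as follows (an illustrative Python stand-in; the window length and threshold are arbitrary):

```python
import numpy as np

def detect_ptp_artifacts(eeg, win, thresh):
    """Moving-window peak-to-peak threshold on the continuous recording;
    returns a boolean mask marking contaminated samples."""
    bad = np.zeros(len(eeg), dtype=bool)
    for start in range(len(eeg) - win + 1):
        seg = eeg[start:start + win]
        if seg.max() - seg.min() > thresh:
            bad[start:start + win] = True
    return bad

def blank_design_rows(X_dc, bad):
    """Fill the rows of the time-expanded design matrix that fall into
    artifact intervals with zeros, so they no longer constrain the fit,
    while clean samples of the same trial are kept."""
    X = X_dc.copy()
    X[bad, :] = 0.0
    return X

eeg = np.zeros(500)
eeg[200:210] = 100.0                      # simulated large-amplitude artifact
bad = detect_ptp_artifacts(eeg, win=20, thresh=50.0)
X_dc = np.ones((500, 8))                  # placeholder time-expanded design matrix
X_clean = blank_design_rows(X_dc, bad)
```

Note that only the design matrix is zeroed, not the EEG itself: samples with all-zero predictor rows simply contribute a constant residual and no longer influence the betas.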

The time-expanded design matrix X_{dc} is very large, but it is also extremely sparse, since the vast majority of its entries are zero.

Solving for the betas is not always easy in such a large model. We offer several algorithms to solve for the parameters. The currently recommended one is LSMR, an iterative algorithm for large, sparse least-squares problems.

However, especially with data containing a high level of noise, the tradeoff between bias and variance (i.e., between under- and overfitting) might be suboptimal, meaning that the parameters estimated from the data might be only weakly predictive for held-out data (i.e., they show a high variance and tend to overfit the data). Regularization is one way to prevent overfitting of parameter estimates. In short, regularization introduces a penalty term for high beta values, effectively finding a trade-off between overfit and out-of-sample prediction. The unfold toolbox therefore also offers regularized estimation.
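One simple form of such a penalty is ridge-style shrinkage, which the iterative LSMR solver supports directly via a damping term. The following SciPy sketch (an illustration of the principle, not the toolbox's regularization code; all data are synthetic) shows how damping shrinks the estimated betas:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import lsmr

rng = np.random.default_rng(3)
n, p = 1000, 50
X = csr_matrix(rng.standard_normal((n, p)))   # sparse storage, as for X_dc
true_b = np.zeros(p)
true_b[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]      # only a few predictors matter
y = X @ true_b + rng.standard_normal(n) * 5.0  # noisy continuous signal

b_ols   = lsmr(X, y)[0]             # unregularized least squares
b_ridge = lsmr(X, y, damp=20.0)[0]  # minimizes ||y - Xb||^2 + damp^2 * ||b||^2
```

The damped solution trades a small bias for a reduction in variance; its coefficient vector has a markedly smaller norm than the unregularized estimate.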

Effects of regularization on deconvolving noisy data. Results of regularization are shown both for a model with stick functions and for a model with a temporal spline basis set. (A) To create an overlapped EEG signal, we convolved 38 instances of the original signal depicted in (A). The effect of a continuous covariate was randomly added to each event (see different colors in A). To make the data noisy, we added Gaussian white noise with a standard deviation of 1. Finally, to illustrate the power of regularization, we also added another random covariate to the model. This covariate had no relation to the EEG signal but was highly correlated with the covariate of interest.

Many researchers use source reconstruction (e.g., MNE, LORETA) or blind source separation methods (e.g., ICA) to try to isolate the signal contributions of individual neural sources. In our framework, this can be understood as applying a spatial filter to the data before the deconvolution step.

The unfold toolbox also provides functions to visualize the estimated regression-ERPs.

Shown are some of the figures currently produced by the unfold toolbox's plotting functions.

For further documentation and interactive tutorials, visit the toolbox website.

In this section, we validate the unfold toolbox on simulated data and on a real dataset.

To create simulated data, we produced overlapped data using four different response shapes, shown in the first column of the figure below.

Four types of responses (first column: box car, Dirac function, auditory ERP, pink noise) were convolved with random event latencies (second column). A section of the resulting overlapped signal is shown in the third column. The fourth column shows the deconvolved response recovered by the unfold toolbox.

Finally, we will also analyze a real dataset from a single participant who performed a standard ERP face discrimination experiment^{1}.

^{1}The same example data was also analyzed in our accompanying paper.

Although a central fixation cross was presented prior to each trial and participants were instructed to avoid eye movements during the task, concurrent video-based eye-tracking revealed that participants executed at least one involuntary (micro)saccade during the vast majority of trials.

This means that our stimulus-locked ERPs are contaminated with two other processes: visually-evoked potentials (lambda waves) generated by the retinal image motion produced by the (micro)saccades, and the motor-related activity from the manual button press.

To disentangle these potentials with the unfold toolbox, we included the saccade onsets and button presses as additional events in the model.

(A) Panel adapted from

Human behavior in natural environments is characterized by complex motor actions and quasi-continuous, multisensory stimulation. Brain signals recorded under such conditions are characterized by overlapping activity evoked by different processes and typically also influenced by a host of confounding variables that are difficult or impossible to orthogonalize under quasi-experimental conditions. However, even in traditional, highly controlled laboratory experiments, it is often unrealistic to match all stimulus properties between conditions, in particular if the stimuli are high-dimensional, such as words (e.g., word length, lexical frequency, orthographic neighborhood size, semantic richness, number of meanings, etc.) or faces (e.g., luminance, contrast, power spectrum, size, gender, age, facial expression, familiarity, etc.). In addition, as we demonstrate here, even simple EEG experiments often contain overlapping neural responses from multiple different processes such as stimulus onsets, eye movements, or button presses. Deconvolution modeling allows us to disentangle and isolate these different influences to improve our understanding of the data.

In this article, we presented the unfold toolbox and demonstrated its use on simulated and real data.

Linear deconvolution can be applied to many types of paradigms and data. As shown above, one application is to separate stimulus- and response-related components in traditional ERP studies.

It is also possible to add time-continuous signals as predictors to the design matrix.

A fundamental assumption of traditional ERP averaging is that the shape of the underlying neural response is identical in all trials belonging to the same condition. Trials with short and long manual RTs are therefore usually averaged together. Similarly, with linear deconvolution modeling, we assume that the brain response is the same for all events of a given type. However, as in traditional ERP analyses, we also assume that the neural response is independent of the interval between two subsequent events (e.g., the interval between a stimulus and a manual response). This is probably a simplification, since neural activity will likely differ between trials with a slow or fast reaction.

A related assumption concerns sequences of events: processing one stimulus can change the processing of a following stimulus, for instance due to adaptation, priming, or attentional effects. We want to note that if such sequential effects occur often enough in an experiment, they can be explicitly modeled; for example, one could add additional predictors coding whether a stimulus is repeated or not, or whether it occurred early or late in a sequence of stimuli. We hope that the unfold toolbox will make it easier to model such sequential effects.

Non-linear predictors can have considerable advantages over linear predictors. However, one issue that is currently unresolved is how to select an appropriate number of spline functions to model a non-linear effect without under- or overfitting the data. While automatic selection methods exist (e.g., based on generalized cross-validation), they are computationally demanding for large, continuous EEG datasets.

While all example analyses presented here were conducted in the time domain, it is also possible to model and deconvolve overlapping time-frequency representations with the toolbox.

In a traditional ERP analysis, the researcher has to set numerous analysis parameters, which will influence the final ERP results (e.g., filter settings, epoch length, baseline interval). Similarly, with deconvolution we have to make these and additional choices. Because the deconvolution approach is still in its infancy, there are not yet clearly established best practices for all of the necessary settings. However, in the following, we discuss some basic and advanced parameters:

Basic settings/options

Advanced settings/options

By default, the spline knots are placed at the (n_{splines}−2) quantiles of the predictor, so that each basis function covers approximately the same amount of data. Other placements are possible, and users can specify their own knot placements if necessary.
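As a sketch, quantile-based knot placement could look like this in Python (illustrative only; the number of interior knots differs slightly from the toolbox's default scheme):

```python
import numpy as np

def quantile_knots(x, n_splines, degree=3):
    """Place interior spline knots at quantiles of the observed predictor,
    so that each basis function is supported by roughly equal amounts of data.
    Boundary knots are repeated (degree + 1) times, as usual for B-splines."""
    n_interior = max(n_splines - degree - 1, 0)
    probs = np.linspace(0, 1, n_interior + 2)[1:-1]
    interior = np.quantile(x, probs)
    return np.r_[[x.min()] * (degree + 1), interior, [x.max()] * (degree + 1)]

# skewed predictor (e.g., saccade amplitudes): knots cluster at small values
x = np.random.default_rng(0).exponential(scale=1.0, size=5000)
knots = quantile_knots(x, n_splines=10)
```

For a skewed predictor, quantile placement puts more basis functions where the data are dense, instead of wasting flexibility on the sparsely sampled tail.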

By default, stick functions are used for the time expansion of X_{dc}. As described above, it is also possible to use a set of temporal basis functions (time-splines or a Fourier set) instead.

In recent years, linear mixed-effects models (LMMs) have become increasingly popular for the analysis of EEG data. An integration of linear deconvolution with mixed-effects modeling is an interesting direction for future development.

Finally, it is also possible to model other types of overlapping psychophysiological signals with the unfold toolbox, for example pupil dilation.

In summary, the unfold toolbox provides an integrated framework for overlap correction, non-linear (spline) modeling, and regression-based analysis of EEG data.

The authors declare that they have no competing interests.

The following information was supplied relating to ethical approvals (i.e., approving body and any reference numbers):

For this methods paper, we reanalyzed data of a single subject from an ERP dataset that was previously published elsewhere.

The following information was supplied regarding data availability:

Data is available at the Open Science Framework.