Masked timesteps in spiking neural networks for efficient time series classification

Shiyong Geng; Changhao Sun; Yongping Dan; Zhanyang Zhang; Hengyi Li; Lin Meng

doi:10.7717/peerj-cs.3460

Masked timesteps in spiking neural networks for efficient time series classification

Shiyong Geng¹, Changhao Sun¹, Yongping Dan ¹, Zhanyang Zhang¹, Hengyi Li^2,3, Lin Meng⁴

1School of Integrated Circuits, Zhongyuan University of Technology, Zhengzhou, Henan, China

2School of Automation and Electrical Engineering, Zhongyuan University of Technology, Zhengzhou, Henan, China

3Research Organization of Science and Technology, Ritsumeikan University, Kusatsu, Shiga, Japan

4College of Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan

DOI: 10.7717/peerj-cs.3460

Published: 2026-01-29
Accepted: 2025-11-18
Received: 2025-02-18

Academic Editor: Giulia Cisotto

Subject Areas: Data Mining and Machine Learning, Data Science, Neural Networks
Keywords: Spiking neural network, Univariate time series classification, Direct encoding

Copyright: © 2026 Geng et al.
Licence: This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.

Cite this article: Geng S, Sun C, Dan Y, Zhang Z, Li H, Meng L. 2026. Masked timesteps in spiking neural networks for efficient time series classification. PeerJ Computer Science 12:e3460 https://doi.org/10.7717/peerj-cs.3460

The authors have chosen to make the review history of this article public.

Abstract

Spiking neural networks (SNN) accurately emulate the neurodynamics of biological neurons, theoretically enhancing the processing capabilities for time series data by using spike trains. Existing research, however, has largely remained in an exploratory stage, focusing primarily on the potential of SNNs in time series forecasting rather than classification. Current research on Time Series Classification (TSC) using SNNs often employs complex, pre-trained encoding schemes, relies on reservoir computing, or depends merely on spiking neurons for extracting time series information without fully integrating spiking neurons into the network architecture. Presently, there is a significant gap in SNN research regarding a universally applicable methodology for TSC that balances low temporal complexity with effective, straightforward implementation, trained from scratch, and robust biological plausibility. This article introduces the Masked Timestep SNN (MT-SNN) architecture to selectively mask low neuronal activity timesteps, thereby addressing the entrenched issue of temporal redundancy in SNNs. In addition, our research validates the effectiveness of the direct encoding strategy in SNNs for TSC and proposes the Temporal Adaptive Integrate-and-Fire (TAIF) neuron model, which improves its temporal dynamics through mechanisms of threshold adjustment and fatigue. Our experiments on the UCR Time Series Classification Archive indicate that our approach achieves performance comparable or superior to traditional machine learning and deep learning methods at ultra-low timesteps, obviating the necessity for specialized encoding schemes, preprocessing, or feature engineering. To the best of our knowledge, our approach has achieved the State-of-The-Art (SoTA) spiking result in univariate TSC tasks while maintaining simplicity and biological plausibility, offering a valuable and comprehensive spiking baseline.

Introduction

Time Series Classification (TSC) constitutes a critical area of focus within data mining research, with wide-ranging applications in finance, biomedical signal processing (Gong et al., 2023), and industrial monitoring. The main challenge in TSC is effectively identifying and utilizing significant features in time series data. Traditional approaches, including Dynamic Time Warping (DTW) (Keogh & Ratanamahatana, 2005), Bag-of-SFA-Symbols (BOSS) (Schäfer, 2015), the Collective of Transformation-based Ensembles (COTE) (Bagnall et al., 2015), along with Time Series Bag-of-Features (TSBF) (Baydogan, Runger & Tuv, 2013), serve as foundational techniques. Nevertheless, these methodologies still face difficulties related to high computational costs, lengthy processing durations, complexities in parameter fine-tuning, restricted scalability, and inadequate noise robustness (Bagnall et al., 2017). With the rise of Deep Learning (DL) techniques, DL-based TSC methods address the shortcomings of traditional methods (Ren et al., 2024), classified into generative and discriminative approaches (Fawaz, 2020). Generative models, represented by Autoencoders and Echo State Networks (ESN), contrast with discriminative models, which are subdivided into feature engineering and end-to-end approaches. The latter discriminative methods include Multilayer Perceptrons (MLP), Convolutional Neural Networks (CNN), and hybrid methods. The automated feature extraction capability of DL-based TSC models has simplified data processing tasks compared to traditional TSC methods. However, conventional DL methods for TSC tasks, exemplified by Artificial Neural Networks (ANN), particularly CNNs, confront difficulties involving poor interpretability, high energy consumption, and substantial training data requirements (Li et al., 2023a; Thompson et al., 2021; Zhang et al., 2021; Li et al., 2023b). These constraints limit their adaptability in scenarios where decision transparency and edge computing are crucial.

To overcome the limitations of traditional DL methods in TSC, Spiking Neural Networks (SNN) not only signify a paradigm shift towards systems that mimic biological neural systems but also present a significant advance in terms of their superior biological interpretability and event-driven capabilities (Roy, Jaiswal & Panda, 2019; Deng et al., 2020; Faghihi, Cai & Moustafa, 2022). Notably renowned for these features, SNNs provide a robust alternative with energy efficiency advantages and have been successfully implemented on various neuromorphic hardware platforms such as TrueNorth, Loihi, and SpiNNaker (Akopyan et al., 2015; Davies et al., 2018; Mayr, Hoeppner & Furber, 2019). Additionally, SNNs using unsupervised learning strategies have shown potential in few-shot learning (Yang, Linares-Barranco & Chen, 2022), as they are able to extract discriminative features even with very limited labeled data (Dong et al., 2023). These features make SNNs an ideal solution for TSC tasks.

Currently, SNNs for Time Series Classification (TSC-SNN) architecture have not been widely adopted because spikes are non-differentiable, hindering the use of standard backpropagation algorithms (Wang, Lin & Dang, 2020). Moreover, whether to use spike encoding, the choice of specific encoding strategies, the size of the time window, and the design of the network architecture represent pressing issues that current research must resolve.

Existing TSC-SNN research either relies on complex encoding schemes, necessitates pre-trained encoders or particular data preprocessing, and requires massive parameter tuning to find optimal configurations or extract features by convolutional layers instead of spiking neurons. These methods, which either increase complexity or rely heavily on CNNs, contradict the principles of energy efficiency and biological plausibility.

This article proposes the Masked Timestep Spiking Neural Network (MT-SNN) architecture for univariate TSC tasks. Firstly, we introduce a residual structure with variable timesteps into the TSC-SNN. Compared to previous works and other encoding methods, our approach using direct encoding features ultra-low and variable timesteps, resulting in significant energy efficiency and a reduced need for complex parameter tuning, requiring only the essential timestep settings. Based on variations in spike firing rates within these timesteps, this mechanism selects the timesteps with the highest excitability and transfers them to the subsequent spiking residual block, effectively implementing a form of time-shrunk residual processing. Furthermore, we developed the Temporal Adaptive Integrate-and-Fire (TAIF) model, which adaptively adjusts thresholds based on the current timestep and incorporates a fatigue mechanism to prevent excessive spike firing.

Our architecture has been rigorously evaluated on the UCR Time Series Classification Archive. The results show that the MT-SNN performs better than traditional and DL models in univariate TSC tasks and offers considerable advantages in computational cost and energy consumption by using significantly reduced timesteps. More importantly, the MT-SNN architecture achieves these results using unified TAIF hyperparameter and low timestep settings, without heavy preprocessing, parameter tuning, or special encoding. This advancement in TSC-SNN models makes it a viable option in scenarios where computational resources are limited. The reduced-timestep residual structure of the SNN integrates energy efficiency, adaptability, and high performance, providing a robust baseline for TSC-SNN. To our knowledge, the MT-SNN achieves State-of-The-Art (SoTA) results in TSC-SNN with superior spike rationality.

In this article, we outline our contributions below:

1.

Prove that direct encoding, unlike the previous TSC-SNN models that relied on specific encoding, possesses efficiency and simplicity.
2.

Propose a novel Residual SNN architecture with variable timesteps, the MT-SNN.
3.

Introduce a spiking neuron model with a fatigue mechanism, the TAIF, tailored for TSC tasks.
4.

Conduct a comprehensive evaluation of our model on 49 univariate UCR Time Series Classification Archive datasets, achieving SoTA results across all TSC-SNN benchmarks with ultra-low timesteps (up to 5).

The article is organized as follows. ‘Related Work’ provides an exhaustive introduction to previous TSC-SNN methods. ‘Method’ details the specifics of the MT-SNN and TAIF, demonstrating how the masked timesteps and the TAIF fatigue mechanism are implemented. ‘Experiments’ presents detailed experimental results and comparative analyses. ‘Conclusion’ concludes the article and discusses future work.

Related work

This section provides a full review of the research, focusing on two major aspects related to our work: the application of Spiking Neural Networks for Time Series Classification and the existing issues of Temporal Redundancy in SNNs.

Spiking neural networks for time series classification

Regarded as the third generation of neural networks (Maass, 1997), SNNs differ from traditional ML and DL models by emphasizing biological plausibility and energy efficiency. Furthermore, the intrinsic ability of spiking neurons to capture temporal dependencies makes SNNs particularly well-suited for TSC applications in embedded systems, wearable devices, and remote monitoring.

Current methodologies for TSC-SNN are divided into two categories, as illustrated in Fig. 1: reservoir-based methods and deep learning-based methods. The reservoir-based methods include Reservoir Computing (RC) and Echo State Networks (ESN). The deep learning-based methods are further divided into ANN2SNN (ANN to SNN) conversions (Rueckauer et al., 2017) and models trained from scratch (Wu et al., 2019; Sam et al., 2023).

Figure 1: Current methods in TSC-SNNs.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-1

Reservoir computing, specifically ESN, which uses a particular Recurrent Neural Network (RNN) as the reservoir, simplifies training by requiring only adjustments to the weights from the reservoir to the output layer. The dynamic properties of the reservoir efficiently capture the temporal dependencies of the input series. Peng et al. (2024) developed two RC variants for multivariate TSC tasks. Dey et al. (2022) proposed a spike-based reservoir architecture for TSC with two spike-encoding methods to classify vibration time series datasets, and demonstrated that temporal encoding is more effective for TSC than rate encoding. Zhang, Zhu & Li (2018) proposed the Spiking Echo State CNN (SES-CNN), which uses a specialized temporal encoding technique to transform time series data into multichannel spike train signals. The encoded features are extracted using the reservoir computing approach with spiking neurons. The spikes are then converted into 2D mapping images, facilitating classification through a CNN (Zhang, Zhu & Li, 2018).

Although RC, particularly ESN, has advantages in scenarios that are not too complex and require adaptive training, advances in deep learning have overshadowed some of the benefits that RC once offered.

An alternative method to TSC-SNNs uses deep learning techniques, including models converted from ANN as well as those trained from scratch. Yan, Zhou & Wong (2021) introduced a method for energy-efficient ECG classification using an ANN2SNN-based TSC-SNN model. However, the ANN2SNN method requires prolonged timesteps for conversion and imposes constraints such as setting biases to zero and absorbing batch normalization layers (Su et al., 2023). Moreover, the information extraction process is still conducted by well-trained ANNs, which contradicts the biological plausibility of SNNs.

To address this issue, directly-trained SNNs have gained popularity. Fang, Shrestha & Qiu (2020) first applied SNNs to multivariate TSC tasks. They proposed an encoding scheme that converts time series into sparse spatio-temporal spike patterns. Then an iterative SNN model was derived from the Spike Response Model (SRM), enabling the use of Back Propagation Through Time (BPTT) within SNNs (Fang, Shrestha & Qiu, 2020). Gautam & Singh (2020) proposed a hybrid CSNN architecture that uses a differentiable spike generated by the Soft-LIF model to facilitate backpropagation. However, using Soft-LIF compromises the most distinctive feature of SNNs, the binary spike. Gaurav, Stewart & Yi (2023) proposed two TSC-SNN models, inspired by the theories of reservoir computing networks and Legendre Memory Units: one based on RC and another trained using surrogate gradient methods. These models demonstrate the potential of SNNs in temporal data for performance and energy efficiency by nonlinearly decoding the linearly extracted features through spiking neurons (Gaurav, Stewart & Yi, 2023).

Though previous TSC-SNN models have shown some progress, they still encounter certain deficiencies. Some models rely heavily on convolutional layers for feature extraction and consist of only single spike neuron layers, without fully replacing all ReLU units with spiking neurons. Others require complex preprocessing of the original data or the use of sophisticated pre-training encoders. Moreover, some models need multiple sets of reservoir parameters, show limited performance, or have been tested only on specific datasets.

Temporal redundancy in spiking neural networks

Recent studies have pointed out the temporal redundancy in SNNs. Kim et al. (2023) discovered a phenomenon called Temporal Information Concentration (TIC) during the training of SNNs, which indicates the concentration of key information in the early timesteps. As training stabilizes, the sensitivity of the model to new information decreases, leading to great temporal redundancy in later training stages (Kim et al., 2023). Li, Jones & Furber (2023) employed Dynamic Confidence to optimize the latency of inference by dynamically determining when to terminate inference, thereby reducing timesteps.

Several studies are dedicated to overcoming temporal redundancy by reducing timesteps. Li et al. (2024) proposed the Spiking Early-Exit Neural Networks (SEENN), treating the number of timesteps as a variable conditioned on the input samples and using two methods to determine the exit timestep. They also proposed Dynamic Timestep SNN (DT-SNN), which dynamically determines the number of timesteps during inference (Li et al., 2023c). Ding et al. (2024) proposed Shrinking SNN (SSNN), which divides the SNN into multiple stages with progressively shrinking timesteps, and uses a temporal transformer to transform temporal scale of information. Chowdhury, Rathi & Roy (2021b) proposed an Iterative Initialization and Retraining method for SNNs (IIR-SNN) to perform single-shot inference along the temporal axis as the network gradually shrinks in the temporal domain. In particular, they also introduced spatial and temporal pruning and quantization of SNNs, with temporal pruning performed by gradually reducing simulation timesteps during training (Chowdhury, Garg & Roy, 2021a). Suetake et al. (2023) proposed a single-step SNN neuron model, demonstrating its practicality in static image classification using the surrogate gradient method.

The studies above indicate that reducing the number of timesteps effectively optimizes temporal redundancy, indicating considerable potential improvements in edge scenarios that require low latency for TSC. Based on these findings, we propose the MT-SNN architecture, which selectively masks timesteps to ensure that the network efficiently processes temporal information at various stages of training.

Method

In this section, we outline the architecture of MT-SNN, detail the encoding methods employed, and describe the specific network layers. Next, we introduce the implementation of the Masked Timestep, followed by a discussion of the TAIF neuron.

Network structure

Encoding methods

In the context of TSC tasks, a primary challenge is to encode complex and variable time series into spike trains using SNNs without incurring information loss. Rate encoding can approximate the original information without significant loss only in very large timesteps (Kim et al., 2018), but excessive inference time reduces energy efficiency, conflicting with the requirements for low power consumption in TSC tasks. Similarly, latency and temporal encoders face challenges related to parameter tuning or the need for additional pretraining (Chen et al., 2011). Poisson encoding introduces stochasticity by converting input intensities into spike probabilities, which will reduce reproducibility and incur additional computational cost (Rathi & Roy, 2021). Significantly, these methods typically do not achieve high performance. Due to its simplicity, direct encoding has been widely used in tasks such as image recognition (Kim et al., 2022), maintaining the integrity of the original sequence without extensive preprocessing. As one of the most effective methods for classification tasks, direct encoding can even achieve a performance comparable to ANNs (Wu et al., 2019; Zheng et al., 2021). Therefore, in the MT-SNN architecture, direct encoding is used, as illustrated in Fig. 2.

Spiking residual block

Using direct encoding, the original time series can be transformed into spike trains suitable for an SNN. After encoding, these spike trains are sent to the remainder of the MT-SNN. Recent research strongly supports the superiority of ResNet architectures for univariate TSC tasks. Therefore, we have incorporated the residual network into the architecture. As demonstrated in previous work, we initially replaced all ReLU units in the ResNet architecture with spiking neurons, obtaining a preliminary Spiking Residual Block structure (Han et al., 2022; Fang et al., 2021; Hu et al., 2021; Sengupta et al., 2019).

Structure of Spiking ResNet

To maximize the combination of the advantages of SNN and residual structures, average pooling, which is more consistent with the principles of SNNs, was selected as the pooling method. The classifier structure involves multiple layers including adaptive average pooling, flatten, dropout, and linear layers with spiking layers to implement the spiking mechanism. The general structure of the basic Spiking Resnet is shown in Fig. 3.

Figure 3: The general structure of the basic residual SNN, obtained by replacing the ReLU with spiking neurons.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-3

Masked timesteps

Although Spiking ResNet, obtained by directly replacing the ReLU activation function, is capable of addressing certain challenges in univariate TSC tasks compared to traditional DL and ML methods, it still encounters temporal redundancy issues. Since spiking neurons are highly correlated with the added time dimension T, in the context of direct encoding, T also represents the number of times the original data are repeated along the time axis. A larger T allows for more refined training but also leads to an increased network depth, resulting in additional computational costs. However, given the wide range of TSC data, selecting an appropriate T for each dataset is not a straightforward task. This is also why we use direct encoding to expand the time dimension rather than directly using the original data length as the time dimension. Inspired by studies on temporal redundancy that indicate that only a subset of the timesteps is significant during the training process of SNNs, we apply a masking technique to the timesteps in the Spiking Residual Blocks, thereby retaining only those timesteps that are important.

Univariate TSC modeling

Generally, the raw time series, denoted as $x = {x_{1}, x_{2}, \dots}$ , comprises a single-channel sequence with a length of L and is associated with a class label $c$ . Univariate TSC tasks involve assigning class labels to sequences of time series data using a supervised classifier. To make $x$ suitable for Spiking Residual Block, the input must be transfromed into $X \in R^{T \times B \times C \times L}$ , where T is the number of timesteps, B is the batch size, C is the number of channels, and L is the length of each univariate time series.

Spiking neuron integration

The spiking neuron layer, $S N (\cdot)$ , is used to replace the ReLU activation in the residual block. Subsequently, the direct encoded input, X, is fed into the Spike Encoder to generate spikes, $S_{t}$ .

This process is outlined as follows:

(1) $S_{t} = S N (B N (C o n v (X))), t = 1, \dots, T .$

Following this, the $S_{t}$ is sent to the remaining parts of Block, obtaining the output $O_{t}$ for each timestep.

Temporal selection

In order to reduce temporal redundancy, we apply the mask M to selectively disable certain unimportant timesteps. The criterion for measuring their contribution is the firing frequency FR. The temporal reduction between Spiking Residual Blocks can be seen in Fig. 4. According to the Hebbian learning rule, a greater number of spikes indicates more substantial learning, as the frequency of these spikes characterizes the intensity of the learning process.

Figure 4: The overview of masked timestep in Spiking Residual Blocks, the more detailed processing is in Algorithm 1.
In the figure, $T_{1} = 5$ and $T_{2} = 3$ as the first 2 timesteps are masked due to low neural activaty.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-4

To calculate FR, we concatenate outputs along the time dimension to obtain the concat output $O = \oplus_{t = 1}^{T} {\tilde{o}}_{t}$ .

Due to the binary nature of spiking neurons, the firing rate (FR) at timestep $t$ is obtained by summing $S_{t}$ across the channel and feature dimensions. This can be expressed as follows:

(2) $F R = \sum_{c, l} O_{t, c, l}, t = 1, \dots, T .$

In this way, $F R_{t}$ is a scalar per timestep and per sample; for a batch of size B, $F R$ forms a matrix of size $B \times T$ . The mask M is then applied to the output of each timestep:

(3) ${\tilde{O}}_{t} = O_{t} ⊙ M, t = 1, \dots, T$ with broadcasting along the $(B, C, L)$ dimensions, where $⊙$ denotes element-wise multiplication.

Afterward, the network retains the timesteps with high firing rates based on the predefined number of timesteps to be kept. The output, ${\tilde{O}}_{t}$ , is then fed into the next Spiking Residual Block.

The general details of applying the masked timestep method are depicted in Algorithm 1.

Algorithm 1 :

Masked timestep processing.

Parameters:

B_i: The i-th Spiking Residual Block,

i = 1, 2, . . ., N

N: Total number of Spiking Residual Blocks

T_i: Number of current timesteps processed by B_i

T_n: Number of next timesteps processed by

B_{i + 1}

M: Binary mask indicating active timesteps

M \in {0, 1}^{T \times B \times C \times L}

FR: Firing rates of the temporal dimension, defined as the average number of spikes per unit time

Input:

x : Input time series

Output:

\hat{y}

: Final classification output

1: Reshape input x to

X \in R^{T \times B \times C \times L}

in order to fit direct encoding

2: Initialize an all True mask M with the same shape as X

3: Set initial T_i as the number of timesteps for the first block

4: for B_i;

i \leq N; i + +

5: Get next number of timesteps T_n for

B_{i + 1}

6: if T_n is None then

7: Process X without temporal masking

8: else

9: Clone current mask M to create

M_{c u r r e n t}

10: for

t = 1, 2, . . ., T_{i}

11: Compute block output O_t using the Spiking Residual Block B_i

12: Apply mask

M_{c u r r e n t}

to O_t

13: end for

14: Compute firing rates FR based on the block outputs

15: Sort timesteps in T_i according to FR

16: Select top T_n timesteps:

i d x_t o p = i d x_s o r t e d [0 \dots T_{n} - 1]

17: Create new mask M such that

M [i d x_t o p, :] = 1

, others = 0

18: end if

19: Pass masked output through avgpooling layer

20: Update T_i with T_n for the next block

21: end for

22: Pass the final output through the classifier layers

23: Average the output y_t across the time dimension to get the final classification output

\hat{y} = \frac{1}{T_{i}} \sum_{t = 1}^{T_{i}} y_{t}

24: Return

\hat{y}

DOI: 10.7717/peerj-cs.3460/table-101

TAIF neuron

In this section, the proposed Temporal Adaptive Integrate-and-Fire (TAIF) neuron is detailed.

Overview of spiking neurons

The introduction of spiking neuron models has been a significant advancement in neural network fields, providing a more accurate representation of the information processing mechanisms in biological neurons. The activities of all spiking neurons can be simply summarized into the following three aspects:

(4) $H_{t} = f (V_{t - 1}, X_{t})$

(5) $S_{t} = Θ (H_{t} - V_{t h r e s h o l d})$

(6) $Θ (x) = {\begin{matrix} 1, & x \geq 0 \\ 0, & x < 0 \end{matrix}$ where $Θ (x)$ is the Heaviside function.

Equations (4) and (5) represent the charge and fire behaviors of neurons, respectively, while reset behavior can be categorized into two types: soft and hard, as shown in Eqs. (7) and (8).

(7) $V_{t} = H_{t} - V_{t h r e s h o l d} \cdot S_{t}$

(8) $V_{t} = H_{t} \cdot (1 - S_{t}) + V_{r e s e t} \cdot S_{t} .$

In the aforementioned equations, the subscript $t$ denotes time, $X_{t}$ denotes input, $H_{t}$ denotes the membrane potential after charging and before the spike fires, and $V_{t}$ denotes the membrane potential after a spike has been fired. $f (V_{t - 1}, X_{t})$ is the function used to update the state of the neuron. In TSC tasks, since the sequences are temporally continuous, the historical states of the neurons should not be reset to $V_{r e s e t}$ . Therefore, the TAIF neuron model uses soft reset.

Training strategy

One major challenge in training SNNs is the non-differentiability of spikes. In this article, we choose to use the surrogate gradient method to directly train SNNs with the BPTT method, using a differentiable surrogate function $g (x)$ to replace $Θ (x)$ during backpropagation (Neftci, Mostafa & Zenke, 2019; Lee, Delbruck & Pfeiffer, 2016). The surrogate gradient technique performs excellently, allowing for easy integration into modern deep learning frameworks and requiring neither a large number of timesteps nor extensive tuning.

In MT-SNN, for all spiking neurons, we approximate the derivative of $Θ (x)$ as

(9) $g^{'} (x) = \frac{α}{2 (1 + {(\frac{π}{2} k x)}^{2})}$ where $g^{'} (x)$ is the derivative of

(10) $g (x) = \frac{1}{π} a r c t a n (\frac{π}{2} k x) + \frac{1}{2} .$

Here, $k$ is a hyperparameter used to control the shape of the function. To prevent vanishing or exploding gradients, we use $k = 2$ , which ensures that the maximum value of the gradient of the surrogate function is 1. The specific shape of the surrogate function is illustrated in Fig. 5.

Figure 5: The surrogate gradient method.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-5

In SNNs, commonly used spiking neurons include IF neurons and Leaky Integrate-and-Fire (LIF) neurons (Abbott, 1999). However, in TSC tasks, the leakage characteristics of LIF neurons often lead to suboptimal performance compared to IF neurons, making the IF neuron more suitable. However, the reduction in time steps in MT-SNN causes standard IF neurons to lack awareness of this change, leading to redundant processing steps. To address this issue, we propose an improved IF neuron, termed the Temporal Adaptive Integrate-and-Fire (TAIF) neuron, inspired by biological neurons. We introduce a fatigue mechanism into IF neurons, making them more prone to fatigue in the early training stages with long timesteps, while more excitable in the later stages of short timesteps. This behavior aligns with the TIC phenomenon and optimizes the temporal information distribution, enhancing their robustness and performance.

The charge function of the TAIF neuron is the same with standard IF neuron:

(11) $\frac{d V_{t}}{d t} = I_{t} .$

In simulation, we use discrete difference equations to approximate continuous differential equations:

(12) $H_{t} = V_{t - 1} + X_{t} .$

To ensure the simplicity of IF neurons, the improvements in TAIF focus on the fire and reset behaviors. First, the TAIF charges according to the charging Eq. (12). Then, it determines whether $H_{t}$ has reached the threshold potential $V_{t h}$ . If $H_{t}$ exceeds $V_{t h}$ , a spike fired and the fatigue value $F_{t}$ increases by $γ$ and without recovery behavior, where $γ$ denotes the fatigue growth factor. If $H_{t}$ does not reach $V_{t h}$ , the neuron does not fire spike, and the fatigue value $F_{t}$ decreases by $δ$ to simulate fatigue recovery behavior, where $δ$ denotes the fatigue recovery rate. The accumulated fatigue value then directly affects the threshold potential $V_{t h}$ . The current timestep T is used as a coefficient for $F_{t}$ , allowing the model to capture temporal changes. The update process of TAIF can be seen in Algorithm 2.

Algorithm 2 :

TAIF neuron update logic.

Input: Output from convolutional and batch normalization layers at timestep t,

X_{c b}^{t}

Output: Spike train

S_{t}

1: Initialize

γ_{t}

and

δ_{t} = 0

2: for

t = 1

to T do

3: Neuron Charge:

4: Update membrane potential H_t based on input

X_{c b}^{t}

5: if

H_{t} \geq V_{t h}

then

6: Neuron Fire:

f_{t} = γ_{t}

8: Fatigue factor update:

F_{t} = F_{t - 1} + f_{t}

9: Neuron Reset:

10:

V_{t} = H_{t} - V_{t h_{t}} \cdot S_{t}

11: else

12: No spike fired

13:

f_{t} = - δ_{t}

14: Decrease fatigue factor F_t and ensure it does not fall below 0:

F_{t} = max (F_{t - 1} + f_{t}, 0)

15: end if

16: Update threshold potential:

V_{t h_{t}} = V_{t h_{t - 1}} + F_{t} \cdot t

17: end for

18: Return output spike train

S_{t}

DOI: 10.7717/peerj-cs.3460/table-102

In this way, we can directly train the spiking neuron layers using the BPTT method.

First, we define the FPTT function as follows:

(13) $S_{1, . . ., T}, V_{1, . . ., T} = F_{f p} (X_{1, . . ., T}, V_{0}, F_{0}) .$

So the backward function is:

(14) $\frac{d L}{d X_{1, \dots, T}}, \frac{d L}{d V_{0}}, \frac{d L}{d F_{0}} = F_{b p} (\frac{\partial L}{\partial S_{1, \dots, T}}, \frac{\partial L}{\partial V_{1, \dots, T}}, \frac{\partial L}{\partial F_{1, \dots, T}}) .$

Based on the TAIF update process, the computation graph is shown in Fig. 6.

Figure 6: The computation graph of TAIF in time dimension, the dotted line represents the situation when the neuron is not firing.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-6

The gradient of TAIF is as follows:

(15) $\frac{d L}{d H_{t}} = \frac{\partial L}{\partial S_{t}} \cdot \frac{d S_{t}}{d H_{t}} + (\frac{\partial L}{\partial V_{t}} + \frac{d L}{d H_{t + 1}}) \cdot \frac{d V_{t}}{d H_{t}}$

(16) $\frac{d L}{d X_{t}} = \frac{d L}{d H_{t}} \cdot \frac{d H_{t}}{d X_{t}} = \frac{d L}{d H_{t}} .$

While $t$ = 0:

(17) $\frac{d L}{d V_{0}} = \frac{d L}{d H_{1}} \cdot \frac{d H_{1}}{d V_{0}} = \frac{d L}{d H_{1}} .$

The gradient of the loss function L to the threshold voltage $V_{t h_{t}}$ at time $t$ is:

(18) $\frac{d L}{d V_{t h_{t}}} = \frac{\partial L}{\partial S_{t}} \cdot \frac{\partial S_{t}}{\partial V_{t h_{t}}} + \frac{\partial L}{\partial V_{t}} \cdot \frac{\partial V_{t}}{\partial V_{t h_{t}}} + \frac{\partial L}{\partial V_{t h_{t + 1}}} \cdot \frac{d V_{t h_{t + 1}}}{d V_{t h_{t}}}$ where:

(19) ${\begin{aligned} \frac{\partial S_{t}}{\partial V_{t h_{t}}} & = Θ^{'} (H_{t} - V_{t h_{t}}) (- 1) \\ \frac{\partial V_{t}}{\partial V_{t h_{t}}} & = - S_{t} \\ \frac{d V_{t h_{t + 1}}}{d V_{t h_{t}}} & = 1 \end{aligned} .$

Therefore, the complete formula is:

(20) $\frac{\partial L}{\partial V_{t h_{t}}} = - \frac{\partial L}{\partial S_{t}} \cdot Θ^{'} (H_{t} - V_{t h_{t}}) - \frac{\partial L}{\partial V_{t}} \cdot S_{t} + \frac{\partial L}{\partial V_{t h_{t + 1}}} .$

So we can calculate the gradients for $F_{t}$ :

(21) $\frac{\partial L}{\partial F_{t}} = \frac{\partial L}{\partial F_{t + 1}} \cdot \frac{d F_{t + 1}}{d F_{t}} + \frac{\partial L}{\partial V_{t h_{t}}} \cdot \frac{d V_{t h_{t}}}{d F_{t}} .$

Experiments

Unlike previous TSC-SNN works that were tested only on specific datasets, we conducted a comprehensive comparison using 49 univariate UCR datasets evaluated in related traditional, DL and TSC-SNN studies. Our model achieved SoTA performance compared to previous SNN models.

Parameter settings

The CSNN architecture consists of three stacked Spiking Residual Blocks followed by a classifier. Each SRB comprises two convolutional layers, batch normalization, and TAIF neuron layers. The output of each SRB is passed through an average pooling layer for downsampling. The detailed settings of the network architecture are in Table 1.

Table 1:

Network parameters.

Layer	Description
$S R B_{1} (T_{3} K_{9})$	SRB with timestep $T_{3}$ and kernel size $k = 9$
$S R B_{2} (T_{3} K_{7})$	SRB with timestep $T_{2}$ and kernel size $k = 7$
$S R B_{3} (T_{1} K_{5})$	SRB with timestep $T_{1}$ and kernel size $k = 5$
$A v g P_{1} (K_{2})$	1D average pooling layer with kernel size $k = 2$
$A d a A v g P (4)$	Adaptive 1D average pooling with output 4
$F L$	Flatten layer
$D P (0.5)$	Dropout layer with probability $p = 0.5$
$L N$	Linear (fully-connected) layer
$I F$	Integrate-and-Fire neuron layer

DOI: 10.7717/peerj-cs.3460/table-1

In all our tests, we used the same network structure to demonstrate the versatility of MT-SNN, and employed the same seed to ensure reproducibility. For optimization, we used Adam Optimizer with CyclicLR Scheduler (Kingma & Ba, 2017). The learning rate for Adam was set to 0.001. To encourage correct firing of neurons while silencing the others, we use MSE as the loss function.

To validate the generalizability of the MT-SNN, we adjusted only the batch size for different datasets, making it the sole hyperparameter that required adjustment, an advantage not achieved in previous works. In practical applications, one only needs to set the desired timestep reduction number. To further demonstrate the effectiveness of the proposed method, we used two timestep reduction settings (maximum timesteps of 5 or 3) in our evaluation. The evaluation results indicate that the MT-SNN achieves remarkable performance even under ultra-low timesteps (3-2-1). Detailed settings can be found in Table 2.

Table 2:

Batch sizes and timestep reductions for different datasets.

Datasets	Classes	Batch	Timestep reduction
Adiac	37	256	5-3-1
Beef	5	16	3-2-1
CBF	3	16	3-2-1
ChlorineCon	3	64	5-3-1
CinCECGTorso	4	40	5-3-1
Coffee	2	28	3-2-1
CricketX	12	256	5-3-1
CricketY	12	256	5-3-1
CricketZ	12	256	5-3-1
DiatomSizeR	4	16	5-3-1
Earthquakes	2	64	5-3-1
ECG200	2	64	5-3-1
ECG5000	5	16	3-2-1
ECGFiveDays	2	64	3-2-1
FaceAll	14	64	3-2-1
FaceFour	4	24	3-2-1
FacesUCR	14	200	5-3-1
FiftyWords	50	256	5-3-1
Fish	7	32	3-2-1
FordA	2	50	5-3-1
FordB	2	50	5-3-1
GunPoint	2	50	3-2-1
Haptics	5	64	3-2-1
InlineSkate	7	16	5-3-1
ItalyPower	2	64	3-2-1
Lightning2	2	32	3-2-1
Lightning7	7	32	3-2-1
MALLAT	8	55	3-2-1
MedicalImages	10	381	3-2-1
MoteStrain	2	20	3-2-1
NonInvThorax1	42	32	3-2-1
NonInvThorax2	42	32	3-2-1
OliveOil	4	30	3-2-1
OSULeaf	6	128	3-2-1
SonyAIBO	2	20	3-2-1
SonyAIBOII	2	20	3-2-1
StarLightCurves	3	64	3-2-1
SwedishLeaf	15	128	3-2-1
Symbols	6	25	3-2-1
SyntheticControl	6	16	3-2-1
Trace	4	16	3-2-1
TwoLeadECG	2	16	3-2-1
TwoPatterns	4	16	3-2-1
UWaveX	8	32	3-2-1
UWaveY	8	32	3-2-1
UWaveZ	8	32	3-2-1
Wafer	2	128	3-2-1
WordSynonyms	25	256	3-2-1
Yoga	2	256	3-2-1

DOI: 10.7717/peerj-cs.3460/table-2

Our experiments were conducted using an RTX 4090 GPU and an Intel Core I9-14900K processor, operating under Ubuntu 22.04 LTS. The experimental framework used the SpikingJelly library (version 0.0.0.0.15) (Fang et al., 2023) and PyTorch (version 2.0.1) (Paszke et al., 2019). The version of the graphics card driver is 535.86.05, with CUDA 12.1.

Ablation study

In this section, detailed ablation experiments are conducted to demonstrate the effectiveness of the proposed modules. Here, MT-SNN-TAIF ( $T_{3}$ - $T_{2}$ - $T_{1}$ ) represents the network using Masked Timestep and TAIF, with timesteps reduced from $T_{3}$ to $T_{2}$ to $T_{1}$ . MT-SNN-IF ( $T_{3}$ - $T_{2}$ - $T_{1}$ ) is the counterpart using IF neurons. SNN (T) denotes the standard SNN network without Masked Timestep and TAIF, with a fixed timestep T. The models mentioned above use the direct encoding method.

Three representative datasets were selected: Adiac, FordB, and DiatomSizeR, with $T = 1$ , $T = 3$ , $T = 5$ , and $T = 20$ to represent timesteps ranging from very short to very long. The results are shown in Table 3. MT-SNN was not tested at $T = 1$ and $T = 20$ because there are insufficient timesteps to apply reduction with only one timestep. When $T = 20$ , selecting appropriate timesteps becomes both complex and unnecessary, as most timesteps are redundant. Additionally, MT-SNN is specifically designed for low timestep settings, making tests at high timesteps misaligned with the intended design focus.

Table 3:

Ablation study results.

The values represent the accuracy (%) achieved by each model configuration. MT-SNN-TAIF uses the 3-2-1 and 5-3-1 configuration. Bold values indicate the best performance for each metric.

Dataset	Model	T = 1	T = 3	T = 5	T = 20
Adiac	SNN	82.81%	85.50%	85.55%	87.11%
	MT-SNN-IF	–	85.55%	87.50%	–
	MT-SNN-TAIF	–	85.94%	88.28%	–
FordB	SNN	82.37%	82.37%	82.75%	83.37%
	MT-SNN-IF	–	83.00%	84.25%	–
	MT-SNN-TAIF	–	83.75%	85.50%	–
DiatomSizeR	SNN	96.09%	96.05%	94.41%	96.88%
	MT-SNN-IF	–	95.72%	93.09%	–
	MT-SNN-TAIF	–	96.71%	98.68%	–

DOI: 10.7717/peerj-cs.3460/table-3

From Table 3, it can be observed that if an appropriate timestep is selected, direct-encoding SNNs can be highly competitive, demonstrating the simplicity and efficiency of direct encoding compared to rate encoding or latency encoding in classification tasks. This observation is consistent with the results in image classification tasks. However, the performance of direct encoding method is highly related to the timesteps. In the shortest timestep, the model exhibits the lowest accuracy. As the timestep increases, the accuracy improves, but there is a point of diminishing returns where excessively long timesteps can actually degrade performance and bring extra computation cost. This indicates that the baseline model, SNN, has considerable temporal redundancy and that selecting an appropriate timestep is crucial for optimal performance. Our proposed Masked Timestep method requires only an appropriate number of timesteps. As shown in the table, the timesteps for MT-SNN, demonstrating that only five maximum timesteps can surpass the performance of SNN at 20 timesteps. However, the use of the MT method is limited by IF neurons. In the DiatomSizeR dataset, the standalone MT method even leads to a performance decline. By introducing TAIF neurons, MT-SNN consistently outperforms SNN with longer timesteps, even at ultra-low timesteps. Figure 7 indicates that in the first 100 epochs, all models except SNN (T = 20) have converged, showing that direct encoding requires more time to converge in large timesteps and exhibits noticeable oscillations in the early stages of training. As the training progresses, MT-SNN-TAIF shows a smoother and higher test accuracy compared to its counterparts at T = 3 and T = 5. This indicates that the MT-SNN-TAIF architecture has a stronger learning capacity and benefits more from additional training epochs. These results strongly demonstrate the synergistic effect of the MT method with TAIF and verify its superiority in low timestep configurations.

Figure 7: The test accuracy on DiatomSizeR dataset over the first 100 epochs.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-7

Comparison with previous work

In this section, we evaluate our proposed method on a total of 49 UCR univariate time series datasets. Regarding the evaluation metrics, we adopt the Mean Per-Class Error (MPCE) proposed by Wang, Yan & Oates (2017) to evaluate model classification performance in multiple datasets. For a model $M = {m_{i}}$ , dataset pool $D = {d_{k}}$ , class labels $C = {c_{k}}$ , and error rates $E = {e_{k}}$ , we have:

(22) $P C E_{k} = \frac{e_{k}}{c_{k}}$

(23) $M P C E_{i} = \frac{1}{K} \sum P C E_{k} .$

Here, $k$ indexes datasets, and $i$ indexes models. MPCE reflects the expected error rate for a single class across all datasets. Considering the number of classes, MPCE is a robust baseline criterion. We also present the average arithmetic ranking, the average geometric ranking and the critical difference diagram between MT-SNN and other methods.

Comparison with non-SNN work

• Comparison with traditional methods:
For traditional methods, we selected seven representative methods. DTW refers to 1NN-DTW, which is a simple and effective distance-based method but has a high computational cost and is sensitive to noise. BOSS is a dictionary-based method that handles symbolic representations well but relies on parameter settings. COTE and BOSSVS are ensemble models that combine multiple classifiers while consuming many computational resources. PROP combines 11 classifiers based on various elastic distance measures. SE1 is also an ensemble model based on discriminative shapelets. TSBF is a feature-based method that captures local patterns effectively but is influenced by segmentation and feature extraction strategies. The overall evaluation results and the CD-diagram can be seen in Table 4 and Fig. 8, respectively. Compared to traditional methods, the MT-SNN, an end-to-end, from scratch trained approach that does not require extra feature engineering, achieved the best performance. Additionally, MT-SNN does not require extensive hyperparameter tuning and exhibts tolerance to noise and the ability to learn under constrained conditions.
• Comparison with DL-based methods: In DL-based methods, we adopt three baselines proposed by Wang, Yan & Oates (2017): the MLP, FCN, ResNet, and MCNN proposed by Cui, Chen & Chen (2016). The MLP consists of three fully connected layers with 500 neurons each, utilizing ReLU activation functions and dropout; the softmax function is employed for final classification. The FCN comprises three convolutional layers with 128, 256, and 128 filters, respectively. Each layer is followed by batch normalization (BN) and ReLU activation functions, and the global pooling layer aggregates the feature maps into a single vector. The output layer employs a softmax activation function for classification. The ResNet uses three stacked residual blocks with filters of 64, 128, and 128. The classifier consists of a global average pooling layer followed by a softmax function. The MCNN method uses downsampling, skip sampling, and sliding windows to preprocess the data for multi-scale settings, leading to significant preprocessing work and data augmentation. The overall evaluation results and the CD-diagram are presented in Table 5 and Fig. 9, respectively. Among the four models, the FCN achieves the best performance. The ResNet performs slightly worse due to its tendency to overfit. However, the MT-SNN achieves the best results with a structure similar to the ResNet, indicating that the masked timestep spiking residual block has a better generalization capacity, more reasonable biological plausibility and does not require extensive preprocessing and data augmentation used in MCNN.

Table 4:

Error rates of different traditional time series classification methods with wins, AVG arithmetic ranking, AVG geometric ranking, and MPCE.

Bold values indicate the best performance for each metric.

Err rate	DTW	COTE	BOSSVS	PROP	BOSS	SE1	TSBF	MT-SNN (Ours)
Adiac	0.396	0.233	0.302	0.353	0.220	0.373	0.245	0.117
Beef	0.367	0.133	0.267	0.367	0.200	0.133	0.287	0.125
CBF	0.003	0.001	0.001	0.002	0	0.010	0.009	0
ChlorineCon	0.352	0.314	0.345	0.360	0.340	0.312	0.336	0.196
CinCECGTorso	0.349	0.064	0.130	0.062	0.125	0.021	0.262	0.040
Coffee	0	0	0.036	0	0	0	0.004	0
CricketX	0.246	0.154	0.346	0.203	0.259	0.297	0.278	0.250
CricketY	0.256	0.167	0.328	0.156	0.208	0.326	0.259	0.185
CricketZ	0.246	0.128	0.313	0.156	0.246	0.277	0.263	0.177
DiatomSizeR	0.033	0.082	0.036	0.059	0.046	0.069	0.126	0.013
ECGFiveDays	0.232	0	0	0.178	0	0.055	0.183	0.065
FaceAll	0.192	0.105	0.241	0.152	0.210	0.247	0.234	0.174
FaceFour	0.170	0.091	0.034	0.091	0	0.034	0.051	0
FacesUCR	0.095	0.057	0.103	0.063	0.042	0.079	0.090	0.066
FiftyWords	0.310	0.191	0.367	0.18	0.301	0.288	0.209	0.184
Fish	0.177	0.029	0.017	0.034	0.011	0.057	0.080	0.038
GunPoint	0.093	0.007	0	0.007	0	0.060	0.011	0
Haptics	0.623	0.488	0.584	0.584	0.536	0.607	0.488	0.500
InlineSkate	0.616	0.551	0.573	0.567	0.511	0.653	0.603	0.720
ItalyPower	0.050	0.036	0.086	0.039	0.053	0.053	0.096	0.026
Lightning2	0.131	0.164	0.262	0.115	0.148	0.098	0.257	0.094
Lightning7	0.274	0.247	0.288	0.233	0.342	0.274	0.262	0.094
MALLAT	0.066	0.036	0.064	0.050	0.058	0.092	0.037	0.042
MedicalImages	0.263	0.258	0.474	0.245	0.288	0.305	0.269	0.228
MoteStrain	0.165	0.085	0.115	0.114	0.073	0.113	0.135	0.073
NonInvThorax1	0.210	0.093	0.169	0.178	0.161	0.174	0.138	0.098
NonInvThorax2	0.135	0.073	0.118	0.112	0.101	0.118	0.130	0.073
OliveOil	0.167	0.100	0.133	0.133	0.100	0.133	0.090	0.033
OSULeaf	0.409	0.145	0.074	0.194	0.012	0.273	0.329	0.102
SonyAIBO	0.275	0.146	0.265	0.293	0.321	0.238	0.175	0.018
SonyAIBOII	0.169	0.076	0.188	0.124	0.098	0.066	0.196	0.083
StarLightCurves	0.093	0.031	0.096	0.079	0.021	0.093	0.022	0.022
SwedishLeaf	0.208	0.046	0.141	0.085	0.072	0.120	0.075	0.082
Symbols	0.050	0.046	0.029	0.049	0.032	0.083	0.034	0.033
SyntheticControl	0.007	0	0.040	0.010	0.030	0.033	0.008	0
Trace	0	0.010	0	0.010	0	0.050	0.020	0
TwoLeadECG	0	0.015	0.015	0	0.004	0.029	0.001	0
TwoPatterns	0.096	0	0.001	0.067	0.016	0.048	0.046	0
UWaveX	0.272	0.196	0.270	0.199	0.241	0.248	0.164	0.196
UWaveY	0.366	0.267	0.364	0.283	0.313	0.322	0.249	0.272
UWaveZ	0.342	0.265	0.336	0.290	0.312	0.346	0.217	0.232
Wafer	0.020	0.001	0.001	0.003	0.001	0.002	0.004	0
WordSynonyms	0.351	0.266	0.439	0.226	0.345	0.357	0.302	0.315
Yoga	0.164	0.113	0.169	0.121	0.081	0.159	0.149	0.145
Win	3	12	4	5	13	3	4	21
AVG arithmetic ranking	6.14	2.68	5.52	4.18	3.50	5.59	4.98	2.23
AVG geometric ranking	5.48	2.25	4.76	3.63	2.80	4.98	4.34	1.83
MPCE	0.0397	0.0226	0.0330	0.0304	0.0256	0.0302	0.0335	0.0188

DOI: 10.7717/peerj-cs.3460/table-4

Figure 8: The critical difference diagram between MT-SNN and traditional methods.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-8

Table 5:

Error rates of different deep learning time series classification methods with wins, AVG arithmetic ranking, AVG geometric ranking, and MPCE.

Bold values indicate the best performance for each metric.

Err rate	MCNN (Cui, Chen & Chen, 2016)	MLP (Wang, Yan & Oates, 2017)	FCN (Wang, Yan & Oates, 2017)	ResNet (Wang, Yan & Oates, 2017)	MT-SNN (Ours)
Adiac	0.231	0.248	0.143	0.174	0.117
Beef	0.367	0.167	0.25	0.233	0.125
CBF	0.002	0.14	0	0.006	0
ChlorineCon	0.203	0.128	0.157	0.172	0.196
CinCECGTorso	0.058	0.158	0.187	0.229	0.040
Coffee	0.036	0	0	0	0
CricketX	0.182	0.431	0.185	0.179	0.250
CricketY	0.154	0.405	0.208	0.195	0.185
CricketZ	0.142	0.408	0.187	0.187	0.177
DiatomSizeR	0.023	0.036	0.07	0.069	0.013
ECGFiveDays	0	0.03	0.015	0.045	0.065
FaceAll	0.235	0.115	0.071	0.166	0.174
FaceFour	0	0.17	0.068	0.068	0
FacesUCR	0.063	0.185	0.052	0.042	0.066
FiftyWords	0.19	0.288	0.321	0.273	0.184
Fish	0.051	0.126	0.029	0.011	0.038
GunPoint	0	0.067	0	0.007	0
Haptics	0.53	0.539	0.449	0.495	0.50
InlineSkate	0.618	0.649	0.589	0.635	0.72
ItalyPower	0.03	0.034	0.03	0.04	0.026
Lightning2	0.164	0.279	0.197	0.246	0.094
Lightning7	0.219	0.356	0.137	0.164	0.094
MALLAT	0.057	0.064	0.02	0.021	0.042
MedicalImages	0.26	0.271	0.208	0.228	0.228
MoteStrain	0.079	0.131	0.05	0.105	0.073
NonInvThorax1	0.064	0.058	0.039	0.052	0.098
NonInvThorax2	0.06	0.057	0.045	0.049	0.073
OliveOil	0.133	0.60	0.167	0.133	0.033
OSULeaf	0.271	0.43	0.012	0.021	0.102
SonyAIBO	0.23	0.273	0.032	0.015	0.018
SonyAIBOII	0.07	0.161	0.038	0.038	0.083
StarLightCurves	0.023	0.043	0.033	0.029	0.022
SwedishLeaf	0.066	0.107	0.034	0.042	0.082
Symbols	0.049	0.147	0.038	0.128	0.033
SyntheticControl	0.003	0.05	0.01	0	0
Trace	0	0.18	0	0	0
TwoLeadECG	0.001	0.147	0	0	0
TwoPatterns	0.002	0.114	0.103	0	0
UWaveX	0.18	0.232	0.246	0.213	0.196
UWaveY	0.268	0.297	0.275	0.332	0.272
UWaveZ	0.232	0.295	0.271	0.245	0.232
Wafer	0.002	0.004	0.003	0.003	0
WordSynonyms	0.276	0.406	0.42	0.368	0.315
Yoga	0.112	0.145	0.155	0.142	0.145
Win	11	2	16	10	21
AVG arithmetic ranking	2.70	4.27	2.50	2.64	2.18
AVG geometric ranking	2.34	4.03	2.09	2.33	1.80
MPCE	0.0241	0.0407	0.0219	0.0231	0.0188

DOI: 10.7717/peerj-cs.3460/table-5

Figure 9: The critical difference diagram between MT-SNN and traditional DL methods.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-9

Comparison with TSC-SNN work

The proposed MT-SNN is compared with previous TSC-SNN work on 24 univariate datasets from the UCR Time Series Classification Archive. The results are presented in Table 6, which shows the comparative performance of the proposed MT-SNN model against other state-of-the-art TSC-SNN models across various datasets. The MT-SNN model surpasses other models in the majority of datasets, achieving the highest accuracy in 19 out of 24 datasets.

Table 6:

Accuracy comparison across different TSC-SNN methods.

A dash (–) denotes that the result is not provided. Bold values indicate the best performance for each metric.

Dataset	GTE (Dey et al., 2022)	LSNN (Gaurav, Stewart & Yi, 2023)	CSNN (Gautam & Singh, 2020)	MT-SNN (Ours)
Beef	–	–	86.66%	87.50%
CBF	–	–	100.00%	100.00%
CinCECGTorso	–	–	99.75%	95.96%
Coffee	–	–	100.00%	100.00%
DiatomSizeR	–	–	98.11%	98.68%
Earthquakes	71.94%	80.43%	–	81.25%
ECG200	–	–	92.00%	93.75%
ECG5000	–	98.49%	–	94.73%
FaceAll	–	–	75.38%	82.57%
FaceFour	–	–	96.59%	100.00%
FacesUCR	–	–	85.36%	93.45%
FordA	80.37%	93.56%	–	94.62%
FordB	64.32%	82.72%	–	85.50%
Gunpoint	–	–	100.00%	100.00%
ItalyPower	–	–	97.37%	97.36%
Lightning2	–	–	80.33%	90.62%
Lightning7	–	–	83.56%	90.62%
StarLightCurves	–	–	97.91%	97.79%
SyntheticControl	–	–	100.00%	100.00%
Trace	–	–	100.00%	100.00%
TwoPatterns	–	–	100.00%	100.00%
UWaveX	–	–	80.37%	80.43%
UWaveY	–	–	76.68%	72.80%
Wafer	98.85%	99.51%	100.00%	100.00%

DOI: 10.7717/peerj-cs.3460/table-6

Comparison with CSNN: The MT-SNN achieves significant improvements on several datasets. For example, MT-SNN shows remarkable advancements in the FacesUCR and Lightning2 datasets, with accuracies of 93.45% and 90.62%, compared to CSNN’s 85.36% and 80.33%, respectively. While CSNN reaches 100% accuracy on seven datasets, MT-SNN additionally achieves 100% on the FaceFour dataset. Despite CSNN achieving better results on datasets such as CinTorso and uWaveY, the advantage is insignificant. It is also worth noting that CSNN contains only a single layer of spiking neurons and still uses ReLU as activation functions, whereas MT-SNN replaces all ReLU with spiking neurons in feature extraction, encoding along with decoding, demonstrating superior spiking plausibility.
Comparison with GTE and LSNN: GTE and LSNN were exclusively tested on vibration time series datasets. Within these datasets, MT-SNN demonstrated significant advantages. The sole exception was the ECG5000 dataset, where LSNN achieved a higher accuracy compared to MT-SNN; however, the difference was not substantial. Overall, MT-SNN exhibited more comprehensive and outstanding performance compared to both GTE and LSNN.

In summary, the results demonstrate the consistent performance of MT-SNN across multiple datasets, highlighting its generalizability and effectiveness. Remarkably, MT-SNN achieves these results without relying on specialized encoding schemes or pre-feature engineering techniques, using a very low timestep of 5 or 3. This underscores the inherent capability of SNNs to capture discriminative features directly from original time series data.

Visualization of spike encoder

To visualize the original one-dimensional data, the Lighting2 dataset is selected. The spike sequences of the feature maps extracted by the encoder in five time steps are plotted. For ease of visualization, the encoding of the first five channels is displayed, illustrating encoding effects at different timesteps as shown in Figs. 10, 11, 12.

Figure 10: Due to its decay property, the LIF encoder fires spikes that are more dispersed over time compared to the IF and TAIF encoders.
However, this sparse representation compromises temporal precision by missing the rise in the original time series at the length of 400.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-10

Figure 11: The IF encoder, functioning as an ideal integrator, generates a higher density spike representation.
However, this also results in lower robustness. In the final stage (t = 4), an abnormal spike, which should not have been fired, appeared at the end of the sequence.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-11

Figure 12: The TAIF encoder is similar to the IF encoder, but its spikes are more evenly distributed over time and more compact than those of the LIF encoder.
Due to its temporal adaptive feature, the TAIF encoder effectively captures the rise without producing abnormal spikes.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-12

Observations illustrate that, in the early timesteps, the features extracted by the spike encoder are relatively sparse. However, as the timesteps progress, more features are activated, indicating that the TAIF encoder is capable of gradually extracting and accumulating key information from the input signals over time, forming an efficient spike encoding representation. This approach avoids the non-firing issue of the LIF encoder and the misfiring issue of the IF encoder.

In addition, Fig. 13 illustrates the comparison between IF and TAIF encoders in a single channel. Through a visual analysis of feature mappings across different timesteps and neurons, the effectiveness of the TAIF spike encoder in encoding time series data becomes evident. The overlay results demonstrate the overall encoding performance, while the feature mappings of individual channels reveal fine-grained encoding characteristics. This section includes the raw data, feature mappings at various timesteps, feature mappings of individual neurons, providing a comprehensive evaluation of the encoding performance of different spike encoders.

Figure 13: Comprasion of IF and TAIF encoder in single channel.
The TAIF encoder provides smoother encoding compared to the IF encoder, especially noticeable in less sensitive channel, maintaining a consistent and adaptive spiking pattern.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-13

Noise tolerance in masked timesteps

In this section, we conduct an experiment to evaluate robustness under varying levels of Gaussian noise. By introducing Gaussian noise with different standard deviations to the original time series data, we aim to analyze the impact on the performance of IF and TAIF under Masked Timestep.

The datasets used for this analysis include the CBF, Coffee and Gunpoint datasets, each subjected to three different levels of Gaussian noise: 0.5, 1.0, and 1.5.

The performance metrics for each model under different noise conditions are summarized in Fig. 14.

Figure 14: Evaluation of three datasets of noise tolerance of IF and TAIF in MT-SNN.

Download full-size image

DOI: 10.7717/peerj-cs.3460/fig-14

In the MT-SNN architecture, both models exhibit varying degrees of robustness to different levels of Gaussian noise. In the CBF and Coffee datasets, both models have a certain robustness, maintaining high accuracy across all noise levels. However, in the Gunpoint dataset, the TAIF model consistently outperforms the IF model across all noise levels. This dataset contains short sequences with localized temporal patterns, making it highly sensitive to noise. By introducing temporal adaptivity, TAIF achieves a more stable response to critical timesteps and thus preserves discriminative features more effectively, leading to greater robustness under Gaussian noise.

Training with restricted data subset

In this section, to simulate edge scenarios with limited device or training resources, real-time requirements, and costly annotation, the performance of the MT-SNN is evaluated on a restricted data subset. Unsolved binary classification tasks, which better reflect real-world situations, are selected. Their training sets are reduced to 10% of their original size, ensuring that each class has at least one sample, while the test sets remain unchanged.

To provide a thorough comparison, we also present the performance of Spiking Reservoir method using Gaussian Temporal Encoding (GTE) as described in Dey et al. (2022). Importantly, this method is trained on 20% of the training data. This comparison allows us to assess the relative performance and robustness of the MT-SNN under harsher constraints. The detailed settings and performance comparisons are provided in Tables 7 and 8.

Table 7:

Total samples and training samples for GTE and MT-SNN methods.

Datasets	Samples	GTE (20%)	MT-SNN (10%)
Earthquakes	322	64	32
ECG200	100	–	10
ECGFiveDays	23	–	2
FordA	3,601	720	360
FordB	3,636	727	362
ItalyPower	67	–	6
Lightning2	60	–	6
MoteStrain	20	–	2
SonyAIBO	20	–	2
SonyAIBOII	27	–	2
Yoga	300	–	30

DOI: 10.7717/peerj-cs.3460/table-7

Table 8:

Comparison of GTE and MT-SNN accuracies for binary classification datasets.

Bold values indicate the best performance for each metric.

Datasets	GTE	MT-SNN (10%)	MT-SNN (100%)
Earthquakes	63.64%	80.47%	81.25%
ECG200	–	85.42%	93.75%
ECGFiveDays	–	70.74%	93.35%
FordA	74.27%	92.54%	94.62%
FordB	59.57%	78.12%	85.50%
ItalyPower	–	86.80%	97.74%
Lightning2	–	78.33%	90.62%
MoteStrain	–	89.42%	92.27%
SonyAIBO	–	67.71%	98.17%
SonyAIBOII	–	79.31%	91.70%
Yoga	–	60.99%	85.50%

DOI: 10.7717/peerj-cs.3460/table-8

The adaptability of the MT-SNN using direct encoding enables it to perform significantly better than the GTE using temporal encoding, even with only half the amount of training data. Despite the reduced size of the training set, the MT-SNN reaches reasonably high accuracy on several datasets with limited training samples. For datasets with a large number of original training samples, such as Earthquake, FordA and FordB, the performance of MT-SNN in a restricted setting is almost comparable to that of the full dataset, outperforming GTE under more challenging conditions. Additionally, for datasets with only a few training samples, MT-SNN still achieves acceptable performance, even reaching levels comparable to those with the full dataset in cases like MoteStrain and Lightning2. The lower results in Yoga and SonyAIBO are probably due to the inadequate representation in the 10% training data for the spike encoder. However, considering the reduction in sample size, this decrease is acceptable. In conclusion, the enhanced performance of MT-SNN under more constrained conditions highlights its robustness and adaptability compared to the Spiking Reservoir method.

Conclusion

In this article, we propose the MT-SNN architecture to address the issue of temporal redundancy in TSC-SNN when training from scratch. As a purely end-to-end model, MT-SNN employs a simple and effective direct encoding method, eliminating the need for pre-trained encoders and data augmentation while achieving SoTA performance in univariate TSC tasks with stronger generalization capabilities. The masked timestep method improves efficiency and reduces computational cost by selectively masking less informative timesteps, thereby focusing computational resources on the most significant features. Furthermore, the introduction of TAIF neurons provides MT-SNN with greater noise tolerance, allowing it to maintain robust performance even in the presence of exsiting noisy input data.

Compared to RC methods, MT-SNN demonstrates enhanced learning capabilities with smaller datasets, making it particularly suitable for edge scenarios where computational resources and data availability are limited. The architecture design inherently supports efficient implementation on edge devices, offering significant advantages in low-latency and real-time processing environments. In future work, we plan to deploy MT-SNN on specialized hardware, such as neuromorphic chips, to fully leverage its computational efficiency and low power consumption.

Further, we aim to refine the encoding method to further enhance the performance and adaptability of the models. We will also extend the application of the masked timestep method to multivariate TSC tasks, thereby broadening the scope and utility of MT-SNN across diverse real-world applications. At present, the evaluation is limited to benchmark univariate datasets, and its effectiveness in multivariate and real-world noisy environments remains to be validated. Moreover, we intend to optimize the MT method to enable automatic adjustment of the reduction of timesteps, replacing the current reliance on manual settings. This optimization will involve developing adaptive algorithms that can dynamically determine the most appropriate timestep reduction strategy based on the property of the input data.

Supplemental Information

Source code.

DOI: 10.7717/peerj-cs.3460/supp-1

Download

Translation codebook.

DOI: 10.7717/peerj-cs.3460/supp-2

Download

Test subset of the Adiac dataset including time-series measurements and corresponding class labels.

DOI: 10.7717/peerj-cs.3460/supp-3

Download

Training subset of the Adiac dataset used to train the proposed classification model.

DOI: 10.7717/peerj-cs.3460/supp-4

Download

[1] Abbott LF. 1999. Lapicque’s introduction of the integrate-and-fire model neuron (1907) Brain Research Bulletin 50(5–6):303-304

[2] Akopyan F, Sawada J, Cassidy A, Alvarez-Icaza R, Arthur J, Merolla P, Imam N, Nakamura Y, Datta P, Nam G-J, Taba B, Beakes M, Brezzo B, Kuang JB, Manohar R, Risk WP, Jackson B, Modha DS. 2015. TrueNorth: design and tool flow of a 65 mw 1 million neuron programmable neurosynaptic chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 34(10):1537-1557

[3] Bagnall A, Lines J, Bostrom A, Large J, Keogh E. 2017. The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances. Data Mining and Knowledge Discovery 31(3):606-660

[4] Bagnall A, Lines J, Hills J, Bostrom A. 2015. Time-series classification with cote: the collective of transformation-based ensembles. IEEE Transactions on Knowledge and Data Engineering 27(9):2522-2535

[5] Baydogan MG, Runger G, Tuv E. 2013. A bag-of-features framework to classify time series. IEEE Transactions on Pattern Analysis and Machine Intelligence 35(11):2796-2802

[6] Chen HT, Ng KT, Bermak A, Law MK, Martinez D. 2011. Spike latency coding in biologically inspired microelectronic nose. IEEE Transactions on Biomedical Circuits and Systems 5(2):160-168

[7] Chowdhury SS, Garg I, Roy K. 2021a. Spatio-temporal pruning and quantization for low-latency spiking neural networks.

[8] Chowdhury SS, Rathi N, Roy K. 2021b. One timestep is all you need: training spiking neural networks with ultra low latency. ArXiv

[9] Cui Z, Chen W, Chen Y. 2016. Multi-scale convolutional neural networks for time series classification. ArXiv

[10] Davies M, Srinivasa N, Lin T-H, Chinya G, Cao Y, Choday SH, Dimou G, Joshi P, Imam N, Jain S, Liao Y, Lin C-K, Lines A, Liu R, Mathaikutty D, McCoy S, Paul A, Tse J, Venkataramanan G, Weng Y-H, Wild A, Yang Y, Wang H. 2018. Loihi: a neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1):82-99

[11] Deng L, Wu Y, Hu X, Liang L, Ding Y, Li G, Zhao G, Li P, Xie Y. 2020. Rethinking the performance comparison between snns and anns. Neural Networks 121(10):294-307

[12] Dey S, Banerjee D, George AM, Mukherjee A, Pal A. 2022. Efficient time series classification using spiking reservoir.

[13] Ding Y, Zuo L, Jing M, He P, Xiao Y. 2024. Shrinking your timestep: towards low-latency neuromorphic object recognition with spiking neural network. ArXiv

[14] Dong Y, Zhao D, Li Y, Zeng Y. 2023. An unsupervised STDP-based spiking neural network inspired by biologically plausible learning rules and connections. Neural Networks 165(7010):799-808

[15] Faghihi F, Cai S, Moustafa AA. 2022. A neuroscience-inspired spiking neural network for eeg-based auditory spatial attention detection. Neural Networks 152(5):555-565

[16] Fang W, Chen Y, Ding J, Yu Z, Masquelier T, Chen D, Huang L, Zhou H, Li G, Tian Y. 2023. SpikingJelly: an open-source machine learning infrastructure platform for spike-based intelligence. Science Advances 9(40):eadi1480

[17] Fang H, Shrestha A, Qiu Q. 2020. Multivariate time series classification using spiking neural networks.

[18] Fang W, Yu Z, Chen Y, Huang T, Masquelier T, Tian Y. 2021. Deep residual learning in spiking neural networks. Advances in Neural Information Processing Systems 34:21056-21069

[19] Fawaz HI. 2020. Deep learning for time series classification.

[20] Gaurav R, Stewart TC, Yi Y. 2023. Reservoir based spiking models for univariate time series classification. Frontiers in Computational Neuroscience 17:1148284

[21] Gautam A, Singh V. 2020. CLR-based deep convolutional spiking neural network with validation based stopping for time series classification. Applied Intelligence 50(3):830-848

[22] Gong P, Wang P, Zhou Y, Zhang D. 2023. A spiking neural network with adaptive graph convolution and LSTM for EEG-based brain-computer interfaces. IEEE Transactions on Neural Systems and Rehabilitation Engineering 31(11):1440-1450

[23] Han J, Xiao N, Yang W, Luo S, Zhao J, Qiang Y, Chaudhary S, Zhao J. 2022. MS-ResNet: disease-specific survival prediction using longitudinal CT images and clinical data. International Journal of Computer Assisted Radiology and Surgery 17(6):1049-1057

[24] Hu Y, Deng L, Wu Y, Yao M, Li G. 2021. Advancing spiking neural networks towards deep residual learning. ArXiv

[25] Keogh E, Ratanamahatana CA. 2005. Exact indexing of dynamic time warping. Knowledge and Information Systems 7(3):358-386

[26] Kim J, Kim H, Huh S, Lee J, Choi K. 2018. Deep neural networks with weighted spikes. Neurocomputing 311(6):373-386

[27] Kim Y, Li Y, Park H, Venkatesha Y, Hambitzer A, Panda P. 2023. Exploring temporal information dynamics in spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 37(7):8308-8316

[28] Kim Y, Park H, Moitra A, Bhattacharjee A, Venkatesha Y, Panda P. 2022. Rate coding or direct coding: which one is better for accurate, robust, and energy-efficient spiking neural networks?

[29] Kingma DP, Ba J. 2017. Adam: a method for stochastic optimization.

[30] Lee JH, Delbruck T, Pfeiffer M. 2016. Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10:508

[31] Li Y, Geller T, Kim Y, Panda P. 2024. SEENN: towards temporal spiking early exit neural networks. Advances in Neural Information Processing Systems 36:63327-63342

[32] Li C, Jones EG, Furber S. 2023. Unleashing the potential of spiking neural networks with dynamic confidence.

[33] Li Y, Moitra A, Geller T, Panda P. 2023c. Input-aware dynamic timestep spiking neural networks for efficient in-memory computing.

[34] Li H, Wang Z, Yue X, Wang W, Tomiyama H, Meng L. 2023a. An architecture-level analysis on deep learning models for low-impact computations. Artificial Intelligence Review 56(3):1971-2010

[35] Li H, Yue X, Zhao C, Meng L. 2023b. Lightweight deep neural network from scratch. Applied Intelligence 53(15):18868-18886

[36] Maass W. 1997. Networks of spiking neurons: the third generation of neural network models. Neural Networks 10(9):1659-1671

[37] Mayr C, Hoeppner S, Furber S. 2019. Spinnaker 2: a 10 million core processor system for brain simulation and machine learning. ArXiv

[38] Neftci EO, Mostafa H, Zenke F. 2019. Surrogate gradient learning in spiking neural networks: bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Processing Magazine 36(6):51-63

[39] Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L, Desmaison A, Kopf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J, Chintala S. 2019. Pytorch: an imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems 32:8024-8035

[40] Peng H, Xiong X, Wu M, Wang J, Yang Q, Orellana-Martín D, Pérez-Jiménez MJ. 2024. Reservoir computing models based on spiking neural P systems for time series classification. Neural Networks 169(9):274-281

[41] Rathi N, Roy K. 2021. DIET-SNN: a low-latency spiking neural network with direct input encoding and leakage and threshold optimization. IEEE Transactions on Neural Networks and Learning Systems 34(6):3174-3182

[42] Ren J, Li H, Wang A, Saho K, Meng L. 2024. Radar-based gait analysis by transformer-liked network for dementia diagnosis. Biomedical Signal Processing and Control 91(7):105986

[43] Roy K, Jaiswal A, Panda P. 2019. Towards spike-based machine intelligence with neuromorphic computing. Nature 575(7784):607-617

[44] Rueckauer B, Lungu I-A, Hu Y, Pfeiffer M, Liu S-C. 2017. Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11:682

[45] Sam A, Boostani R, Hashempour S, Taghavi M, Sanei S. 2023. Depression identification using EEG signals via a hybrid of LSTM and spiking neural networks. IEEE Transactions on Neural Systems and Rehabilitation Engineering 31(8):4725-4737

[46] Schäfer P. 2015. The boss is concerned with time series classification in the presence of noise. Data Mining and Knowledge Discovery 29(6):1505-1530

[47] Sengupta A, Ye Y, Wang R, Liu C, Roy K. 2019. Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13:95

[48] Su Q, Chou Y, Hu Y, Li J, Mei S, Zhang Z, Li G. 2023. Deep directly-trained spiking neural networks for object detection.

[49] Suetake K, Ikegawa SI, Saiin R, Sawada Y. 2023. S3NN: time step reduction of spiking surrogate gradients for training energy efficient single-step spiking neural networks. Neural Networks 159(10):208-219

[50] Thompson NC, Greenewald K, Lee K, Manso GF. 2021. Deep learning’s diminishing returns: the cost of improvement is becoming unsustainable. IEEE Spectrum 58(10):50-55

[51] Wang X, Lin X, Dang X. 2020. Supervised learning in spiking neural networks: a review of algorithms and evaluations. Neural Networks 125(2):258-280

[52] Wang Z, Yan W, Oates T. 2017. Time series classification from scratch with deep neural networks: a strong baseline.

[53] Wu Y, Deng L, Li G, Zhu J, Xie Y, Shi L. 2019. Direct training for spiking neural networks: faster, larger, better. Proceedings of the AAAI Conference on Artificial Intelligence 33(01):1311-1318

[54] Yan Z, Zhou J, Wong W-F. 2021. Energy efficient ECG classification with spiking neural network. Biomedical Signal Processing and Control 63(11):102170

[55] Yang S, Linares-Barranco B, Chen B. 2022. Heterogeneous ensemble-based spike-driven few-shot online learning. Frontiers in Neuroscience 16:850932

[56] Zhang Y, Tiňo P, Leonardis A, Tang K. 2021. A survey on neural network interpretability. IEEE Transactions on Emerging Topics in Computational Intelligence 5(5):726-742

[57] Zhang A, Zhu W, Li J. 2018. Spiking echo state convolutional neural network for robust time series classification. IEEE Access 7:4927-4935

[58] Zheng H, Wu Y, Deng L, Hu Y, Li G. 2021. Going deeper with directly-trained larger spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 35(12):11062-11070

Masked timesteps in spiking neural networks for efficient time series classification

Introduction

Related work

Spiking neural networks for time series classification

Figure 1: Current methods in TSC-SNNs.

Temporal redundancy in spiking neural networks

Method

Network structure

Encoding methods

Figure 2: The direct encoding methods for TSC: to reduce the randomness of single neuron firing, direct encoding needs to create an extra time dimension by repeating the original data along the time axis.

Spiking residual block

Structure of Spiking ResNet

Figure 3: The general structure of the basic residual SNN, obtained by replacing the ReLU with spiking neurons.

Masked timesteps

Univariate TSC modeling

Spiking neuron integration

Temporal selection

Figure 4: The overview of masked timestep in Spiking Residual Blocks, the more detailed processing is in Algorithm 1.

TAIF neuron

Overview of spiking neurons

Training strategy

Figure 5: The surrogate gradient method.

Figure 6: The computation graph of TAIF in time dimension, the dotted line represents the situation when the neuron is not firing.

Experiments

Parameter settings

Ablation study

Figure 7: The test accuracy on DiatomSizeR dataset over the first 100 epochs.

Comparison with previous work

Comparison with non-SNN work

Figure 8: The critical difference diagram between MT-SNN and traditional methods.

Figure 9: The critical difference diagram between MT-SNN and traditional DL methods.

Comparison with TSC-SNN work

Visualization of spike encoder

Figure 10: Due to its decay property, the LIF encoder fires spikes that are more dispersed over time compared to the IF and TAIF encoders.

Figure 11: The IF encoder, functioning as an ideal integrator, generates a higher density spike representation.

Figure 12: The TAIF encoder is similar to the IF encoder, but its spikes are more evenly distributed over time and more compact than those of the LIF encoder.

Figure 13: Comprasion of IF and TAIF encoder in single channel.

Noise tolerance in masked timesteps

Figure 14: Evaluation of three datasets of noise tolerance of IF and TAIF in MT-SNN.

Training with restricted data subset

Conclusion

Supplemental Information

Source code.

Translation codebook.

Test subset of the Adiac dataset including time-series measurements and corresponding class labels.

Training subset of the Adiac dataset used to train the proposed classification model.

Download article

Report a problem

Follow this publication for updates

Change notification settings or unfollow

Share this publication

Metrics

Links

Articles citing this paper