Random pruning: channel sparsity by expectation scaling factor

PeerJ Computer Science


Introduction

  • 1) Based on extensive statistical validation, it is demonstrated that, for any data sample, there is a linear relationship between the sum of a channel's matrix elements and the expectation scaling factor δE. This shifts the analysis of channel properties from norms to changes in distribution expectations.

  • 2) Based on the linear relationship between the sum of the matrix elements of a channel and δE, this article provides guidance for removing redundant channels to produce non-redundant and non-unique sub-networks. The advantages of the proposed EXP method in terms of compression and acceleration are demonstrated through extensive experiments and comparisons with a variety of state-of-the-art methods.

The proposed method

Linear relations

where Ψ(·) denotes the summation of elements, and δE is linearly related to Ψ(w). Visually, different convolution kernels perform different operations on the image, and a convolution scales the expectation of the input distribution. Because the distribution expectation reflects the overall information of the feature map, the expectation scaling factor δE serves as a basic feature that characterizes a channel's ability to extract features.
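As an illustration of this linearity (a minimal NumPy sketch, not the authors' code), the expectation of a convolution output is approximately the input expectation scaled by the sum of the kernel elements, i.e. E[conv(x, w)] ≈ Ψ(w) · E[x]:

```python
import numpy as np

# Minimal sketch (not the authors' code): check empirically that convolving a
# feature map with a kernel w scales its distribution expectation by the sum
# of the kernel elements, i.e. E[conv(x, w)] ≈ Ψ(w) · E[x].
rng = np.random.default_rng(0)

x = rng.normal(loc=2.0, scale=1.0, size=(512, 512))   # input feature map
w = rng.normal(size=(3, 3))                           # one convolution kernel

psi_w = w.sum()                                       # Ψ(w): sum of kernel elements

# Valid cross-correlation (as used in CNN frameworks) via sliding windows.
windows = np.lib.stride_tricks.sliding_window_view(x, w.shape)
y = np.einsum('ijkl,kl->ij', windows, w)

print(f"Ψ(w) · E[x] = {psi_w * x.mean():.4f}")
print(f"E[y]        = {y.mean():.4f}")                # the two values nearly coincide
```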

Selecting redundant channels

where P_m denotes the m-th filter, 1 ≤ m ≤ M. Assuming that the number of basic-feature (non-redundant feature) types present in filter P_m is NBF_m, pruning amounts to removing the redundant channels of each basic feature in P_m. The model therefore requires a similarity evaluation function S_index(·) that judges the similarity of the Ψ(w) values, so as to effectively characterize the redundancy of a given set of channels D, as follows.

where d_i is the i-th channel in the channel set D and round(·) denotes rounding. N regulates the granularity of similarity: the larger N is, the more similarity classes (basic feature types NBF) the set D is divided into, and the stricter the condition for judging similarity becomes. Equation (4) essentially divides the channels into N + 1 classes, each class representing one basic feature, so the number of basic feature types contained in D is NBF_D = N + 1.
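The exact form of Equation (4) is not reproduced in this excerpt; the following hypothetical sketch only illustrates the grouping behaviour described above, assuming S_index(·) rounds a normalized Ψ value so that the channels of D fall into N + 1 classes:

```python
import numpy as np

def psi(channel):
    """Ψ(·): summation of the matrix elements of a channel."""
    return float(channel.sum())

def similarity_classes(D, N):
    """Assign each channel in D to one of N + 1 similarity classes.

    Equation (4) is not reproduced in this excerpt; this sketch assumes the
    similarity index is obtained by rounding the normalized Ψ value of each
    channel, which yields integer labels 0..N, i.e. N + 1 basic-feature types.
    """
    values = np.array([psi(d) for d in D])
    lo, hi = values.min(), values.max()
    scaled = (values - lo) / (hi - lo + 1e-12)    # map Ψ(d_i) into [0, 1]
    return np.round(N * scaled).astype(int)       # round() → classes 0..N

# Channels that share a class label are treated as redundant realizations of the
# same basic feature; all but one channel per class become pruning candidates.
rng = np.random.default_rng(0)
D = [rng.normal(size=(3, 3)) for _ in range(8)]
print(similarity_classes(D, N=4))
```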

Experiment

Experimental settings

Results and analysis

Results of CIFAR-10

Results of ImageNet

Discussion

Non-uniqueness and stability of sub-networks

Non-uniqueness

Stability

Generalization impact

Conclusion

Additional Information and Declarations

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Chuanmeng Sun conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Jiaxin Chen conceived and designed the experiments, performed the experiments, performed the computation work, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Yong Li conceived and designed the experiments, analyzed the data, performed the computation work, authored or reviewed drafts of the article, and approved the final draft.

Wenbo Wang conceived and designed the experiments, performed the experiments, performed the computation work, prepared figures and/or tables, and approved the final draft.

Tiehua Ma conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the article, and approved the final draft.

Data Availability

The following information was supplied regarding data availability:

The code is available at GitHub and Zenodo:

- https://github.com/EXP-Pruning/EXP_Pruning.

- EXP. (2023). EXP-Pruning/EXP_Pruning: v1.0.0 (python). Zenodo. https://doi.org/10.5281/zenodo.8141065.

The data are available from the CIFAR-10 dataset (http://www.cs.toronto.edu/~kriz/cifar.html) and the ImageNet Large Scale Visual Recognition Challenge 2012 (ILSVRC2012) (https://www.image-net.org/challenges/LSVRC/2012/index.php).

Funding

This work was supported by the National Key Research and Development Program of China (2022YFC2905700), the National Key Research and Development Program of China (2022YFB3205800), and the Fundamental Research Programs of Shanxi Province (202103021224199, 202203021221106). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
