New Efficient Pruning Algorithms for Compressing and Accelerating Convolutional Neural Networks

Gao, Shangqian (2024) New Efficient Pruning Algorithms for Compressing and Accelerating Convolutional Neural Networks. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

In recent years, Convolutional Neural Networks (CNNs) have achieved state-of-the-art results on numerous machine-learning tasks. Despite this impressive performance, the size of current models continues to explode. Motivated by efficient inference, many researchers have devoted effort to reducing the storage and computational costs of state-of-the-art models. Channel pruning has emerged as a promising way to reduce model size, and it achieves acceleration without any post-processing steps. However, current channel pruning methods are either time-consuming (reinforcement learning, greedy search, etc.) or rely on fixed channel-importance criteria, which leads to poor results.
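For readers unfamiliar with channel pruning, the sketch below illustrates the basic idea in PyTorch: rank the filters of a convolutional layer by a simple criterion (here, the L1 norm) and keep only the top fraction. The function name, the L1 criterion, and the keep ratio are illustrative assumptions for this example, not the criteria studied in the dissertation.

# Minimal sketch of criterion-based channel pruning (illustrative only):
# rank the filters of a conv layer by L1 norm and keep the top-k of them.
import torch
import torch.nn as nn

def prune_conv_by_l1(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a new Conv2d keeping the filters with the largest L1 norms."""
    weight = conv.weight.data                      # shape: [out_channels, in_channels, k, k]
    scores = weight.abs().sum(dim=(1, 2, 3))       # L1 norm per output channel
    k = max(1, int(keep_ratio * conv.out_channels))
    keep = torch.topk(scores, k).indices.sort().values

    pruned = nn.Conv2d(conv.in_channels, k, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = weight[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

# Example: shrink a 64-filter layer to 32 filters; downstream layers would need
# their input channels adjusted accordingly (omitted here).
layer = nn.Conv2d(3, 64, kernel_size=3, padding=1)
print(prune_conv_by_l1(layer, 0.5))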

In this dissertation, we propose new methods from the perspective of gradient-guided pruning. We formulate pruning as a constrained discrete optimization problem. Our discrete model compression work solves this constrained problem by using differentiable gates and propagating gradients through a straight-through estimator. We further improve the results in our work on network pruning via performance maximization by adding a performance prediction loss to the constrained optimization problem, so that the search for sub-networks is directly guided by the accuracy of each sub-network. This improved supervision leads to better pruning results.
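The snippet below is a minimal PyTorch sketch of the general mechanism described above: a per-channel gate whose forward pass is discrete (0/1) but whose backward pass flows through a sigmoid relaxation via a straight-through estimator. The class name, the sigmoid parameterization, and the simple resource penalty are simplifying assumptions, not the dissertation's exact formulation.

# Minimal sketch of a differentiable channel gate with a straight-through estimator
# (a simplified stand-in for the general idea, not the dissertation's exact method).
import torch
import torch.nn as nn

class STEGate(nn.Module):
    """Binary per-channel gate: hard 0/1 in the forward pass, sigmoid gradient in backward."""
    def __init__(self, num_channels: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        soft = torch.sigmoid(self.logits)              # differentiable relaxation
        hard = (soft > 0.5).float()                    # discrete 0/1 pruning decision
        gate = hard + soft - soft.detach()             # straight-through estimator
        return x * gate.view(1, -1, 1, 1)              # mask channels of a conv feature map

# A resource term (here, a simple penalty on the expected number of open gates)
# would be added to the task loss to steer the search toward a FLOPs budget.
gate = STEGate(num_channels=64)
feat = torch.randn(8, 64, 16, 16)
out = gate(feat)
resource_loss = torch.sigmoid(gate.logits).sum()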

On top of these works, we further improve our algorithms from several perspectives. The first is to disentangle width and importance when searching for the optimal model architecture. To this end, we propose an importance generation network and a width generation network that produce the channel importance scores and the width for each layer. Another challenge in previous works is the large gap between the model before and after pruning. To mitigate this gap, we first learn a target sub-network during the model training process and then use this sub-network to guide the learning of the model weights through partial regularization. Finally, building on the success of these static pruning methods, we incorporate dynamic pruning in a storage-efficient manner.
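The following sketch illustrates the width/importance disentanglement idea in PyTorch: one small network predicts what fraction of channels a layer keeps, and a separate network scores the individual channels, so the two decisions are made independently. The architectures, the layer-embedding input, and all names are hypothetical placeholders for illustration, not the generation networks used in the dissertation.

# Hedged sketch of width/importance disentanglement: a width net predicts how many
# channels to keep, an importance net ranks them; the top-"width" channels survive.
import torch
import torch.nn as nn

class LayerArchGenerator(nn.Module):
    def __init__(self, embed_dim: int, num_channels: int):
        super().__init__()
        self.width_net = nn.Sequential(nn.Linear(embed_dim, 32), nn.ReLU(),
                                       nn.Linear(32, 1), nn.Sigmoid())
        self.importance_net = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(),
                                            nn.Linear(64, num_channels))
        self.num_channels = num_channels

    def forward(self, layer_embedding: torch.Tensor) -> torch.Tensor:
        width_ratio = self.width_net(layer_embedding)         # fraction of channels to keep
        importance = self.importance_net(layer_embedding)     # per-channel importance scores
        k = max(1, int(width_ratio.item() * self.num_channels))
        mask = torch.zeros(self.num_channels)
        mask[torch.topk(importance.squeeze(0), k).indices] = 1.0  # keep top-k channels
        return mask

gen = LayerArchGenerator(embed_dim=16, num_channels=64)
mask = gen(torch.randn(1, 16))
print(int(mask.sum()), "of 64 channels kept")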


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Gao, Shangqian (shg84@pitt.edu; Pitt username: shg84)
ETD Committee:
Committee Chair: Huang, Heng (heng@umd.edu)
Committee Member: Liang, Zhan (liz119@pitt.edu)
Committee Member: Wei, Gao (weigao@pitt.edu)
Committee Member: Mao, Zhi-Hong (zhm4@pitt.edu)
Committee Member: Chen, Wei (wec47@pitt.edu)
Date: 3 June 2024
Date Type: Publication
Defense Date: 29 November 2023
Approval Date: 3 June 2024
Submission Date: 9 February 2024
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 115
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Convolutional Neural Networks, Channel Pruning, Differentiable Pruning
Date Deposited: 03 Jun 2024 14:37
Last Modified: 03 Jun 2024 14:37
URI: http://d-scholarship.pitt.edu/id/eprint/45795
