New Efficient Pruning Algorithms for Compressing and Accelerating Convolutional Neural Networks

Gao, Shangqian (2024) New Efficient Pruning Algorithms for Compressing and Accelerating Convolutional Neural Networks. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

In recent years, Convolutional Neural Networks (CNNs) have achieved state-of-the-art results on numerous machine-learning tasks. Despite this impressive performance, the size of current models continues to explode. Motivated by efficient inference, many researchers have devoted effort to reducing the storage and computational costs of state-of-the-art models. Channel pruning has emerged as a promising way to reduce model size, and it achieves acceleration without any post-processing steps. However, current channel pruning methods are either time-consuming (reinforcement learning, greedy search, etc.) or rely on fixed channel-importance criteria, which leads to poor results.
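For readers unfamiliar with channel pruning, the sketch below illustrates the basic idea in PyTorch: rank the filters of a convolutional layer by a simple criterion (here, the L1 norm) and keep only the top fraction. The function name, the L1 criterion, and the keep ratio are illustrative assumptions for this example, not the criteria studied in the dissertation.

# Minimal sketch of criterion-based channel pruning (illustrative only):
# rank the filters of a conv layer by L1 norm and keep the top-k of them.
import torch
import torch.nn as nn

def prune_conv_by_l1(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a new Conv2d keeping the filters with the largest L1 norms."""
    weight = conv.weight.data                      # shape: [out_channels, in_channels, k, k]
    scores = weight.abs().sum(dim=(1, 2, 3))       # L1 norm per output channel
    k = max(1, int(keep_ratio * conv.out_channels))
    keep = torch.topk(scores, k).indices.sort().values

    pruned = nn.Conv2d(conv.in_channels, k, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = weight[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

# Example: shrink a 64-filter layer to 32 filters; downstream layers would need
# their input channels adjusted accordingly (omitted here).
layer = nn.Conv2d(3, 64, kernel_size=3, padding=1)
print(prune_conv_by_l1(layer, 0.5))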

In this dissertation, we propose new methods from the perspective of gradient-guided pruning. We formulate pruning as a constrained discrete optimization problem. Our discrete model compression work solves this constrained problem by using differentiable gates and propagating gradients through a straight-through estimator. We further improve the results in our work on network pruning via performance maximization by adding a performance prediction loss to the constrained optimization problem, so that the search for sub-networks is directly guided by the accuracy of each sub-network. This improved supervision leads to better pruning results.
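The snippet below is a minimal PyTorch sketch of the general mechanism described above: a per-channel gate whose forward pass is discrete (0/1) but whose backward pass flows through a sigmoid relaxation via a straight-through estimator. The class name, the sigmoid parameterization, and the simple resource penalty are simplifying assumptions, not the dissertation's exact formulation.

# Minimal sketch of a differentiable channel gate with a straight-through estimator
# (a simplified stand-in for the general idea, not the dissertation's exact method).
import torch
import torch.nn as nn

class STEGate(nn.Module):
    """Binary per-channel gate: hard 0/1 in the forward pass, sigmoid gradient in backward."""
    def __init__(self, num_channels: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        soft = torch.sigmoid(self.logits)              # differentiable relaxation
        hard = (soft > 0.5).float()                    # discrete 0/1 pruning decision
        gate = hard + soft - soft.detach()             # straight-through estimator
        return x * gate.view(1, -1, 1, 1)              # mask channels of a conv feature map

# A resource term (here, a simple penalty on the expected number of open gates)
# would be added to the task loss to steer the search toward a FLOPs budget.
gate = STEGate(num_channels=64)
feat = torch.randn(8, 64, 16, 16)
out = gate(feat)
resource_loss = torch.sigmoid(gate.logits).sum()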

On top of these works, we further improve our algorithms from several perspectives. The first is to disentangle width and importance when searching for the optimal model architecture. To this end, we propose an importance generation network and a width generation network that produce the channel importance scores and the width for each layer. Another challenge in previous works is the large gap between the model before and after pruning. To mitigate this gap, we first learn a target sub-network during the model training process and then use this sub-network to guide the learning of the model weights through partial regularization. Finally, building on the success of these static pruning methods, we incorporate dynamic pruning in a storage-efficient manner.
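The following sketch illustrates the width/importance disentanglement idea in PyTorch: one small network predicts what fraction of channels a layer keeps, and a separate network scores the individual channels, so the two decisions are made independently. The architectures, the layer-embedding input, and all names are hypothetical placeholders for illustration, not the generation networks used in the dissertation.

# Hedged sketch of width/importance disentanglement: a width net predicts how many
# channels to keep, an importance net ranks them; the top-"width" channels survive.
import torch
import torch.nn as nn

class LayerArchGenerator(nn.Module):
    def __init__(self, embed_dim: int, num_channels: int):
        super().__init__()
        self.width_net = nn.Sequential(nn.Linear(embed_dim, 32), nn.ReLU(),
                                       nn.Linear(32, 1), nn.Sigmoid())
        self.importance_net = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(),
                                            nn.Linear(64, num_channels))
        self.num_channels = num_channels

    def forward(self, layer_embedding: torch.Tensor) -> torch.Tensor:
        width_ratio = self.width_net(layer_embedding)         # fraction of channels to keep
        importance = self.importance_net(layer_embedding)     # per-channel importance scores
        k = max(1, int(width_ratio.item() * self.num_channels))
        mask = torch.zeros(self.num_channels)
        mask[torch.topk(importance.squeeze(0), k).indices] = 1.0  # keep top-k channels
        return mask

gen = LayerArchGenerator(embed_dim=16, num_channels=64)
mask = gen(torch.randn(1, 16))
print(int(mask.sum()), "of 64 channels kept")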


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Gao, Shangqian (shg84@pitt.edu; Pitt username: shg84)
ETD Committee:
Committee Chair: Huang, Heng (heng@umd.edu)
Committee Member: Liang, Zhan (liz119@pitt.edu)
Committee Member: Wei, Gao (weigao@pitt.edu)
Committee Member: Mao, Zhi-Hong (zhm4@pitt.edu)
Committee Member: Chen, Wei (wec47@pitt.edu)
Date: 3 June 2024
Date Type: Publication
Defense Date: 29 November 2023
Approval Date: 3 June 2024
Submission Date: 9 February 2024
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 115
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Convolutional Neural Networks, Channel Pruning, Differentiable Pruning
Date Deposited: 03 Jun 2024 14:37
Last Modified: 03 Jun 2024 14:37
URI: http://d-scholarship.pitt.edu/id/eprint/45795
