Wu, Xidong (2024) Convex and Non-convex Model Compression for Large-Scale Model Training. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract
Recently, machine learning (ML) models have been widely adopted across a variety of applications. Unlike traditional methods, however, ML models, especially deep learning models, rely on substantial depth and intricate architectures to improve their approximation capability, which strains device hardware during deployment and communication during training. This challenge is particularly pronounced when deploying ML models on edge devices, which have limited storage and modest processing capabilities. To address these issues, model compression has emerged as an approach to reduce the size of ML models with minimal performance degradation and to facilitate deployment. Several model compression techniques, such as weight pruning, knowledge distillation, and model screening, have been explored. Additionally, training these models requires substantial data, and distributed/federated training serves as a solution to data-related obstacles. The objective of this dissertation is to enhance the efficiency of convex and non-convex models, with or without multi-party collaborative training (distributed and federated learning).
We develop various approaches for compressing logistic classifiers (convex models) and deep learning models (non-convex models). In task 1, we introduce a novel distributed dynamic safe screening framework for generalized sparse convex models. Compared with traditional lasso solvers, this framework reduces the model dimension in advance by discarding inactive features whose coefficients are zero, accelerating distributed training and lowering communication overhead. In task 2, we focus on the application of foundation models in federated learning. Foundation models exhibit outstanding performance and the ability to mitigate the impact of heterogeneous data distributions, and we explore compressing them to improve performance on edge devices. In task 3, we investigate structural pruning in centralized learning. We propose a new algorithm that employs a controller network to guide end-to-end model pruning without relying on additional fine-tuning after redundant structures are removed. Comprehensive large-scale experiments in distributed and centralized settings validate the rationale and efficacy of the proposed methods. Finally, we provide theoretical analysis to guarantee the convergence of the proposed algorithms.
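As a rough illustration of the screening idea in task 1, the sketch below applies the standard single-machine Gap Safe rule for the lasso; it is not the dissertation's distributed dynamic framework, and all function names and the synthetic data are illustrative assumptions. The point is only to show why provably inactive features can be removed before further training without changing the solution.

```python
import numpy as np

def gap_safe_screen(X, y, w, lam):
    """One screening pass of the Gap Safe rule for the lasso
        min_w 0.5 * ||y - X w||^2 + lam * ||w||_1.
    Returns a boolean mask marking features whose optimal coefficients
    are provably zero, so they can be dropped before further training.
    """
    r = y - X @ w                                   # residual at the current iterate
    # Scale the residual into a dual-feasible point
    theta = r / max(lam, np.max(np.abs(X.T @ r)))
    # Duality gap between the primal and dual objectives
    primal = 0.5 * r @ r + lam * np.sum(np.abs(w))
    dual = 0.5 * y @ y - 0.5 * lam**2 * np.sum((theta - y / lam) ** 2)
    gap = max(primal - dual, 0.0)
    radius = np.sqrt(2.0 * gap) / lam               # radius of the safe sphere
    scores = np.abs(X.T @ theta) + radius * np.linalg.norm(X, axis=0)
    return scores < 1.0                             # True => feature is inactive


# Tiny synthetic example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 200))
w_true = np.zeros(200)
w_true[:5] = 1.0
y = X @ w_true + 0.01 * rng.standard_normal(50)
lam = 0.5 * np.max(np.abs(X.T @ y))                 # a fairly large regularizer

# A few proximal-gradient (ISTA) steps give a reasonable iterate to screen from.
w = np.zeros(200)
step = 1.0 / np.linalg.norm(X, 2) ** 2
for _ in range(100):
    w = w + step * (X.T @ (y - X @ w))
    w = np.sign(w) * np.maximum(np.abs(w) - step * lam, 0.0)

inactive = gap_safe_screen(X, y, w, lam)
print(f"screened out {inactive.sum()} of {inactive.size} features")
```

In the dissertation's setting, such a test would additionally be coordinated across distributed workers and refreshed dynamically as training proceeds; the sketch only covers the single-machine screening step.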
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Wu, Xidong
ETD Committee:
Date: 3 June 2024
Date Type: Publication
Defense Date: 1 April 2024
Approval Date: 3 June 2024
Submission Date: 2 April 2024
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 113
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Model Compression, Sparse Learning, Safe Screening, Distributed Training, Model Pruning, Model Distillation
Date Deposited: 03 Jun 2024 14:41
Last Modified: 03 Jun 2024 14:41
URI: http://d-scholarship.pitt.edu/id/eprint/45974