
Evaluation of Parameter-Scaling for Efficient Deep Learning on Small Satellites

Gealy, Calvin (2022) Evaluation of Parameter-Scaling for Efficient Deep Learning on Small Satellites. Master's Thesis, University of Pittsburgh. (Unpublished)

Primary Text (257kB)


Parameter-scaling techniques change the number of parameters in a machine-learning model to make the network more amenable to different device types or accuracy requirements. This research compares the performance of two such techniques. NeuralScale is a neural architecture search method that claims to generate deep neural networks for resource-constrained devices. It shrinks a network to a target number of parameters by adjusting the width of each layer independently, achieving higher accuracy than previous methods. The NeuralScale algorithm is compared against a baseline of uniform scaling in the style of the MobileNet family, where the width of every layer is scaled by the same factor across the network. Latency and runtime-memory measurements for inference were gathered on the NVIDIA Jetson TX2 and Jetson AGX Xavier embedded GPUs using NVIDIA TensorRT, and on the Raspberry Pi 4 embedded CPU, which features ARM Cortex-A72 cores, using ONNX Runtime. VGG-11, MobileNetV2, Pre-Activation ResNet-18, and ResNet-50 were each scaled to 0.25×, 0.50×, 0.75×, and 1.00× their original number of parameters. On embedded GPUs, this research finds that NeuralScale models do offer higher accuracy, but they run slower and consume much more runtime memory during inference than their uniformly scaled counterparts. On average, NeuralScale is 40% as efficient as uniform scaling in terms of accuracy per megabyte of runtime memory and uses 2.7× the runtime memory per parameter. On the embedded CPU, NeuralScale is slightly more efficient than uniform scaling in accuracy per megabyte of memory, using essentially the same memory per parameter, but its inference latency is on average more than 2.5× higher.
Importantly, parameter count alone does not predict runtime-memory usage across scaling methods on embedded GPUs, while latency grows significantly on embedded CPUs.
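The uniform-scaling baseline described above can be sketched in a few lines. This is an illustrative example only, not code from the thesis: the layer widths and the helper name `uniform_scale` are assumptions, and it relies on the rough rule that conv/linear parameter counts grow with the product of input and output widths, so scaling every width by α changes the parameter count by about α².

```python
import math

def uniform_scale(widths, param_ratio):
    """Scale every layer width by the same factor.

    Parameter counts of conv/linear layers grow roughly with the
    product of adjacent layer widths, so scaling each width by alpha
    scales parameters by about alpha**2. To hit a target parameter
    ratio r, use alpha = sqrt(r).
    """
    alpha = math.sqrt(param_ratio)
    # Round to the nearest whole channel, keeping at least one.
    return [max(1, round(w * alpha)) for w in widths]

# VGG-11-style channel widths (illustrative, not the thesis configuration)
widths = [64, 128, 256, 256, 512, 512, 512, 512]
print(uniform_scale(widths, 0.25))
# → [32, 64, 128, 128, 256, 256, 256, 256]  (each width halved)
```

NeuralScale, by contrast, picks a separate width multiplier per layer via architecture search, which is what drives both its accuracy gains and its higher runtime-memory footprint.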




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Gealy, Calvin (Email: c.gealy@pitt.edu, Pitt Username: cag158, ORCID: 0000-0003-1173-0378)
ETD Committee:
Thesis Advisor: George,
Committee Member: Abdelhakim, Mai (Email: maia@pitt.edu, Pitt Username: MAIA, ORCID: 0000-0001-8442-0974)
Committee Member: Hu, Jingtong (Email: JTHU@pitt.edu, Pitt Username: jthu, ORCID: 0000-0003-4029-4034)
Date: 10 June 2022
Date Type: Publication
Defense Date: 18 March 2022
Approval Date: 10 June 2022
Submission Date: 3 March 2022
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 47
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: machine learning, computer vision, embedded computing
Date Deposited: 10 Jun 2022 18:14
Last Modified: 10 Jun 2024 05:15

