
Enabling Deep Neural Networks with Oversized Working Memory on Resource-Constrained MCUs

Wang, Zhepeng (2021) Enabling Deep Neural Networks with Oversized Working Memory on Resource-Constrained MCUs. Master's Thesis, University of Pittsburgh. (Unpublished)



Deep neural networks (DNNs) excel at extracting features and making predictions from noisy input data, which has made them among the most widely used models in machine learning applications. Meanwhile, microcontroller units (MCUs) have become the most common processors in everyday devices, so integrating DNNs into MCUs could have a substantial real-world impact. Despite its importance, the deployment of DNNs onto MCUs has received little attention. DNNs are resource-intensive while MCUs are resource-constrained, which often makes it infeasible to run DNNs directly on MCUs. Beyond low clock frequency (1-16 MHz) and limited storage (e.g., 64KB to 256KB ROM), one of the biggest challenges is the small RAM (e.g., 2KB to 16KB), which must hold a DNN's intermediate feature maps at runtime. Most existing DNN compression algorithms aim to reduce model size so that the model fits into limited storage. However, they do not significantly reduce the size of the intermediate feature maps, referred to as working memory, which may exceed the RAM capacity; a DNN may therefore still fail to run on an MCU even after compression. To address this problem, this work proposes a technique that dynamically prunes the activation values of the output feature maps at runtime when necessary, so that the intermediate feature maps fit into the limited RAM. Experimental results on SVHN and CIFAR-10 show that the proposed algorithm significantly reduces the working memory of a DNN to satisfy the hard RAM-size constraint while maintaining satisfactory accuracy with relatively low overhead in memory and runtime latency.
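The abstract describes pruning activations of an output feature map at runtime so that the layer's working memory fits a hard RAM budget. The thesis itself is not reproduced here, so the following is only a minimal sketch of one plausible form of that idea: keep the largest-magnitude activations (stored sparsely as index/value pairs) up to the RAM budget and zero the rest. The function name, the sparse-storage cost model (a 2-byte index per retained value), and all parameters are assumptions for illustration, not the thesis's actual method.

```python
import numpy as np

def prune_activations(feature_map, ram_budget_bytes, bytes_per_value=1):
    """Sketch of runtime activation pruning under a RAM budget.

    Retains only the largest-magnitude activations so that a sparse
    (index, value) representation of the feature map fits within
    ram_budget_bytes; all other activations are zeroed.
    """
    flat = feature_map.flatten()
    # Assumed sparse-storage cost: each retained activation needs its
    # value plus a 2-byte index into the flattened feature map.
    cost_per_entry = bytes_per_value + 2
    max_keep = ram_budget_bytes // cost_per_entry
    if flat.size <= max_keep:
        return feature_map  # already fits; no pruning needed
    # Zero out everything except the max_keep largest-magnitude values.
    drop_indices = np.argsort(np.abs(flat))[:-max_keep]
    pruned = flat.copy()
    pruned[drop_indices] = 0.0
    return pruned.reshape(feature_map.shape)
```

Under this sketch, pruning only fires when a layer's output would exceed the budget, which matches the abstract's "if necessary" qualifier; layers whose feature maps already fit are passed through untouched.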




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Wang, Zhepeng (zhw82@pitt.edu, Pitt username: zhw82)
ETD Committee:
Committee Chair: Hu,
Committee Member: Mao,
Committee Member: Dickerson,
Date: 3 September 2021
Date Type: Publication
Defense Date: 12 July 2021
Approval Date: 3 September 2021
Submission Date: 9 July 2021
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 38
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: Neural Network Deployment, Neural Network Compression, Embedded System, Artificial Intelligence of Things (AIoT), On-Device Artificial Intelligence (AI)
Date Deposited: 03 Sep 2021 15:55
Last Modified: 03 Sep 2021 15:55
