Huang, Kai (2024) Bringing Agile and Self-Evolvable Intelligence to Weak Embedded Devices. Doctoral Dissertation, University of Pittsburgh. (Unpublished)
Abstract
Neural Networks (NNs) can significantly enhance perception and decision-making in resource-constrained devices like drones and wearables. However, limited resources such as memory and computing power hinder modern NN designs, leading to inaccurate predictions and delayed execution.
To first ensure dependable inference, we propose modular NN structures that mimic expert decision-making. We study the effectiveness of this methodology in wireless backscatter systems under noisy channel conditions, where a modular NN is tailored for predictive power adaptation. Despite these advances in NN structure, microcontroller-equipped devices still face performance barriers under extreme constraints, such as limited memory (<1 MB) and low clock frequency (<300 MHz). To use these limited resources more efficiently, we propose agile offloading, which uses the patterns of feature importance identified by explainable AI to improve offloading efficiency. Because the world is non-stationary, NN models should also be promptly retrained with new data so that they continuously adapt to environmental dynamics and maintain their accuracy. To achieve this adaptivity, we propose a selective training scheme in which NN substructures can be freely added or skipped at runtime based on their importance and the user's desired computational cost. We showcase the effectiveness of our scheme on both vision models and Large Language Models (LLMs). Beyond retraining a fixed structure, we further envision that the NN structure should be expandable at runtime to accept more data modalities captured by the device. Such self-evolvability can improve the NN's generative and reasoning capabilities in more complex tasks like autonomous navigation and human-device interaction. However, as more data modalities are incorporated, continuously enlarged models encounter scalability challenges. To mitigate training costs, we propose connecting unimodal encoders to a flexible set of the final LLM blocks and training only these latent connections at runtime. We showcase its improved accuracy-compute efficiency in multimodal question-answering tasks for autonomous driving scenarios.
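As a rough illustration of the selective training idea described in the abstract, the Python sketch below chooses which substructures to train under a compute budget. It is not taken from the dissertation: the `Block` class, the `select_blocks` function, the importance scores, the cost values, and the greedy importance-per-cost rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    importance: float  # hypothetical importance score (higher = more worth training)
    cost: float        # hypothetical training cost in arbitrary compute units (> 0)


def select_blocks(blocks, budget):
    """Greedily pick blocks by importance per unit cost until the budget runs out.

    This is an assumed selection rule for illustration, not the author's method.
    """
    chosen = []
    remaining = budget
    # Consider the most "efficient" blocks first.
    for b in sorted(blocks, key=lambda b: b.importance / b.cost, reverse=True):
        if b.cost <= remaining:
            chosen.append(b.name)
            remaining -= b.cost
    return chosen


# Made-up example values.
blocks = [
    Block("layer1", importance=0.2, cost=4.0),
    Block("layer2", importance=0.9, cost=3.0),
    Block("layer3", importance=0.5, cost=1.0),
    Block("layer4", importance=0.1, cost=2.0),
]
selected = select_blocks(blocks, budget=5.0)
print(selected)
```

In this sketch, blocks that are not selected would simply be skipped (left frozen) during a training step, and raising or lowering `budget` changes how many blocks are trained.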
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Huang, Kai
Date: 3 June 2024
Date Type: Publication
Defense Date: 29 March 2024
Approval Date: 3 June 2024
Submission Date: 8 March 2024
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 191
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Efficient AI, On-Device AI
Date Deposited: 03 Jun 2024 14:36
Last Modified: 03 Jun 2024 14:36
URI: http://d-scholarship.pitt.edu/id/eprint/45838