
Benchmarking Transformer-Based Transcription on Embedded GPUs for Space Applications

Schubert, Marika E. (2021) Benchmarking Transformer-Based Transcription on Embedded GPUs for Space Applications. Master's Thesis, University of Pittsburgh. (Unpublished)

Restricted to University of Pittsburgh users only until 13 June 2023.



Speech transcription is a necessary tool for the backend applications commonly found in voice assistants. Transcription is typically performed on cloud-based servers or custom hardware, but those resources are not always suited to space environments because of size, weight, power, and cost constraints. It is therefore important to determine the performance of, and the optimal conditions for, running transcription on hardware that is feasible to deploy in a space application. This research investigates and evaluates the performance of an optimized version of the wav2vec2 speech transcription engine, the current state-of-the-art model for this domain. The target hardware, the NVIDIA Jetson Xavier NX embedded GPU, was chosen for its modern GPU architecture and small form factor. In addition to examining the input scaling behavior, we evaluate the hyperparameters of the clustered attention optimization and the average power and energy per inference relative to the operating power mode of the device. The clustered attention model outperformed the improved-clustered model for large input sizes, but the original wav2vec2 performed better for small input sizes. The clustered model's energy per inference (13.90 J) was lower than that of the improved-clustered model (15.03 J) and the vanilla model (15.85 J). All models meet the real-time speech processing requirements necessary to perform onboard inference entirely on a space system.
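
To put the reported energy figures in perspective, the short Python sketch below computes the relative energy saving of each model against the vanilla wav2vec2 baseline, using only the three values quoted in the abstract. The function names are hypothetical, and the `energy_joules` helper assumes energy per inference is average power multiplied by inference time; the abstract does not state how the thesis measured energy, so treat that relation as an illustration rather than the author's method.

```python
# Energy per inference in joules, as reported in the abstract.
ENERGY_J = {
    "clustered": 13.90,
    "improved-clustered": 15.03,
    "vanilla": 15.85,
}


def relative_saving(model: str, baseline: str = "vanilla") -> float:
    """Fractional energy reduction of `model` relative to `baseline`."""
    return (ENERGY_J[baseline] - ENERGY_J[model]) / ENERGY_J[baseline]


def energy_joules(avg_power_w: float, latency_s: float) -> float:
    """Energy per inference, ASSUMING energy = average power x inference time.

    This relation is an assumption made for illustration; the abstract does
    not describe the measurement procedure.
    """
    return avg_power_w * latency_s


if __name__ == "__main__":
    for name in ENERGY_J:
        print(f"{name}: {relative_saving(name):.1%} less energy than vanilla")
```

By these figures, the clustered model uses roughly 12.3% less energy per inference than the vanilla model, and the improved-clustered model roughly 5.2% less.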




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Schubert, Marika E. (Email: marika.schubert@pitt.edu; Pitt Username: mes389)
ETD Committee:
Thesis Advisor: George, Alan
Committee Member: El-Jaroudi,
Committee Member: Hu,
Date: 13 June 2021
Date Type: Publication
Defense Date: 31 March 2021
Approval Date: 13 June 2021
Submission Date: 18 March 2021
Access Restriction: 2 year -- Restrict access to University of Pittsburgh for a period of 2 years.
Number of Pages: 37
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical and Computer Engineering
Degree: MS - Master of Science
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: Automatic speech recognition, benchmarking, GPU, machine learning, optimization, parallel processing
Date Deposited: 13 Jun 2021 18:38
Last Modified: 13 Jun 2021 18:38

