Benchmarking Transformer-Based Transcription on Embedded GPUs for Space Applications

Schubert, Marika E. (2021) Benchmarking Transformer-Based Transcription on Embedded GPUs for Space Applications. Master's Thesis, University of Pittsburgh. (Unpublished)

Abstract

Speech transcription is a necessary tool for backend applications commonly found in voice assistants. Transcription is typically performed using cloud-based servers or custom hardware, but those resources are not always suitable for space environments because of size, weight, power, and cost constraints. It is therefore important to determine how well transcription performs, and under which conditions it runs best, on hardware that is feasible to deploy in a space application. This research evaluates the performance of an optimized version of the wav2vec2 speech transcription engine, the current state-of-the-art model for this domain. The target hardware, the NVIDIA Jetson Xavier NX embedded GPU, was chosen for its modern GPU architecture and small form factor. In addition to examining input scaling behavior, we evaluate the hyperparameters of the clustered attention optimization, as well as the average power and energy per inference relative to the operating power mode of the device. The clustered attention model outperformed the improved-clustered model for large input sizes, but the original wav2vec2 performed better for small input sizes. The energy per inference of the clustered model (13.90 J) was lower than that of the improved-clustered model (15.03 J) and the vanilla model (15.85 J). All models meet the real-time speech processing requirements necessary to perform onboard inference entirely on a space system.
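To put the reported energy figures in proportion, the short sketch below computes the relative energy saving of the clustered model against the other two models. It uses only the three per-inference values quoted in the abstract; the dictionary and function names are illustrative assumptions and do not come from the thesis.

```python
# Energy per inference in joules, as quoted in the abstract.
# The keys are illustrative labels, not identifiers from the thesis.
ENERGY_J = {
    "clustered": 13.90,
    "improved-clustered": 15.03,
    "vanilla": 15.85,
}


def relative_saving(baseline_j: float, candidate_j: float) -> float:
    """Return the fraction of energy saved by the candidate versus the baseline."""
    return (baseline_j - candidate_j) / baseline_j


def savings_percent(candidate: str = "clustered") -> dict:
    """Percentage saving of `candidate` against every other model, rounded to 0.1."""
    candidate_j = ENERGY_J[candidate]
    return {
        name: round(100 * relative_saving(joules, candidate_j), 1)
        for name, joules in ENERGY_J.items()
        if name != candidate
    }


if __name__ == "__main__":
    for name, pct in savings_percent().items():
        print(f"clustered uses {pct}% less energy per inference than {name}")
```

Under these figures the clustered model saves roughly 12% per inference relative to the vanilla model and roughly 7.5% relative to the improved-clustered model.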

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Schubert, Marika E.	marika.schubert@pitt.edu	mes389

ETD Committee:

Title	Member	Email Address
Thesis Advisor	George, Alan D.	alan.george@pitt.edu
Committee Member	El-Jaroudi, Amro	amro@pitt.edu
Committee Member	Hu, Jingtong	jthu@pitt.edu

Date:

13 June 2021

Date Type:

Publication

Defense Date:

31 March 2021

Approval Date:

13 June 2021

Submission Date:

18 March 2021

Access Restriction:

2 year -- Restrict access to University of Pittsburgh for a period of 2 years.

Number of Pages:

Institution:

University of Pittsburgh

Schools and Programs:

Swanson School of Engineering > Electrical and Computer Engineering

Degree:

MS - Master of Science

Thesis Type:

Master's Thesis

Refereed:

Yes

Uncontrolled Keywords:

Automatic speech recognition, benchmarking, GPU, machine learning, optimization, parallel processing

Date Deposited:

13 Jun 2021 18:38

Last Modified:

13 Jun 2023 05:15

URI:

http://d-scholarship.pitt.edu/id/eprint/40387
