A Vectorized Processing Algorithm for Continuous Speech Recognition and Associated FPGA-Based Architecture

Schuster, Jeffrey William (2006) A Vectorized Processing Algorithm for Continuous Speech Recognition and Associated FPGA-Based Architecture. Master's Thesis, University of Pittsburgh. (Unpublished)

Abstract

This work analyzes Continuous Automatic Speech Recognition (CSR) and, in contrast to prior work, shows that the CSR algorithms can be specified in a highly parallel form. Through use of the MATLAB software package, the parallelism is exploited to create a compact, vectorized algorithm that is able to execute the CSR task. After an in-depth analysis of the SPHINX 3 Large Vocabulary Continuous Speech Recognition (LVCSR) engine, the major functional units were redesigned in the MATLAB environment, with special effort taken to flatten the algorithms and restructure the data to allow for matrix-based computations. This conversion reduced the original 14,000 lines of C++ code to fewer than 200 lines of highly vectorized operations, substantially increasing the potential Instruction-Level Parallelism of the system. Using this vector model as a baseline, a custom hardware system was then created that is capable of performing the speech recognition task in real time on a Xilinx Virtex-4 FPGA device. Through the creation of independent hardware engines for each stage of the speech recognition process, the throughput of each is maximized by customizing the logic to the specific task. Further, a unique architecture was designed that allows for the creation of a static data path throughout the hardware, effectively removing the need for complex bus arbitration in the system. By making use of shared memory resources and applying a token-passing scheme to the system, both the data movement within the design and the amount of active data are continually minimized at run time. These results provide a novel method for performing speech recognition in both hardware and software, helping to further the development of systems capable of recognizing human speech.

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Schuster, Jeffrey William	jws52@pitt.edu	JWS52

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	Hoare, Raymond R	hoare@engr.pitt.edu
Committee Member	Jones, Alex K	akj8@pitt.edu	AKJ8
Committee Member	Levitan, Steven P	steve@ee.pitt.edu	LEVITAN

Date:

27 September 2006

Date Type:

Completion

Defense Date:

29 April 2006

Approval Date:

27 September 2006

Submission Date:

10 April 2006

Access Restriction:

No restriction; Release the ETD for access worldwide immediately.

Institution:

University of Pittsburgh

Schools and Programs:

Swanson School of Engineering > Electrical Engineering

Degree:

MSEE - Master of Science in Electrical Engineering

Thesis Type:

Master's Thesis

Refereed:

Yes

Uncontrolled Keywords:

acoustic modeling; Gaussian distributions; hidden Markov models; MATLAB; phoneme evaluation; speech recognition

Other ID:

http://etd.library.pitt.edu/ETD/available/etd-04102006-155357/, etd-04102006-155357

Date Deposited:

10 Nov 2011 19:35

Last Modified:

15 Nov 2016 13:39

URI:

http://d-scholarship.pitt.edu/id/eprint/6949
