DESIGN & IMPLEMENTATION OF A REAL-TIME, SPEAKER-INDEPENDENT, CONTINUOUS SPEECH RECOGNITION SYSTEM WITH VLIW DIGITAL SIGNAL PROCESSOR ARCHITECTURE

Ng, Wai-Ting (2007) DESIGN & IMPLEMENTATION OF A REAL-TIME, SPEAKER-INDEPENDENT, CONTINUOUS SPEECH RECOGNITION SYSTEM WITH VLIW DIGITAL SIGNAL PROCESSOR ARCHITECTURE. Master's Thesis, University of Pittsburgh. (Unpublished)

Abstract

This thesis explores the feasibility of mapping a real-time, continuous speech recognition system onto a multi-core Digital Signal Processor architecture. While a pure hardware solution is capable of implementing the entire recognition process in real-time, the design process can be lengthy and inflexible to changes. However, a low-end embedded processor such as ARM7 is insufficient to execute in real-time. As a result, a more flexible and powerful DSP solution with Texas Instruments' C6713 multi-core DSP is used to exploit the instruction level parallelism within the speech recognition process. By exploiting the parallelism using 7 optimization techniques, the performance of the recognition process can be real-time on a 300 MHz DSP for a 1000 word vocabulary. At its core, continuous speech recognition is essentially a matching problem. The recognition process can be divided into four major phases: Feature Extraction, Acoustic Modeling, Phone Modeling and Word Modeling. Each phase is analyzed in detail to identify performance issues. In short, the major issues are its massive computations and large memory bandwidth. After applying various optimizations, the overall computational performance has improved from about 15 times slower than real-time to 1.6 times faster than real-time with the hardware. Through utilization of Direct Memory Access and larger cache memory, the memory bandwidth problem can be solved. The conclusion is that a multi-core DSP running at 300 MHz would be sufficient to implement a 1000 word Command & Control type application using the optimization techniques described in this thesis.


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Ng, Wai-Ting (Email: ng.johnny@gmail.com)
ETD Committee:
Committee Chair: Hoare, Raymond R (Email: hoare@engr.pitt.edu)
Committee Member: Jones, Alex K (Email: akjones@engr.pitt.edu; Pitt Username: AKJONES)
Committee Member: Levitan, Steven P (Email: steve@ee.pitt.edu; Pitt Username: LEVITAN)
Date: 13 June 2007
Date Type: Completion
Defense Date: 21 July 2006
Approval Date: 13 June 2007
Submission Date: 24 July 2006
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical Engineering
Degree: MSEE - Master of Science in Electrical Engineering
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: Continuous Speech Recognition; Digital Signal Processing; DSP; VLIW
Other ID: http://etd.library.pitt.edu/ETD/available/etd-07242006-120752/, etd-07242006-120752
Date Deposited: 10 Nov 2011 19:53
Last Modified: 15 Nov 2016 13:46
URI: http://d-scholarship.pitt.edu/id/eprint/8560
