
Design & Implementation of a Co-processor for Embedded, Real-Time, Speaker-Independent, Continuous Speech Recognition System-on-a-Chip

Gupta, Kshitij (2006) Design & Implementation of a Co-processor for Embedded, Real-Time, Speaker-Independent, Continuous Speech Recognition System-on-a-Chip. Master's Thesis, University of Pittsburgh. (Unpublished)



This thesis aims to dispel the myth that multi-GHz machines are required for real-time, speaker-independent, continuous speech recognition based on full models and full-precision computations. Through the design of a custom hardware architecture, this research shows that 100 MHz is sufficient to process a 1,000-word dictionary in real time. The design and implementation of the architecture are discussed in this thesis. It is shown that this implementation requires limited hardware resources and can therefore be incorporated as a dedicated speech recognition co-processor.

The system comprises three major blocks corresponding to Acoustic, Phonetic and Word Modeling. For maximum performance, each block has been implemented in a highly pipelined manner, enabling several quantities to be computed simultaneously. Further, fewer computations imply lower power consumption. To achieve this, every stage of the computation has been optimized by incorporating feedback, so that only the data active at a given time instant is processed. To ensure a scalable implementation, a dynamic memory allocation scheme has also been incorporated to help manage the internal memory.

Among the three blocks, Acoustic Modeling contributes between 55% and 95% of the overall computations performed by the system. Special attention was therefore paid to the computations in Acoustic Modeling, and a new computation reduction technique, bestN, is proposed. This technique addresses both the bandwidth requirement and the complexity of the computations. It is shown that, for little loss in relative accuracy, only 8-bit integer micro-addition operations are required, whereas traditional systems need numerous 32-bit multiply and add operations. The technique also requires 1/8th the bandwidth of traditional methods, so for the same bus bandwidth an 8x speedup in performance can be achieved.
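
The two headline figures in the abstract can be put in perspective with some back-of-the-envelope arithmetic. The sketch below is illustrative only and is not taken from the thesis: the 10 ms frame period is an assumption (the abstract does not state a frame rate), the function names are hypothetical, and the speedup calculation assumes the workload is limited purely by bus bandwidth.

```python
def cycles_per_frame(clock_hz: float, frame_period_s: float) -> int:
    """Clock cycles available to process one speech frame in real time."""
    return round(clock_hz * frame_period_s)


def bandwidth_bound_speedup(bandwidth_fraction: float) -> float:
    """Speedup at a fixed bus bandwidth when each unit of work needs only
    `bandwidth_fraction` of the original data traffic.

    Assumes the workload is entirely bandwidth-bound.
    """
    if not 0 < bandwidth_fraction <= 1:
        raise ValueError("bandwidth_fraction must be in (0, 1]")
    return 1.0 / bandwidth_fraction


CLOCK_HZ = 100e6          # 100 MHz, as stated in the abstract
FRAME_PERIOD_S = 0.010    # ASSUMPTION: 10 ms per frame, not stated in the abstract

budget = cycles_per_frame(CLOCK_HZ, FRAME_PERIOD_S)
speedup = bandwidth_bound_speedup(1 / 8)   # 1/8th the bandwidth, per the abstract

print(f"Cycle budget per frame: {budget}")
print(f"Speedup at equal bus bandwidth: {speedup:.0f}x")
```

Under the assumed frame period, real-time operation means all three modeling blocks must finish within the per-frame cycle budget; a different frame period scales that budget proportionally.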




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Gupta, Kshitij
ETD Committee:
Committee Chair: Hoare, Raymond
Committee Member: Jones, Alex K (email: akjones@ece.pitt.edu; Pitt username: AKJONES)
Committee Member: Levitan, Steven P (email: steve@ee.pitt.edu; Pitt username: LEVITAN)
Date: 27 September 2006
Date Type: Completion
Defense Date: 2 December 2005
Approval Date: 27 September 2006
Submission Date: 30 November 2005
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Electrical Engineering
Degree: MSEE - Master of Science in Electrical Engineering
Thesis Type: Master's Thesis
Refereed: Yes
Uncontrolled Keywords: Custom Hardware Architecture; System-on-Chip (SoC); Embedded Systems; Co-processor; Speech Recognition; FPGA
Other ID: etd-11302005-113007
Date Deposited: 10 Nov 2011 20:06
Last Modified: 15 Nov 2016 13:52

