Speech Decomposition and Enhancement

Yoo, Sungyub (2005) Speech Decomposition and Enhancement. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

The goal of this study is to investigate the roles of steady-state speech sounds and the transitions between them in the intelligibility of speech. The motivation for this approach is that the auditory system may be particularly sensitive to time-varying frequency edges, which in speech are produced primarily by transitions between vowels and consonants and within vowels. The possibility that selectively amplifying these edges may enhance speech intelligibility is examined.

Computer algorithms were developed to decompose speech into two components. One, defined as the tonal component, was intended to include predominantly formant activity. The other, defined as the non-tonal component, was intended to include predominantly transitions between and within formants. The decomposition uses a set of time-varying filters whose center frequencies and bandwidths are controlled to identify the strongest formant components in speech. Each center frequency and bandwidth is estimated from the FM and AM information of the corresponding formant component. The tonal component is the sum of the filter outputs. The non-tonal component is defined as the difference between the original speech signal and the tonal component.

The relative energy and intelligibility of the tonal and non-tonal components were compared with those of the original speech. Psychoacoustic growth functions were used to assess intelligibility. Most of the speech energy was in the tonal component, but this component had significantly lower maximum word recognition than either the original speech or the non-tonal component. The non-tonal component averaged 2% of the original speech energy, yet its maximum word recognition was almost equal to that of the original speech.

The non-tonal component was amplified and recombined with the original speech to generate enhanced speech. The energy of the enhanced speech was adjusted to equal that of the original speech, and the intelligibility of the two was compared in background noise. The enhanced speech showed significantly higher recognition scores at lower SNRs, while at higher SNRs the original and enhanced speech showed similar recognition scores. These results suggest that amplifying transient information can enhance speech in noise and that this enhancement method is more effective in severe noise conditions.
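The recombination step described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's code: it assumes a tonal component has already been obtained by some other means (the time-varying filter bank is not reproduced here), and the function names, the default gain of 4.0, and the toy signals are hypothetical choices made only for illustration.

```python
import math


def energy(signal):
    """Total energy of a signal: the sum of its squared samples."""
    return sum(s * s for s in signal)


def non_tonal(original, tonal):
    """Non-tonal component: original minus tonal, sample by sample."""
    if len(original) != len(tonal):
        raise ValueError("signals must have the same length")
    return [o - t for o, t in zip(original, tonal)]


def enhance(original, tonal, gain=4.0):
    """Add the amplified non-tonal component back to the original, then
    rescale so the result has the same energy as the original.

    The gain value is a hypothetical placeholder, not a value taken
    from the dissertation.
    """
    residual = non_tonal(original, tonal)
    mixed = [o + gain * r for o, r in zip(original, residual)]
    mixed_energy = energy(mixed)
    if mixed_energy == 0.0:
        return mixed
    scale = math.sqrt(energy(original) / mixed_energy)
    return [scale * m for m in mixed]


def scale_noise_to_snr(speech, noise, snr_db):
    """Scale noise so that 10*log10(E_speech / E_noise) equals snr_db."""
    noise_energy = energy(noise)
    if noise_energy == 0.0:
        raise ValueError("noise must have non-zero energy")
    target_energy = energy(speech) / (10.0 ** (snr_db / 10.0))
    k = math.sqrt(target_energy / noise_energy)
    return [k * n for n in noise]


# Toy signals: a steady oscillation plus one transient sample at index 4.
original = [0.0, 1.0, 0.0, -1.0, 0.5, 1.0, 0.0, -1.0]
tonal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]

enhanced = enhance(original, tonal, gain=4.0)
noise = scale_noise_to_snr(enhanced, [1.0, -1.0] * 4, snr_db=-5.0)
noisy_enhanced = [e + n for e, n in zip(enhanced, noise)]
```

Because the output is rescaled to the original energy, boosting the non-tonal component necessarily lowers the relative level of the steady-state samples. This matches the abstract's condition that the original and enhanced speech are compared at equal energy.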

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Yoo, Sungyub	sungyoo@pitt.edu	SUNGYOO

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	Boston, J. Robert	boston@engr.pitt.edu	BBN
Committee Member	El-Jaroudi, Amro A.	amro@ee.pitt.edu	AMRO
Committee Member	Li, Ching-Chung	ccl@engr.pitt.edu	CCL
Committee Member	Lee, Heung-no	hnlee@ee.pitt.edu
Committee CoChair	Durrant, John D.	durrant@csd.pitt.edu	DURRANT

Date:

14 October 2005

Date Type:

Completion

Defense Date:

29 June 2005

Approval Date:

14 October 2005

Submission Date:

1 July 2005

Access Restriction:

No restriction; Release the ETD for access worldwide immediately.

Institution:

University of Pittsburgh

Schools and Programs:

Swanson School of Engineering > Electrical Engineering

Degree:

PhD - Doctor of Philosophy

Thesis Type:

Doctoral Dissertation

Refereed:

Yes

Uncontrolled Keywords:

formant; non-tonal; speech; speech decomposition; speech enhancement; tonal; transition

Other ID:

http://etd.library.pitt.edu/ETD/available/etd-07012005-135056/, etd-07012005-135056

Date Deposited:

10 Nov 2011 19:49

Last Modified:

15 Nov 2016 13:45

URI:

http://d-scholarship.pitt.edu/id/eprint/8246
