Mohit, Behrang (2010) LOCATING AND REDUCING TRANSLATION DIFFICULTY. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



The challenge of translation varies from one sentence to another, or even between phrases of a sentence. We investigate whether variations in difficulty can be located automatically for Statistical Machine Translation (SMT). Furthermore, we hypothesize that customizing an SMT system based on difficulty information improves translation quality.

We assume a binary categorization of phrases: easy vs. difficult. Our focus is on the Difficult-to-Translate Phrases (DTPs). Our experiments show that, for a given sentence, improving the translation of the DTP also improves the translation of the surrounding non-difficult phrases. To locate the most difficult phrase of each sentence, we use machine learning to construct a difficulty classifier. To improve the translation of DTPs, we introduce customization methods for three components of the SMT system: I. language model; II. translation model; III. decoding weights. With each method, we construct a new component that is dedicated to the translation of difficult phrases. Our experiments on Arabic-to-English translation show that DTP-specific system customization is mostly successful.

Overall, we demonstrate that translation difficulty is an important source of information for machine translation and can be used to enhance its performance.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators | Email | Pitt Username | ORCID
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Hwa, Rebecca | hwa@cs.pitt.edu | REH23 |
Committee Member | Lavie, | | |
Committee Member | He, Daqing | daqing@mail.sis.pitt.edu | DAH44 |
Committee Member | Wiebe, Janyce | wiebe@cs.pitt.edu | JMW106 |
Date: 30 September 2010
Date Type: Completion
Defense Date: 4 December 2009
Approval Date: 30 September 2010
Submission Date: 26 July 2010
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Intelligent Systems
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: System Customization; Translation Difficulty; Machine Translation; Model Adaptation
Other ID: etd-07262010-165608
Date Deposited: 10 Nov 2011 19:54
Last Modified: 15 Nov 2016 13:47

