Mohit, Behrang
(2010)
LOCATING AND REDUCING TRANSLATION DIFFICULTY.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
The challenge of translation varies from one sentence to another, or even between phrases of a sentence. We investigate whether variations in difficulty can be located automatically for Statistical Machine Translation (SMT). Furthermore, we hypothesize that customizing an SMT system based on difficulty information improves translation quality. We assume a binary categorization for phrases: easy vs. difficult. Our focus is on the Difficult to Translate Phrases (DTPs). Our experiments show that, for a sentence, improving the translation of the DTP also improves the translation of the surrounding non-difficult phrases. To locate the most difficult phrase of each sentence, we use machine learning and construct a difficulty classifier. To improve the translation of DTPs, we introduce customization methods for three components of the SMT system: (I) the language model; (II) the translation model; (III) the decoding weights. With each method, we construct a new component that is dedicated to the translation of difficult phrases. Our experiments on Arabic-to-English translation show that DTP-specific system customization is mostly successful. Overall, we demonstrate that translation difficulty is an important source of information for machine translation and can be used to enhance its performance.
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Mohit, Behrang
Date: 30 September 2010
Date Type: Completion
Defense Date: 4 December 2009
Approval Date: 30 September 2010
Submission Date: 26 July 2010
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Intelligent Systems
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: System Customization; Translation Difficulty; Machine Translation; Model Adaptation
Other ID: http://etd.library.pitt.edu/ETD/available/etd-07262010-165608/, etd-07262010-165608
Date Deposited: 10 Nov 2011 19:54
Last Modified: 15 Nov 2016 13:47
URI: http://d-scholarship.pitt.edu/id/eprint/8629