LIU, CHANGSHENG
(2019)
Toward Robust and Efficient Interpretations of Idiomatic Expressions in Context.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
Studies show that a large number of idioms can be interpreted figuratively or literally depending on their contexts. This usage ambiguity has negative impacts on many natural language processing (NLP) applications. In this thesis, we investigate methods of building robust and efficient usage recognizers by modeling interactions between contexts and idioms.
We aim to address three problems. First, how do differences in idioms’ linguistic properties affect the performance of automatic usage recognizers? We analyze the interactions between context representations and linguistic properties of idioms and develop ensemble models that predict usages adaptively for different idioms. Second, can an automatic usage recognizer be developed without annotated training examples? We develop a method for estimating the semantic distance between the context and the components of an idiom, and then use that estimate as distant supervision to guide further unsupervised clustering of usages. Third, how can we build one generalized model that reliably predicts the correct usage for a wide range of idioms, despite variations in their linguistic properties? We recast this as a problem of modeling semantic compatibility between the literal interpretation of an arbitrary idiom and its context. We show that a general model of semantic compatibility can be trained from a large unannotated corpus, and that the resulting model can be applied to an arbitrary idiom without idiom-specific parameter tuning.
To demonstrate that our work can benefit downstream NLP applications, we perform a case study on machine translation. The case study shows that our model can help improve the translation quality of sentences containing idioms.
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Date: 20 June 2019
Date Type: Publication
Defense Date: 27 March 2019
Approval Date: 20 June 2019
Submission Date: 11 April 2019
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 109
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Computer Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Idiomatic Expressions, Figurative language
Date Deposited: 20 Jun 2019 16:13
Last Modified: 20 Jun 2019 16:13
URI: http://d-scholarship.pitt.edu/id/eprint/36404