Toward Robust and Efficient Interpretations of Idiomatic Expressions in Context

LIU, CHANGSHENG (2019) Toward Robust and Efficient Interpretations of Idiomatic Expressions in Context. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



Studies show that a large number of idioms can be interpreted figuratively or literally depending on their contexts. This usage ambiguity has negative impacts on many natural language processing (NLP) applications. In this thesis, we investigate methods of building robust and efficient usage recognizers by modeling interactions between contexts and idioms.

We aim to address three problems. First, how do differences in idioms’ linguistic properties affect the performance of automatic usage recognizers? We analyze the interactions between context representations and linguistic properties of idioms and develop ensemble models that predict usages adaptively for different idioms. Second, can an automatic usage recognizer be developed without annotated training examples? We develop a method for estimating the semantic distance between the context and the components of an idiom, and then use that estimate as distant supervision to guide further unsupervised clustering of usages. Third, how can we build one generalized model that reliably predicts the correct usage for a wide range of idioms, despite variations in their linguistic properties? We recast this as a problem of modeling semantic compatibility between the literal interpretation of an arbitrary idiom and its context. We show that a general model of semantic compatibility can be trained from a large unannotated corpus, and that the resulting model can be applied to an arbitrary idiom without idiom-specific parameter tuning.
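The second and third questions both rest on scoring how well the literal reading of an idiom fits its surrounding context. The abstract does not describe the actual models, so the following is only a minimal sketch of that general idea, not the method of the thesis. It assumes tiny hand-made word vectors (`TOY_VECTORS`), cosine similarity between averaged vectors, and a fixed threshold; all names and numbers are hypothetical and chosen for illustration.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# A real system would use embeddings learned from a large corpus.
TOY_VECTORS = {
    "kick": [0.9, 0.1, 0.0],
    "bucket": [0.8, 0.2, 0.1],
    "water": [0.7, 0.3, 0.0],
    "floor": [0.8, 0.1, 0.2],
    "died": [0.0, 0.2, 0.9],
    "funeral": [0.1, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def centroid(words, vectors):
    """Average the vectors of the words that have an entry in `vectors`."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        raise ValueError("none of the words has a vector")
    dims = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dims)]


def literal_compatibility(context_words, idiom_words, vectors=TOY_VECTORS):
    """Score how close the context is to the idiom's component words."""
    return cosine(centroid(context_words, vectors),
                  centroid(idiom_words, vectors))


def guess_usage(context_words, idiom_words, threshold=0.5):
    """Hypothetical rule: a context close to the components suggests literal use."""
    score = literal_compatibility(context_words, idiom_words)
    return "literal" if score >= threshold else "figurative"


if __name__ == "__main__":
    idiom = ["kick", "bucket"]
    print(guess_usage(["water", "floor"], idiom))
    print(guess_usage(["died", "funeral"], idiom))
```

With these toy vectors, a context about water on the floor scores close to the idiom's component words, while a context about a funeral scores far from them. The threshold of 0.5 is arbitrary; a score of this kind could instead serve as a weak signal for clustering usages rather than as a hard decision rule.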

To demonstrate that our work can benefit downstream NLP applications, we perform a case study on machine translation. It shows that our model can help to improve the translation quality of sentences containing idioms.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: LIU, CHANGSHENG
Email: liucs1986@gmail.com
Pitt Username: chl180
ETD Committee:
Committee Chair: Hwa,
Committee Member: Kovashka,
Committee Member: Litman,
Committee Member: Tsvetkov,
Date: 20 June 2019
Date Type: Publication
Defense Date: 27 March 2019
Approval Date: 20 June 2019
Submission Date: 11 April 2019
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 109
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Computer Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Idiomatic Expressions, Figurative language
Date Deposited: 20 Jun 2019 16:13
Last Modified: 20 Jun 2019 16:13

