Meng, Rui
(2024)
Keyphrasification: Summarizing text into keyphrases - using neural language generation methods.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
This is the latest version of this item.
Abstract
Keyphrases encapsulate the core information of a text, acting as effective tools for organizing and retrieving extensive data. Their utility spans various applications, including information retrieval, document classification, and automatic summarization. Given the cost and limitations of manual keyphrase assignment, there has been a growing interest in automating this process. Traditional approaches to keyphrase assignment are categorized into extraction, which involves selecting phrases directly from the text, and tagging, where pre-defined tags are applied. Both methods often fail to address the complexity of natural language. For instance, a substantial fraction of keyphrases are absent from the source text and are missed by extraction methods. This observation highlights the need to reevaluate the paradigms within keyphrase studies and refine methodologies in automatic keyphrase prediction.
This dissertation introduces Keyphrasification to formulate the task of keyphrase prediction. By developing a conceptual framework and defining essential properties, this work aims to deepen the understanding of keyphrase prediction and facilitate the development of more effective techniques. Furthermore, I propose a novel modeling approach, keyphrase generation (KPGEN), utilizing neural language generation to learn the mapping between texts and keyphrases directly from data to predict contextually relevant phrases in varied forms. The dissertation further presents various enhancements and mechanisms to refine this approach.
This work makes several pivotal contributions. It reformulates keyphrase prediction as a specialized form of summarization, thereby broadening the previous research scope. It innovates in automatic keyphrasification with a data-driven approach, employing neural networks to predict context-relevant phrases, overcoming the limitations of prior methodologies. Furthermore, the study explores a range of advanced language generation techniques, from basic to pre-trained and large language models, making it a comprehensive investigation into the task of keyphrasification.
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors: Meng, Rui
Date: 13 May 2024
Date Type: Publication
Defense Date: 2 April 2024
Approval Date: 13 May 2024
Submission Date: 18 April 2024
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 146
Institution: University of Pittsburgh
Schools and Programs: School of Information Sciences > Information Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Keyphrase; Keyphrasification; Keyphrase Generation; Language Generation
Date Deposited: 13 May 2024 16:08
Last Modified: 13 May 2024 16:08
URI: http://d-scholarship.pitt.edu/id/eprint/46405