

Du, Siqing (2008) ON THE USE OF NATURAL LANGUAGE PROCESSING FOR AUTOMATED CONCEPTUAL DATA MODELING. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



This research involved the development of a natural language processing (NLP) architecture for extracting entity-relationship diagrams (ERDs) from natural language requirements specifications. Conceptual data modeling plays an important role in database and software design, and many approaches to automating this process and developing software tools for it have been attempted. NLP approaches to this problem appear plausible because, compared to general free text, natural language requirements documents are relatively formal and exhibit special regularities that reduce the complexity of the problem. The approach taken here involves a loose integration of several linguistic components. Outputs from syntactic parsing are used by a set of heuristic rules developed for this particular domain to produce tuples representing the underlying meanings of the propositions in the documents, and semantic resources are used to distinguish between correct and incorrect tuples. Finally, the tuples are integrated into full ERD representations. The major challenge addressed in this research is how to bring the various resources to bear on the translation of the natural language documents into the formal language. This system is taken to be representative of a potential class of similar systems designed to translate documents in other restricted domains into corresponding formalisms. The system is incorporated into a tool that presents the final ERDs to users, who can modify them in an attempt to produce an accurate ERD for the requirements document. An experiment demonstrated that users with limited experience in ERD specifications could produce better representations of requirements documents with the system than without it, and could do so in less time.
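The last two stages described in the abstract (filtering candidate tuples, then integrating the survivors into an ERD) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the `RelationTuple` and `ERD` types, the `filter_tuples` and `integrate` functions, the sample tuples, and the small lexicon standing in for the semantic resources are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RelationTuple:
    """Hypothetical candidate tuple produced by heuristic rules over a parse."""
    subject: str
    verb: str
    obj: str


@dataclass
class ERD:
    """Hypothetical, very reduced ERD: entity names plus named relationships."""
    entities: set = field(default_factory=set)
    relationships: set = field(default_factory=set)


def filter_tuples(tuples, plausible_entities):
    """Keep tuples whose arguments both appear in the lexicon.

    The lexicon is an assumed stand-in for the semantic resources
    mentioned in the abstract.
    """
    return [
        t for t in tuples
        if t.subject in plausible_entities and t.obj in plausible_entities
    ]


def integrate(tuples):
    """Merge the accepted tuples into a single ERD structure."""
    erd = ERD()
    for t in tuples:
        erd.entities.update((t.subject, t.obj))
        erd.relationships.add((t.subject, t.verb, t.obj))
    return erd


# Invented candidate tuples, as if extracted from a requirements document.
candidates = [
    RelationTuple("customer", "places", "order"),
    RelationTuple("order", "contains", "product"),
    RelationTuple("system", "stores", "information"),
]
lexicon = {"customer", "order", "product"}

erd = integrate(filter_tuples(candidates, lexicon))
print(sorted(erd.entities))
print(sorted(erd.relationships))
```

In this sketch the third tuple is rejected because neither of its arguments is in the assumed lexicon, so only the two remaining relationships reach the ERD.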




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators | Email | Pitt Username | ORCID
Du, Siqing | sid2@pitt.edu | SID2 |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Metzler, Douglas | metzler@sis.pitt.edu | DMETZLER |
Committee Member | Hwa, Rebecca | hwa@cs.pitt.edu | REH23 |
Committee Member | Joshi, James | jjoshi@sis.pitt.edu | JJOSHI |
Committee Member | Karimi, Hassan | hkarimi@sis.pitt.edu | HKARIMI |
Committee Member | Zadorozhny, Vladimir | vladimir@mail.sis.pitt.edu | VIZ |
Date: 13 August 2008
Date Type: Completion
Defense Date: 25 June 2008
Approval Date: 13 August 2008
Submission Date: 7 August 2008
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: School of Information Sciences > Information Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Automated database design; ERD; Natural Language Processing; NLP
Other ID: etd-08072008-004514
Date Deposited: 10 Nov 2011 19:57
Last Modified: 15 Nov 2016 13:48

