
Generation of Classificatory Metadata for Web Resources using Social Tags

Syn, Sue Yeon (2010) Generation of Classificatory Metadata for Web Resources using Social Tags. Doctoral Dissertation, University of Pittsburgh. (Unpublished)


With the increasing popularity of social tagging systems, the potential for using social tags as a source of metadata is being explored. Social tagging systems can simplify the involvement of a large number of users and improve the metadata generation process, especially for semantic metadata. This research aims to find a method to categorize web resources using social tags as metadata. In this research, social tagging systems serve as a mechanism that allows non-professional catalogers to participate in metadata generation. Because social tags are not drawn from a controlled vocabulary, several issues have to be addressed in finding quality terms to represent the content of a resource. This research examines ways to deal with those issues in order to obtain, from the tags provided by users, a set of tags that represents the resource.

Two measurements of the importance of a tag are introduced. Annotation Dominance (AD) measures how much users agree on a tag term. Cross Resources Annotation Discrimination (CRAD) measures how well a tag discriminates among resources in the collection; it is designed to remove tags that are used too broadly or too narrowly in the collection. Further, the study suggests a process to identify and manage compound tags. The research aims to select important annotations (meta-terms) and remove meaningless ones (noise) from the tag set. This study, therefore, suggests two main measurements for obtaining a subset of tags with classification potential. To evaluate the proposed approach to finding classificatory metadata candidates, we rely on users' relevance judgments comparing suggested tag terms and expert metadata terms. Human judges rate the relevance of each term to the given resource on an n-point scale.
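The abstract describes Annotation Dominance only in words and gives no formula. As a rough illustration of the idea of user agreement on a tag, the sketch below assumes one plausible reading: the share of a resource's taggers who applied a given tag. The function name, the (user, resource, tag) input format, and the ratio itself are assumptions made for this example and are not taken from the dissertation.

```python
from collections import defaultdict


def annotation_dominance(annotations):
    """Score each (resource, tag) pair by user agreement.

    ASSUMPTION: the score is the fraction of a resource's distinct taggers
    who applied the tag. The dissertation may define AD differently.

    annotations: iterable of (user, resource, tag) triples.
    Returns a dict mapping (resource, tag) to a value in (0, 1].
    """
    users_per_resource = defaultdict(set)  # resource -> users who tagged it
    users_per_pair = defaultdict(set)      # (resource, tag) -> users who used that tag

    for user, resource, tag in annotations:
        users_per_resource[resource].add(user)
        users_per_pair[(resource, tag)].add(user)

    return {
        (resource, tag): len(users) / len(users_per_resource[resource])
        for (resource, tag), users in users_per_pair.items()
    }


# Hypothetical data: four users tag the same resource.
sample = [
    ("u1", "r1", "python"),
    ("u2", "r1", "python"),
    ("u3", "r1", "python"),
    ("u4", "r1", "todo"),
]
scores = annotation_dominance(sample)
```

Under this reading, a tag applied by most of a resource's taggers scores close to 1, while an idiosyncratic tag such as "todo" in the sample scores low and would be a candidate for removal as noise. CRAD is not sketched here because the abstract does not say how broad or narrow use is quantified.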




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators:
Syn, Sue Yeon (Email: sus16@pitt.edu; Pitt Username: SUS16)
ETD Committee:
Committee Chair: Spring, Michael B (Email Address: spring@pitt.edu; Pitt Username: SPRING)
Committee Member: Butler,
Committee Member: He, Daqing (Email Address: daqing@pitt.edu; Pitt Username: DAQING)
Committee Member: Brusilovsky, Peter (Email Address: peterb@pitt.edu; Pitt Username: PETERB; ORCID: 0000-0002-1902-1464)
Committee Member: Hirtle, Stephen C. (Email Address: hirtle@pitt.edu; Pitt Username: HIRTLE)
Date: 23 December 2010
Date Type: Completion
Defense Date: 10 December 2010
Approval Date: 23 December 2010
Submission Date: 20 December 2010
Access Restriction: 5 year -- Restrict access to University of Pittsburgh for a period of 5 years.
Institution: University of Pittsburgh
Schools and Programs: School of Information Sciences > Information Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Classificatory Metadata; Metadata; Metadata Generation; Social Tags; Web Classification
Other ID: etd-12202010-095038
Date Deposited: 10 Nov 2011 20:11
Last Modified: 15 Nov 2016 13:54

