Learning visual attributes from contextual explanations

Murrugarra-Llerena, Nils (2019) Learning visual attributes from contextual explanations. Doctoral Dissertation, University of Pittsburgh. (Unpublished)


Abstract

In computer vision, attributes are mid-level concepts shared across categories. They provide a natural channel of communication between humans and machines for image retrieval, they convey detailed information about objects, and they can describe properties of unfamiliar objects. These properties make attributes very appealing, but learning them is challenging. Because attributes are less well-defined than object categories, capturing them with computational models poses a different set of challenges. Humans and machines can also miscommunicate about attributes, since a machine may not understand what a human has in mind when referring to a particular attribute. Humans usually label whether an object or attribute is present without giving any explanation. However, attributes are more complex and may require explanations to be understood well.

This Ph.D. thesis aims to tackle these challenges in automatically learning attribute prediction models. In particular, it focuses on enhancing the predictive power of attribute models with contextual explanations. These explanations aim to enhance data quality with human knowledge, which can be expressed in the form of interactions and may be affected by our personality.

First, we emulate the human ability to understand unfamiliar situations. Humans try to infer properties from what they already know (background knowledge). Hence, we study attribute learning in data-scarce and unrelated domains, emulating this human skill. We discover transferable knowledge to learn attributes from different domains.

This first project inspires us to request contextual explanations to improve attribute learning. Thus, we enhance attribute learning with context in the form of gaze, captioning, and sketches. Human gaze captures subconscious intuition and associates certain components with the meaning of an attribute. For example, gaze associates the tip of a shoe's toe with the attribute "pointy". To complement this gaze representation, captioning reflects conscious thinking based on prior analysis. An annotator may analyze an image and provide the following description: “This shoe is pointy because of its sharp form at the tip of the toe”. Finally, in image search, sketches provide a holistic view of an image query, which complements the specific details captured by attribute comparisons. To conclude, our methods with contextual explanations outperform many baselines in both quantitative and qualitative evaluations.


Details

Item Type: University of Pittsburgh ETD
Status: Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Murrugarra-Llerena, Nils	nineil.cs@gmail.com	nim60

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	Kovashka, Adriana	kovashka@cs.pitt.edu	AIK85
Committee Member	Hwa, Rebecca	hwa@cs.pitt.edu	reh23
Committee Member	Hauskrecht, Milos	milos@cs.pitt.edu	MILOS
Committee Member	He, Daqing	dah44@pitt.edu	dah44

Date: 30 August 2019
Date Type: Publication
Defense Date: 12 April 2019
Approval Date: 30 August 2019
Submission Date: 5 August 2019
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 136
Institution: University of Pittsburgh
Schools and Programs: School of Computing and Information > Computer Science
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: computer vision, machine learning, attribute learning, metric learning, reinforcement learning, transfer learning.
Date Deposited: 30 Aug 2019 15:43
Last Modified: 30 Aug 2019 15:43
URI: http://d-scholarship.pitt.edu/id/eprint/37075
