Video Analysis by Deep Learning

Ramadan, Mona (2019) Video Analysis by Deep Learning. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

Abstract

The tasks of automatically classifying the content of videos or predicting the outcome of a series of events occurring in a sequence of frames may sound simple, but they remain very challenging research areas despite vast improvements in computing hardware and easy access to large data sets. In our work, we extend machine learning techniques to comprehend videos by tackling three challenging tasks: video classification at the full-length video level, video classification at both the level of actions performed in certain frames and the full-length video level, and prediction of upcoming actions.

Classification at the video level is a classic machine learning problem that has been addressed previously. We address this problem in two ways. The first is a standard deep learning approach, in which a deep convolutional neural network (CNN) is trained on video frames and a Long Short-Term Memory (LSTM) network then aggregates the features learned by the CNN into a single video label. The second is a different approach that trains a CNN on still images from a data set independent of the video data set; this CNN is later used to classify a selection of video frames and draw a conclusion about the video class. Our approach achieves a classification accuracy ranging from 91% to 94% when processing only 10 to 300 video frames, respectively, of the test videos on a subset of the YouTube Sports-1M dataset.
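
As a rough illustration of the second approach, the sketch below samples evenly spaced frames and averages per-frame class probabilities to reach a video-level decision. The abstract does not state how frames are selected or how the frame-level outputs are combined, so the even spacing, the averaging rule, and the function names `sample_frame_indices` and `classify_video` are all assumptions made for this example; the probabilities stand in for the output of an image-trained CNN.

```python
def sample_frame_indices(num_frames, num_samples):
    """Pick up to num_samples evenly spaced frame indices (assumed selection rule)."""
    if num_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= num_frames:
        return list(range(num_frames))
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def classify_video(frame_probs):
    """Average per-frame class probabilities and return (best class index, mean vector).

    frame_probs is a list of per-frame probability lists, standing in for the
    softmax output of a CNN applied to each selected frame.
    """
    if not frame_probs:
        raise ValueError("need at least one frame")
    num_classes = len(frame_probs[0])
    mean = [
        sum(p[c] for p in frame_probs) / len(frame_probs)
        for c in range(num_classes)
    ]
    best = max(range(num_classes), key=lambda c: mean[c])
    return best, mean


if __name__ == "__main__":
    # Hypothetical two-class probabilities for three sampled frames.
    probs = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]
    label, mean = classify_video(probs)
    print(sample_frame_indices(100, 10), label)
```

A majority vote over per-frame labels would be an equally plausible aggregation rule; averaging is used here only because it keeps the confidence information from every frame.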

Classification at both the action level and the video class level is not a well-addressed problem. We tackle the challenge with a hybrid CNN-Hidden Markov Model (HMM) system in which a dictionary of actions is constructed from the training data and used to detect a sequence of video actions; this action sequence is then mapped to a class for the entire video. Our approach detects the actions in videos of the Actions for Cooking Eggs (ACE) data set with an accuracy of 79% while classifying the videos with 100% accuracy.
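
One common way to combine frame-level CNN scores with an HMM is Viterbi decoding, sketched below. This is a generic textbook formulation and not necessarily the dissertation's system: the two-state model, the transition and prior values, and the helper `collapse_repeats` are assumptions chosen for the example, and the emission values stand in for per-frame CNN action probabilities.

```python
import math


def viterbi(log_emissions, log_trans, log_prior):
    """Return the most likely state (action) index per frame.

    log_emissions[t][s]: log score of action s at frame t (e.g. from a CNN).
    log_trans[i][j]:     log P(action j at frame t+1 | action i at frame t).
    log_prior[s]:        log P(action s at the first frame).
    """
    num_states = len(log_prior)
    score = [log_prior[s] + log_emissions[0][s] for s in range(num_states)]
    backptr = []
    for t in range(1, len(log_emissions)):
        new_score, ptr = [], []
        for j in range(num_states):
            best_i = max(range(num_states), key=lambda i: score[i] + log_trans[i][j])
            new_score.append(score[best_i] + log_trans[best_i][j] + log_emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptr.append(ptr)
    # Backtrack from the best final state.
    state = max(range(num_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def collapse_repeats(path):
    """Turn a per-frame labeling into an action sequence by merging repeated labels."""
    out = []
    for s in path:
        if not out or out[-1] != s:
            out.append(s)
    return out


def to_log(rows):
    """Elementwise natural log of a list of probability rows."""
    return [[math.log(p) for p in row] for row in rows]


if __name__ == "__main__":
    # Hypothetical values: frame 2 weakly favors action 1, but the sticky
    # transitions smooth it back to action 0.
    emissions = to_log([[0.9, 0.1], [0.9, 0.1], [0.4, 0.6],
                        [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
    trans = to_log([[0.9, 0.1], [0.1, 0.9]])
    prior = [math.log(0.6), math.log(0.4)]
    path = viterbi(emissions, trans, prior)
    print(path, collapse_repeats(path))
```

The collapsed action sequence is the kind of object that could then be mapped to a video class; how that mapping is done in the dissertation is not described in the abstract, so it is left out of the sketch.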

Finally, we address the problem of next-action prediction by using the same hybrid CNN-HMM system to predict the next performed action when only part of the video is available. Our approach successfully predicts the first and second upcoming actions in a video stream with a probability higher than 50% when 60% or more of the video is available for processing, and the prediction accuracy continues to increase as the system gains access to more video frames.
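
To make the idea of next-action prediction concrete, the sketch below estimates first-order transition frequencies from training action sequences and predicts the most frequent successor of the last observed action. This is a deliberately simplified stand-in for the hybrid CNN-HMM system, not a reconstruction of it; the function names and the action labels ("crack", "mix", "fry", "bake") are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Count how often each action is followed by each other action."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    return counts


def predict_next(counts, last_action):
    """Return (most likely next action, estimated probability).

    Returns (None, 0.0) when last_action was never followed by anything
    in the training sequences.
    """
    following = counts.get(last_action)
    if not following:
        return None, 0.0
    total = sum(following.values())
    action, n = following.most_common(1)[0]
    return action, n / total


if __name__ == "__main__":
    # Hypothetical training sequences of action labels.
    train = [
        ["crack", "mix", "fry"],
        ["crack", "mix", "bake"],
        ["crack", "mix", "fry"],
    ]
    counts = fit_transitions(train)
    print(predict_next(counts, "mix"))
```

Predicting the second upcoming action could be approximated by applying `predict_next` again to the first prediction, although a full HMM would propagate the whole state distribution instead of a single best guess.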

Details

Item Type:

University of Pittsburgh ETD

Status:

Unpublished

Creators/Authors:

Creators	Email	Pitt Username	ORCID
Ramadan, Mona	mhr23@pitt.edu	mhr23	0000-0002-1999-0142

ETD Committee:

Title	Member	Email Address	Pitt Username
Committee Chair	El-Jaroudi, Amro	amro@pitt.edu	amro@pitt.edu
Committee Member	Sejdic, Ervin	esejdic@pitt.edu	esejdic@pitt.edu
Committee Member	Zhi-Hong, Mao	zhm4@pitt.edu	zhm4@pitt.edu
Committee Member	Akcakaya, Murat	akcakaya@pitt.edu	akcakaya@pitt.edu
Committee Member	Loughlin, Patrick	loughlin@pitt.edu	loughlin@pitt.edu
Thesis Advisor	El-Jaroudi, Amro	amro@pitt.edu	amro@pitt.edu

Date:

19 June 2019

Date Type:

Publication

Defense Date:

14 November 2018

Approval Date:

19 June 2019

Submission Date:

18 March 2019

Access Restriction:

No restriction; Release the ETD for access worldwide immediately.

Number of Pages:

113

Institution:

University of Pittsburgh

Schools and Programs:

Swanson School of Engineering > Electrical Engineering

Degree:

PhD - Doctor of Philosophy

Thesis Type:

Doctoral Dissertation

Refereed:

Yes

Uncontrolled Keywords:

Video classification, video prediction, deep learning

Date Deposited:

19 Jun 2019 15:00

Last Modified:

19 Jun 2019 15:00

URI:

http://d-scholarship.pitt.edu/id/eprint/36069
