Asymptotic Normality and Rates of Convergence for Random Forests via Generalized U-statistics

Peng, Wei (2021) Asymptotic Normality and Rates of Convergence for Random Forests via Generalized U-statistics. Doctoral Dissertation, University of Pittsburgh. (Unpublished)

This is the latest version of this item.


Abstract

Random forests are among the most popular off-the-shelf supervised learning algorithms. Despite their well-documented empirical success, until recently few theoretical results were available to describe their performance and behavior. In this work we push beyond recent work on consistency and asymptotic normality by establishing rates of convergence for random forests and other supervised learning ensembles. We develop the notion of generalized U-statistics and show that within this framework, random forest predictions can remain asymptotically normal for larger subsample sizes and under weaker conditions than previously established. Moreover, we provide Berry-Esseen bounds in order to quantify the rate at which this convergence occurs, making explicit the roles of the subsample size and the number of trees in determining the distribution of random forest predictions. When these generalized estimators are reduced to their classical U-statistic form, the rates we establish are faster than any available in the existing literature. We also provide a consistent estimate of the variance of random forests and illustrate that quantifying the uncertainty of a random forest is typically more expensive than obtaining the random forest itself.
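
The U-statistic view referred to in the abstract can be illustrated with a small sketch. This is not the dissertation's generalized U-statistic, whose definition is not given in this record; it only shows the classical idea that averaging a kernel over all size-k subsamples gives a complete U-statistic, while averaging over a limited number of random subsamples (as an ensemble with a fixed number of trees does) gives an incomplete one. The function names and the toy kernel are assumptions made for illustration, and the kernel stands in for a base learner's prediction.

```python
import itertools
import random
from statistics import mean


def complete_u_statistic(data, k, kernel):
    """Average the kernel over every size-k subsample (classical U-statistic)."""
    return mean(kernel(s) for s in itertools.combinations(data, k))


def incomplete_u_statistic(data, k, kernel, n_subsamples, seed=0):
    """Average the kernel over n_subsamples random size-k subsamples.

    Loosely analogous to an ensemble of n_subsamples base learners,
    each built on k points drawn without replacement.
    """
    rng = random.Random(seed)
    return mean(kernel(rng.sample(data, k)) for _ in range(n_subsamples))


def sample_variance_kernel(pair):
    """Toy order-2 kernel; its complete U-statistic is the unbiased sample variance."""
    a, b = pair
    return (a - b) ** 2 / 2


# Hypothetical toy data, not taken from the dissertation.
data = [1.0, 2.0, 4.0, 7.0]
full = complete_u_statistic(data, 2, sample_variance_kernel)
approx = incomplete_u_statistic(data, 2, sample_variance_kernel, n_subsamples=2000)
```

Here `full` coincides with the usual unbiased sample variance of `data`, and `approx` fluctuates around it with a spread that shrinks as `n_subsamples` grows. In this toy setting, `k` plays the role of the subsample size and `n_subsamples` the role of the number of trees.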



Details

Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators/Authors:
Creators | Email | Pitt Username | ORCID
Peng, Wei | wep15@pitt.edu | |
ETD Committee:
Title | Member | Email Address | Pitt Username | ORCID
Committee Chair | Mentch, Lucas | lkm31@pitt.edu | |
Committee Member | Wasserman, Larry | larrywasserman.cool@gmail.com | |
Committee Member | Iyengar, Satish | ssi@pitt.edu | |
Committee Member | Ren, Zhao | zren@pitt.edu | |
Date: 3 May 2021
Date Type: Publication
Defense Date: 26 March 2021
Approval Date: 3 May 2021
Submission Date: 6 April 2021
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 113
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Statistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: random forests, generalized U-statistics, asymptotic normality, Berry-Esseen bound, variance estimation
Date Deposited: 03 May 2021 15:28
Last Modified: 03 May 2021 15:28
URI: http://d-scholarship.pitt.edu/id/eprint/40809
