Coleman, Timothy
(2021)
Advancing Inference in Supervised Learning Procedures via Permutation Tests and Importance Sampling, with Applications to Environmental Science.
Doctoral Dissertation, University of Pittsburgh.
(Unpublished)
Abstract
Since being proposed by Breiman (2001), random forests have become popular supervised regression and classification techniques. Their popularity stems from their being easy to implement: the default hyper-parameter settings are often not far from optimal, and the resulting models are often competitive with more involved supervised models. While random forests are complex, they are not completely impenetrable to theoretical analysis. In this thesis, we present several contributions to random forest methodology. First, we provide a motivating application of random forests to ornithological data, where we develop a novel hypothesis test for equality of distribution of random forest curves. Then, we refine an observation made during that application into a means of testing hypotheses about the validation error of random forests, allowing for computationally efficient tests that are analogous to the F-test for linear regression. Finally, we propose a means of accounting for a discrepancy between test and training distributions, motivated by the problem of forecasting power outages from hurricanes.
Details
Item Type: University of Pittsburgh ETD
Status: Unpublished
Date: 20 January 2021
Date Type: Publication
Defense Date: 19 November 2020
Approval Date: 20 January 2021
Submission Date: 3 December 2020
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Number of Pages: 148
Institution: University of Pittsburgh
Schools and Programs: Dietrich School of Arts and Sciences > Statistics
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: Random Forests, Machine Learning, Environmental Statistics
Date Deposited: 20 Jan 2021 18:21
Last Modified: 20 Jan 2021 18:21
URI: http://d-scholarship.pitt.edu/id/eprint/39985