
A Methodology to Develop a Decision Model Using a Large Categorical Database with Application to Identifying Critical Variables during a Transport-Related Hazardous Materials Release

Clark, Renee M (2006) A Methodology to Develop a Decision Model Using a Large Categorical Database with Application to Identifying Critical Variables during a Transport-Related Hazardous Materials Release. Doctoral Dissertation, University of Pittsburgh. (Unpublished)



An important problem in the use of large categorical databases is extracting information to support decisions, including the identification of critical variables. Because a dataset containing many records, variables, and categories is complex, a methodology for simplifying the variables and measuring their associations is needed to build the decision model. To this end, the proposed methodology uses existing methods for categorical exploratory analysis. Specifically, latent class analysis and loglinear modeling, which together constitute a three-step, non-simultaneous approach, are used to simplify the variables and measure their associations, respectively. This methodology has not previously been used to extract data-driven decision models from large categorical databases. A case in point is a large categorical database at the Department of Transportation (DoT) on hazardous materials releases during transportation. This dataset is important because of the risk posed by an unintentional release. However, because no data-congruent decision model of a hazmat release exists, current decision making at the DoT's Office of Hazardous Materials, including critical variable identification, is limited. This gap in the modeling of a release is paralleled by a similar gap in the hazmat transportation literature, which has an operations research and quantitative risk assessment focus and whose models consist of simple risk equations or more complex, theoretical equations. Thus, based on critical opportunities at the DoT and gaps in the literature, the proposed methodology was demonstrated using the hazmat release database. The methodology can be applied to other categorical databases for extracting decision models, such as those at the National Center for Health Statistics. A key goal of the decision model, a Bayesian network, was to identify the most influential variables relative to two consequences, or measures of risk, in a hazmat release: dollar loss and release quantity.
The most influential variables for dollar loss were found to be those related to container failure, specifically the causing object and the item-area of failure on the container. For release quantity, the container failure variables were likewise the most influential, specifically the contributing action and the failure mode. In addition, potential changes in these variables for reducing consequences were identified.




Item Type: University of Pittsburgh ETD
Status: Unpublished
Creators: Clark, Renee
ETD Committee:
Committee Chair: Besterfield-Sacre, Mary E. (email: mbsacre@engr.pitt.edu; Pitt username: MBSACRE)
Committee Member: Wolfe, Harvey (email: hwolfe@engr.pitt.edu; Pitt username: HWOLFE)
Committee Member: Rajgopal, Jayant (email: rajgopal@engr.pitt.edu; Pitt username: GUNNER1)
Committee Member: Shuman, Larry J. (email: shuman@engr.pitt.edu; Pitt username: SHUMAN)
Committee Member: Day, Richard D. (email: rdfac@pitt.edu; Pitt username: RDFAC)
Date: 2 June 2006
Date Type: Completion
Defense Date: 14 December 2005
Approval Date: 2 June 2006
Submission Date: 22 March 2006
Access Restriction: No restriction; Release the ETD for access worldwide immediately.
Institution: University of Pittsburgh
Schools and Programs: Swanson School of Engineering > Industrial Engineering
Degree: PhD - Doctor of Philosophy
Thesis Type: Doctoral Dissertation
Refereed: Yes
Uncontrolled Keywords: correction procedure; data driven; DoT; engineering policy; latent class analysis; latent variable; LCA; log-linear model; loglinear model; Modified LISREL; OHM; sparse table; Three Step model; Bayesian network; unload
Other ID: etd-03222006-220811
Date Deposited: 10 Nov 2011 19:32
Last Modified: 15 Nov 2016 13:37

