© 2015 Elsevier Inc. All rights reserved. A major difficulty in building Bayesian network (BN) models is the size of conditional probability tables (CPTs), which grow exponentially in the number of parents. One way of dealing with this problem is through parametric conditional probability distributions that usually require only a number of parameters that is linear in the number of parents. In this paper, we introduce a new class of parametric models, the Probabilistic Independence of Causal Influences (PICI) models, that aim at lowering the number of parameters required to specify local probability distributions, but are still capable of efficiently modeling a variety of interactions. A subset of PICI models is decomposable, and this leads to significantly faster inference as compared to models that cannot be decomposed. We present an application of the proposed method to learning dynamic BNs for modeling a woman's menstrual cycle. We show that PICI models are especially useful for parameter learning from small data sets and lead to higher parameter accuracy than learning full CPTs.

© Copyright 2015, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Probabilistic graphical models, such as Bayesian networks, are intuitive and theoretically sound tools for modeling uncertainty. A major problem with applying Bayesian networks in practice is that it is hard to judge how well a model fits the case that it is supposed to solve. One way of expressing a possible dissonance between a model and a case is the surprise index, proposed by Habbema, which expresses how surprising the evidence is given the model. While this measure reflects the intuition that the probability of a case should be judged in the context of a model, it is computationally intractable. In this paper, we propose an efficient way of approximating the surprise index.
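To see why exact computation is intractable, the sketch below formalizes surprise as the total probability of evidence configurations no more likely than the observed one. This is one common formalization, chosen here for illustration; Habbema's exact definition and the paper's approximation may differ. The cost is a sum over all evidence configurations, exponential in the number of evidence variables.

```python
def surprise_index(joint, evidence):
    """Brute-force surprise index: total probability mass of evidence
    configurations no more likely than the observed one. Enumerating
    all configurations is exponential in the number of evidence
    variables, hence intractable for realistic models."""
    p_obs = joint[evidence]
    return sum(p for p in joint.values() if p <= p_obs)

# Toy joint distribution over two binary evidence variables.
joint = {
    (0, 0): 0.5,
    (0, 1): 0.3,
    (1, 0): 0.15,
    (1, 1): 0.05,
}

print(surprise_index(joint, (1, 1)))  # most surprising case: 0.05
print(surprise_index(joint, (0, 1)))  # ~0.5 (0.3 + 0.15 + 0.05)
```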

© 2016 IEEE. In this paper we describe ModelPatient, a software application developed to allow health sciences educators to create and deliver educational cases that are based on and simulate real patient behavior. ModelPatient uses data from Electronic Medical Record Systems (EMRS) or from publicly available medical data sets in combination with Bayesian network (BN) models to generate virtual patient (VP) cases. Because the underlying models are based on real data, each decision made by a learner affects outcome probabilities. Therefore, the behavior of a VP reflects how a real patient with the same medical condition would have reacted to the learner's actions. We believe that data- and model-driven approaches to creating VPs would allow educators to create higher-fidelity teaching cases and offer a richer educational experience to learners.

© Springer International Publishing Switzerland 2015. Most practical uses of Dynamic Bayesian Networks (DBNs) involve temporal influences of the first order, i.e., influences between neighboring time steps. This choice is a convenient approximation influenced by the existence of efficient algorithms for first-order models and limitations of available tools. In this paper, we focus on the question of whether constructing higher time-order models is worth the effort when the underlying system's memory goes beyond the current state. We present the results of an experiment in which we successively introduce higher-order DBN models monitoring a woman's monthly cycle and measure the accuracy of these models in estimating the fertile period around the day of ovulation. We show that higher-order models are more accurate than first-order models. However, we have also observed overfitting and a resulting decrease in accuracy when the time order chosen is too high.

Pulmonary arterial hypertension (PAH) is a severe and often deadly disease, originating from an increase in pulmonary vascular resistance. Its prevention and treatment are of vital importance to public health. A group of medical researchers proposed a calculator for estimating the risk of dying from PAH, available for a variety of computing platforms and widely used by health-care professionals. The PAH Risk Calculator is based on Cox's Proportional Hazards (CPH) model, a popular statistical technique for risk estimation and survival analysis, fitted to data from the thoroughly collected and maintained Registry to Evaluate Early and Long-term Pulmonary Arterial Hypertension Disease Management (REVEAL Registry). In this paper, we propose an alternative approach to calculating this risk that is based on a Bayesian network (BN) model. Our first step has been to create a BN model that mimics the CPH model at the foundation of the current PAH Risk Calculator. The BN-based calculator reproduces the results of the current PAH Risk Calculator exactly. Because Bayesian networks do not require the somewhat restrictive assumptions of the CPH model and can readily combine data with expert knowledge, we expect that our approach will lead to an improvement over the current calculator. We plan to (1) learn the parameters of the BN model from the data captured in the REVEAL Registry, and (2) enhance the resulting BN model with medical expert knowledge. We have been collaborating closely on both tasks with the authors of the original PAH Risk Calculator.

In this paper we describe ADMIT, a software application developed to assist the graduate admissions process at the University of Pittsburgh School of Information Sciences (SIS). ADMIT uses a Bayesian network model built from historical admissions data and academic performance records to predict how likely each applicant is to succeed. The system rank-orders applicants based on the probability of their success in the Master of Science in Information Science (MSIS) program and presents results as an ordered list and as a histogram to the admission committee members. The system also enables users to see a graphical representation of the model (a causal graph) and observe how each input data point affects the system’s suggestions.

In this talk, I will review various ways of evaluating models learned from data, starting from simple measures, such as accuracy, sensitivity and specificity, through more complex measures, such as ROC curves and calibration curves used in probabilistic systems, and finally confidence intervals over the results obtained from models and over evaluation measures.

© 2014 Springer International Publishing Switzerland. We propose a novel approach to learning parameters of canonical models from small data sets using a concept employed in regression analysis: the weighted least squares method. We assess the performance of our method experimentally and show that it typically outperforms simple methods used in the literature in terms of accuracy of the learned conditional probability distributions.
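The weighted least squares idea can be illustrated with the generic closed-form fit of a line, where each observation is weighted, e.g., by the number of data records supporting it. This is a toy sketch of the general technique, not the authors' estimator for canonical model parameters:

```python
def weighted_least_squares(xs, ys, ws):
    """Closed-form weighted least squares fit of y ~ a + b*x, where
    ws[i] weights observation i (e.g., a record count, so that
    well-supported observations count more)."""
    W = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / W
    ybar = sum(w * y for w, y in zip(ws, ys)) / W
    b = (sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
         / sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs)))
    a = ybar - b * xbar
    return a, b

# A noiseless line y = 1 + 2x is recovered regardless of the weights.
a, b = weighted_least_squares([0, 1, 2, 3], [1, 3, 5, 7], [5, 1, 1, 5])
print(a, b)  # 1.0 2.0
```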

Cox's Proportional Hazards (CPH) model is quite likely the most popular modeling technique in survival analysis. While the CPH model is able to represent relationships between a collection of risks and their common effect, Bayesian networks have become an attractive alternative with far broader applications. Our paper focuses on a Bayesian network interpretation of the CPH model. We provide a method of encoding knowledge from existing CPH models in the process of knowledge engineering for Bayesian networks. We compare the accuracy of the resulting Bayesian network to the CPH model, the Kaplan-Meier estimate, and a Bayesian network learned from data using the EM algorithm. Bayesian networks constructed from the CPH model lead to much higher accuracy than other approaches, especially when the number of data records is very small.
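For readers unfamiliar with the CPH model, a minimal sketch of how it turns covariates into a survival probability via S(t | x) = S0(t)^exp(beta . x). The baseline survival and coefficients below are made up purely for illustration:

```python
import math

def cph_survival(s0_t, betas, covariates):
    """Survival probability under a Cox proportional hazards model:
    S(t | x) = S0(t) ** exp(beta . x), where S0(t) is the baseline
    survival at time t and exp(beta . x) is the relative hazard."""
    relative_hazard = math.exp(sum(b * x for b, x in zip(betas, covariates)))
    return s0_t ** relative_hazard

baseline = 0.9        # hypothetical baseline survival at some time t
betas = [0.5, -0.2]   # hypothetical coefficients for two risk factors

print(cph_survival(baseline, betas, [0, 0]))  # baseline subject: 0.9
print(cph_survival(baseline, betas, [1, 1]))  # net positive risk -> below 0.9
```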

A fundamental step in the PC causal discovery algorithm consists of testing for (conditional) independence. When the number of data records is very small, a classical statistical independence test is typically unable to reject the (null) independence hypothesis. In this paper, we compare two conflicting pieces of advice in the literature that, in case of too few data records, recommend (1) assuming dependence and (2) assuming independence. Our results show that assuming independence is a safer strategy in minimizing the structural distance between the causal structure that has generated the data and the discovered structure. We also propose a simple improvement on the PC algorithm that we call blacklisting. We demonstrate that blacklisting can lead to orders of magnitude savings in computation by avoiding unnecessary independence tests.
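The policy the results support can be sketched with a G² (likelihood-ratio) independence test plus a "too few records, assume independence" rule. The record threshold and test details here are illustrative choices, not necessarily the paper's exact setup:

```python
import math

CHI2_95_DF1 = 3.841  # 95th percentile of chi-square with 1 degree of freedom

def assume_independent(table, min_records=10):
    """G^2 independence test for a 2x2 contingency table. With too few
    records the test has essentially no power, so we follow the policy
    the paper recommends and assume independence outright."""
    n = sum(sum(row) for row in table)
    if n < min_records:
        return True  # too little data: assume independence
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            expected = rows[i] * cols[j] / n
            if observed > 0:
                g2 += 2 * observed * math.log(observed / expected)
    return g2 < CHI2_95_DF1  # fail to reject -> treat as independent

print(assume_independent([[50, 5], [5, 50]]))    # strong dependence: False
print(assume_independent([[25, 25], [25, 25]]))  # perfectly independent: True
print(assume_independent([[3, 0], [0, 2]]))      # only 5 records: True by policy
```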

© 2014 Loghmanpour et al. This study investigated the use of Bayesian Networks (BNs) for left ventricular assist device (LVAD) therapy, a treatment for end-stage heart failure that has been steadily growing in popularity over the past decade. Despite this growth, the number of LVAD implants performed annually remains a small fraction of the estimated population of patients who might benefit from this treatment. We believe that this demonstrates a need for an accurate stratification tool that can help identify LVAD candidates at the most appropriate point in the course of their disease. We derived BNs to predict mortality at five endpoints utilizing the Interagency Registry for Mechanically Assisted Circulatory Support (INTERMACS) database, which contains over 12,000 enrolled patients from 153 hospital sites, collected from 2006 to the present, and approximately 230 pre-implant clinical variables. The synthetic minority oversampling technique (SMOTE) was employed to address the uneven proportion of patients with negative outcomes and to improve the performance of the models. The resulting accuracy and area under the ROC curve (%) for predicted mortality were 30 days: 94.9 and 92.5; 90 days: 84.2 and 73.9; 6 months: 78.2 and 70.6; 1 year: 73.1 and 70.6; and 2 years: 71.4 and 70.8. To foster the translation of these models to clinical practice, they have been incorporated into a web-based application, the Cardiac Health Risk Stratification System (CHRiSS). As clinical experience with LVAD therapy continues to grow and additional data are collected, we aim to continually update these BN models to improve their accuracy and maintain their relevance. Ongoing work also aims to extend the BN models to predict the risk of adverse events post-LVAD implant as additional factors for consideration in decision making.
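SMOTE's core mechanism is simple: synthesize new minority-class samples by interpolating between a real sample and one of its k nearest minority-class neighbors. The sketch below shows that mechanism on 2-D points; production implementations (e.g., in the imbalanced-learn library) add distance metrics, categorical-feature handling, and more:

```python
import random

def smote(minority, n_new, k=2, seed=0):
    """Minimal SMOTE sketch: each synthetic sample lies on the segment
    between a randomly picked minority sample and one of its k nearest
    minority-class neighbors (Euclidean distance)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        neighbors = sorted((p for p in minority if p is not a),
                           key=lambda p: sum((x - y) ** 2
                                             for x, y in zip(a, p)))[:k]
        b = rng.choice(neighbors)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new = smote(minority, 4)
print(len(new))  # 4 synthetic points, each between two real minority points
```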

© 2015 by the American Society for Artificial Internal Organs. Existing risk assessment tools for patient selection for left ventricular assist devices (LVADs), such as the Destination Therapy Risk Score and HeartMate II Risk Score (HMRS), have limited predictive ability. This study aims to overcome the limitations of traditional statistical methods by performing the first application of Bayesian analysis to the comprehensive Interagency Registry for Mechanically Assisted Circulatory Support dataset and comparing it to the HMRS. We retrospectively analyzed 8,050 continuous flow LVAD patients and 226 preimplant variables. We then derived Bayesian models for mortality at each of five time end-points postimplant (30 days, 90 days, 6 months, 1 year, and 2 years), achieving accuracies of 95%, 90%, 90%, 83%, and 78%, Kappa values of 0.43, 0.37, 0.37, 0.45, and 0.43, and areas under the receiver operating characteristic (ROC) curve of 91%, 82%, 82%, 80%, and 81%, respectively. This was in comparison to the HMRS with an ROC of 57% and 60% at 90 days and 1 year, respectively. Preimplant interventions, such as dialysis, ECMO, and ventilators, were major contributing risk markers. Bayesian models have the ability to reliably represent the complex causal relations of multiple variables on clinical outcomes. Their potential to develop a reliable risk stratification tool for use in clinical decision making on LVAD patients encourages further investigation.

© 2015 Elsevier B.V. All rights reserved. We compare three approaches to learning numerical parameters of discrete Bayesian networks from continuous data streams: (1) the EM algorithm applied to all data, (2) the EM algorithm applied to data increments, and (3) the online EM algorithm. Our results show that learning from all data at each step, whenever feasible, leads to the highest parameter accuracy and model classification accuracy. When facing computational limitations, incremental learning approaches are a reasonable alternative. While the differences in speed between incremental algorithms are not large (online EM is slightly slower), for all but small data sets online EM tends to be more accurate than incremental EM.
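The contrast between batch and online estimation hinges on a stepwise update of running sufficient statistics, s ← (1 − η_t)·s + η_t·s(x_t) with step size η_t = t^(−α), which is the core trick behind online EM. The toy below applies it to a fully observed Bernoulli stream; real online EM applies the same update to *expected* sufficient statistics computed in the E-step:

```python
def online_stat(stream, alpha=0.7):
    """Stepwise (online EM style) running sufficient statistic:
    s <- (1 - eta_t) * s + eta_t * x_t, with eta_t = t ** -alpha,
    0.5 < alpha <= 1. One pass, O(1) memory, versus re-estimating
    from all accumulated data at every step."""
    s = 0.0
    for t, x in enumerate(stream, start=1):
        eta = t ** -alpha
        s = (1 - eta) * s + eta * x
    return s

stream = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1] * 100  # stationary mean 0.7
print(sum(stream) / len(stream))  # batch estimate: 0.7
print(online_stat(stream))        # online estimate, close to 0.7
```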

Copyright © 2014, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. We report the results of an empirical evaluation of structural simplification of Bayesian networks by removing weak arcs. We conduct a series of experiments on six networks built from real data sets selected from the UC Irvine Machine Learning Repository. We systematically remove arcs from the weakest to the strongest, relying on four measures of arc strength, and measure the classification accuracy of the resulting simplified models. Our results show that removing up to roughly 20 percent of the weakest arcs in a network has minimal effect on its classification accuracy. At the same time, structural simplification of networks leads to significant reduction of both the amount of memory taken by the clique tree and the amount of computation needed to perform inference.
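One way to quantify arc strength is sketched below: the expected KL divergence between the child's conditional distributions and their mixture, which is zero exactly when removing the arc is lossless. This is one illustrative measure; the paper compares four measures of arc strength, not necessarily including this one:

```python
import math

def arc_strength(cpt, parent_prior):
    """Strength of arc X -> Y measured as the expected KL divergence
    between P(Y | X = x) and the marginal P(Y). Zero iff all rows of
    the CPT are identical, i.e., the arc carries no information."""
    n_states = len(cpt[0])
    marginal = [sum(pw * row[j] for pw, row in zip(parent_prior, cpt))
                for j in range(n_states)]
    return sum(pw * sum(p * math.log(p / q)
                        for p, q in zip(row, marginal) if p > 0)
               for pw, row in zip(parent_prior, cpt))

# A strong arc: Y's distribution differs sharply across parent states.
strong = arc_strength([[0.9, 0.1], [0.1, 0.9]], [0.5, 0.5])
# A vacuous arc: identical rows, so removal changes nothing.
weak = arc_strength([[0.6, 0.4], [0.6, 0.4]], [0.5, 0.5])
print(strong > weak, weak)  # True 0.0
```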

While Bayesian network models may contain a handful of numerical parameters that are important for their quality, several empirical studies have confirmed that overall precision of their probabilities is not crucial. In this paper, we study the impact of the structure of a Bayesian network on the precision of medical diagnostic systems. We show that the structure is not that important either: the diagnostic accuracy of several medical diagnostic models changes minimally when we subject their structures to transformations such as arc removal and arc reversal. © 2014 Springer International Publishing.

Objective: One of the hardest technical tasks in employing Bayesian network models in practice is obtaining their numerical parameters. In the light of this difficulty, a pressing question, one that has immediate implications on the knowledge engineering effort, is whether precision of these parameters is important. In this paper, we address experimentally the question whether medical diagnostic systems based on Bayesian networks are sensitive to precision of their parameters. Methods and materials: The test networks include Hepar II, a sizeable Bayesian network model for diagnosis of liver disorders, and six other medical diagnostic networks constructed from medical data sets available through the Irvine Machine Learning Repository. Assuming that the original model parameters are perfectly accurate, we systematically lower their precision by rounding them to progressively coarser scales and check the impact of this rounding on the models' accuracy. Results: Our main result, consistent across all tested networks, is that imprecision in numerical parameters has minimal impact on the diagnostic accuracy of models, as long as we avoid zeroes among parameters. Conclusion: The experiments' results provide evidence that as long as we avoid zeroes among model parameters, diagnostic accuracy of Bayesian network models does not suffer from decreased precision of their parameters. © 2013 Elsevier B.V.
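The rounding manipulation can be sketched as follows, assuming a simple decimal grid and a small probability floor; the floor reflects the key caveat above, that zeroes among parameters are the one thing that does hurt accuracy:

```python
def coarsen(dist, decimals=1, floor=1e-3):
    """Round a probability distribution to a coarser scale, replace any
    resulting zeroes with a small floor so no outcome becomes
    impossible, and renormalize so the values sum to one."""
    rounded = [max(round(p, decimals), floor) for p in dist]
    total = sum(rounded)
    return [p / total for p in rounded]

original = [0.82, 0.13, 0.04, 0.01]
coarse = coarsen(original, decimals=1)
print(coarse)
# 0.04 and 0.01 round to 0.0 and would become impossible without the floor
```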

One problem faced in knowledge engineering for Bayesian networks (BNs) is the exponential growth of the number of parameters in their conditional probability tables (CPTs). The most common practical solution is the application of the so-called canonical gates and, among them, the noisy-OR (or their generalization, the noisy-MAX) gates, which take advantage of the independence of causal interactions and provide a logarithmic reduction of the number of parameters required to specify a CPT. In this paper, we propose an algorithm that fits a noisy-MAX distribution to an existing CPT, and we apply this algorithm to search for noisy-MAX gates in three existing practical BN models: ALARM, HAILFINDER, and HEPAR II. We show that the noisy-MAX gate provides a surprisingly good fit for as many as 50% of CPTs in two of these networks. We observed this in both distributions elicited from experts and those learned from data. The importance of this finding is that it provides an empirical justification for the use of the noisy-MAX gate as a powerful knowledge engineering tool. © 2013 IEEE.
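The parameter reduction behind canonical gates is easiest to see for the noisy-OR, the binary special case of the noisy-MAX: n inhibitor probabilities (plus an optional leak) determine all 2^n entries of the full CPT. A minimal sketch:

```python
from itertools import product

def noisy_or_cpt(inhibitors, leak=0.0):
    """Expand noisy-OR parameters into the full CPT P(Y = present |
    parent configuration): each active cause fails to produce the
    effect independently with its inhibitor probability q_i, and the
    leak covers causes not modeled explicitly. n + 1 numbers generate
    all 2**n CPT entries."""
    n = len(inhibitors)
    cpt = {}
    for parents in product([0, 1], repeat=n):
        q_absent = 1.0 - leak
        for active, qi in zip(parents, inhibitors):
            if active:
                q_absent *= qi
        cpt[parents] = 1.0 - q_absent
    return cpt

cpt = noisy_or_cpt([0.2, 0.4, 0.5])  # 3 parameters, no leak ...
print(len(cpt))                       # ... expand to 8 CPT entries
print(cpt[(0, 0, 0)])                 # no causes, no leak: 0.0
print(round(cpt[(1, 1, 1)], 2))       # 1 - 0.2*0.4*0.5 = 0.96
```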

We compare three approaches to learning numerical parameters of Bayesian networks from continuous data streams: (1) the EM algorithm applied to all data, (2) the EM algorithm applied to data increments, and (3) the online EM algorithm. Our results show that learning from all data at each step, whenever feasible, leads to the highest parameter accuracy and model classification accuracy. When facing computational limitations, incremental learning approaches are a reasonable alternative. Of these, online EM is reasonably fast, and similar to the incremental EM algorithm in terms of accuracy. For small data sets, incremental EM seems to lead to better accuracy. When the data size gets large, online EM tends to be more accurate. Copyright © 2013, Association for the Advancement of Artificial Intelligence. All rights reserved.