Probability Based Node Coloring

In designing an "explanation" for a graphical model, we can take advantage of the fact that the graph already provides a natural model of explanation: it describes which variables are directly related. Coloring the nodes of the graph according to their probabilities exploits this natural explanation.

Coronary Artery Disease Example

We have already seen how effective node coloring is in the context of the Diagnosis example. However, to explore the explanation features in more detail, we introduce a new example: a model intended to predict the risk of Coronary Artery Disease. (This model was created from data first collected by Detrano et al., as described below.)

Heart Example with Node Coloring

The variable "Health-State", which occupies the central portion of the figure, represents the state of coronary artery disease in the patients. (Note that in the source of this data it was measured by angiogram, an invasive procedure, so this is a model for fairly sick patients.) The node "Healthy?" in the upper left-hand corner represents a logical restriction of "Health-State" to a yes-no variable indicating the presence of coronary artery disease. The other variables are all observable, either by direct observation or as the result of a test.

The model starts with a set of generic data representing the study population (because it is the study population and not the general population, there is a relatively high baseline risk). We then modify the model to account for observations about a particular patient. In this case we have a 50 to 60 year old woman with high blood pressure and asymptomatic chest pain.

Positive and Negative State

Although the meaning of the colors was fairly clear in the reliability example, where all the variables were binary, variables with more than two states present a problem. Graphical-Belief allows the analyst to assign the label "Positive" to any subset of a variable's states, and the node is colored according to the probability that the variable falls in the set of positive states. The analyst can change the set of positive states during the analysis to try to understand the behavior of the model.
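The quantity that drives the coloring can be sketched in a few lines of Python. This is only an illustration, not Graphical-Belief's own code: the function names, the five state labels, the marginal probabilities, and the mapping from probability to a signed shade are all assumptions made for the example.

```python
def positive_probability(marginal, positive_states):
    """Sum the marginal probability over the states labeled Positive."""
    unknown = set(positive_states) - set(marginal)
    if unknown:
        raise ValueError(f"unknown states: {sorted(unknown)}")
    return sum(p for state, p in marginal.items() if state in positive_states)


def node_shade(p_positive):
    """Map a probability to a signed intensity in [-1, 1].

    +1 means certainly positive, -1 certainly negative, 0 evenly split.
    This linear mapping is an assumption, not the tool's documented scale.
    """
    return 2.0 * p_positive - 1.0


# Hypothetical marginal distribution for a multi-state node (made-up numbers).
health_state = {
    "none": 0.40,
    "mild": 0.25,
    "moderate": 0.15,
    "severe": 0.12,
    "very severe": 0.08,
}

# Only the complete absence of disease counts as Positive.
p_strict = positive_probability(health_state, {"none"})

# The analyst may relabel states during the analysis, which changes the color.
p_lenient = positive_probability(health_state, {"none", "mild"})

print(p_strict, node_shade(p_strict))
print(p_lenient, node_shade(p_lenient))
```

Changing the positive set changes only how the same marginal distribution is summarized; the underlying model is untouched.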

In this particular example, the highest risk outcome for each observable variable was labeled "Negative" and all others "Positive". For Health-State, only the complete absence of Coronary Artery Disease was considered Positive. From the picture we can see that the patient's gender and age are positive factors and that the patient's blood pressure and asymptomatic chest pain are negative factors.

While this picture gives us a good overview of what is happening now, it tells us nothing about the strength of evidence. For example, the resting blood pressure appears to be as strong a piece of evidence as the patient's age, even though observing age completely blocks the influence of blood pressure. To answer questions about strength of evidence, we can color the nodes by weight of evidence.
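As a rough illustration of what such a measure looks like, the sketch below assumes the standard definition of weight of evidence due to I. J. Good: the change in log odds of a hypothesis when a finding is observed. The function names, the base-10 logarithm, the scaling by 100 (centibans), and the probabilities are assumptions for the example, not a description of how Graphical-Belief computes its colors.

```python
import math


def log_odds(p, base=10.0):
    """Log odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return math.log(p / (1.0 - p), base)


def weight_of_evidence(prior, posterior, scale=100.0):
    """Change in log odds of the hypothesis, scaled (here to centibans).

    Equivalent to scale * log10(P(E | H) / P(E | not H)) by Bayes' rule.
    """
    return scale * (log_odds(posterior) - log_odds(prior))


# Made-up probabilities that the patient is healthy.
baseline = 0.50
after_age = 0.70            # after observing age alone
after_age_and_bp = 0.70     # blood pressure adds nothing once age is known

woe_age = weight_of_evidence(baseline, after_age)
woe_bp_given_age = weight_of_evidence(after_age, after_age_and_bp)

print(round(woe_age, 1), round(woe_bp_given_age, 1))
```

A finding whose influence is blocked by an earlier observation leaves the probability unchanged, so its weight of evidence is zero even if its node would be strongly colored by probability alone.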

Continue exploring explanation using Weight of Evidence Based Node Coloring.


The Graphical-Belief user interface is implemented in Garnet.

Get more information about obtaining Graphical-Belief (and why it is not generally available).

Get the home page for Russell Almond, author of Graphical-Belief.

Click here to get to the home page for Insightful (the company that StatSci has eventually evolved into).

Coronary Artery Disease Example Source

The coronary artery disease example comes from data first collected by:

Detrano R, Yiannikas J, Salcedo EE, Rincon G, Go RT, Williams G, and Leatherman J [1984]: ``Bayesian probability analysis: a prospective demonstration of its clinical utility in diagnosing coronary disease.'' Circulation, (69) 541--547.

and archived in:

Murphy, P.M. and Aha, D.W. [1992]: UCI Repository of Machine Learning Databases. Online database maintained at the Department of Information and Computer Science, University of California, Irvine, CA.

To fit the model shown above, we first arbitrarily chose cut points to make the continuous variables discrete. We then fit a graphical structure using the program CoCo and calculated the probability tables for the cliques in the graph using S-Plus.
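The discretization step can be sketched as follows. The original fitting was done with CoCo and S-Plus, so this Python fragment is only an illustration of the idea; the function name, the cut points, and the bin labels are hypothetical and are not the ones used to build the model.

```python
from bisect import bisect_right


def discretize(value, cut_points, labels):
    """Return the label of the bin that contains value.

    cut_points must be sorted in increasing order; a value equal to a
    cut point falls into the bin above it.
    """
    if list(cut_points) != sorted(cut_points):
        raise ValueError("cut_points must be sorted")
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(cut_points, value)]


# Hypothetical cut points for age in years (chosen for illustration only).
AGE_CUTS = [40, 50, 60]
AGE_LABELS = ["under 40", "40-49", "50-59", "60 and over"]

ages = [39, 40, 55, 60, 71]
binned = [discretize(age, AGE_CUTS, AGE_LABELS) for age in ages]
print(binned)
```

Because the cut points are arbitrary, it is worth checking how sensitive the fitted model is to moving them.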

The model fitting procedure is described in more detail in the technical report:

Almond, R.G. and Madigan, D. [1993]: ``Using GRAPHICAL-BELIEF to Predict Risk or Coronary Artery Disease.'' StatSci Research Report 19. This report works through a simple medical risk example fit to data from Detrano et al. [1989].

Russell Almond, <lastname> (at) acm.org

Last modified: Mon Aug 19 15:58:20 1996