Applied Statistical Theory: Belief Networks

Applied statistical theory is a new series that will cover the basic methodology and framework behind various statistical procedures. As analysts, we need to know enough about what we’re doing to be dangerous and explain approaches to others. It’s not enough to say “I used X because the misclassification rate was low.” At the same time, we don’t need to have doctoral level understanding of approach X. I’m hoping that these posts will provide a simple, succinct middle ground for understanding various statistical techniques.

Probabilistic grphical models represent the conditional dependencies between random variables through a graph structure. Nodes correspond to random variables and edges represent statistical dependencies between the variables. Two variables are said to be conditionally dependent if they have a direct impact on each others’ values. Therefore, a graph with directed edges from parent $A_p$ and child $B_c$ denotes a causal relationship. Two variables are conditionally independent if the link between those variables are conditional on another. For a graph with directed edges from $A$ to $B$ and from $B$ to $C$ , it would suggest that $A$ and $C$ are conditionally independent given variable $B$ . Each node fits a probability distribution function that depends only on the value(s) of the variables with edges leading into the variable. For example, the probability distribution for variable $C$ in the following graphic depends only on the value of variable $B$ .

Let’s consider a graphical model with $K = (k_1, k_2, ... , k_n)$ variables and a set of dependencies between the variables, $A = (a_1, a_2, ... , a_n)$ . For each $K$ and $A$ , we denote a set of conditional probability distributions for each $K$ given the parent variable. In the following directed acyclic graph, we see that $P(A|B,C) = P(A|B)$ . This means that the probability of $A$ is conditionally dependent only on $B$ and the value of $C$ does not explain the other random variables. For belief networks, inference involves computing the probability of each value of a node in a network.