Chapter 1

Written solutions to selected exercises in Bayesian Networks and Decision Graphs.

“See the Hugin network” refers to the .net files http://www.cs.auc.dk/~fvj/BNIDpage/Hugin_models/models.html

Chapter 1

Exercise 1.1.

2/3, 1/3, ½
Close to zero. Notice that the certainty resulting from the combined action is much smaller than the minimum of the effects of single actions.

Exercise 1.2. The structure is similar to the structure of the "Car Start" network

Exercise 1.3. In (a), all variables are d-connected to A. In (b), all variables except C and F are d-connected to A.

Exercise 1.4. Consider a path from A to B, and let C be the neighbour to A on that path. Then C is instantiated. If the C-connection is serial or diverging, then C blocks the path. If the link is converging, consider C’s neighbour, D, on the path. As A and D have the common child, C, D is in the Markov blanket, and must be instantiated. Since the direction on the (D, C) link is from D to C, the D-connection on the path from A through C and D cannot be converging, and therefore D blocks the path.

Exercise 1.5. (a), (b) and (c) are I-equivalent.

Exercise 1.6. P(A) = (0.2, 0.4, 0.4); P(B) = (0.3, 0.3, 0.4);

P(A| b₁) = (0.167, 0.5, 0.333), P(A| b₂) = (0.333, 0, 0.667), P(A| b₃) = (0.125,, 0.625, 0.25);

P(B| a₁) = (0.25, 0.5, 0.25), P(B| a₂) = (0.375, 0, 0.625), P(B| a₃) = (0.25, 0.5, 0.25).

Exercise 1.7

P(B, c₁) = (0.02, 0.08), P(B, c₂) = (0.18, 0.72), P(B) = 0.2, 0.8).
P(A| b₁, c₁) = (0.3, 0.7) = P(A| b₁, c₂), P(A| b₂, c₁) = (0.6, 0.4) = P(A| b₂, c₂).

Exercise 1.8

As B is d-separated from C given A we have P(B| A, C) = P(B| A).
P(A, B, C) = P(A)P(B| A)P( C| A). For example, P(A, b₁, c₁) = (0.01, 0.09)

Exercise 1.9

0.996
0.498

Exercise 1.10.

If A and C are independent given B then P(A | C, B) = P(A | B). We have P(A, C| B) = P(A| C, B)P(C| B) = P(A| B)P(C| B)

Exercise 1.12 See the Hugin network

Exercise 1.13 See the Hugin network

Exercise 1.14 The graph in Figure 1.20 is wrong in the first printing. It should be A->B->C. See the Hugin network

Exercise 1.15 See the Hugin network

Chapter 2

Exercise 2.2 See the Hugin network

Exercise 2.3 See the Hugin network

Exercise 2.4 See the Hugin network

Exercise 2.6 The system gives P(UT=y, BT=y) = 0.441

Exercise 2.8

i. See the Hugin network

ii. Out of 10.000 we very day get two new infected, and as the number shall be stable, two of the seven must be cured. This is not quite exact. The exact number is 0.9993 times 2/7

iii. The same (approximate) reasoning as before

Exercise 2.10 See the Hugin networks. Note that in (iii) the reference shall be to (ii) rather than (i).

Exercise 2.11 See the Hugin network

Exercise 2.12 See the Hugin networks

Exercise 2.13 See the Hugin network

Exercise 2.14 See the Hugin network

Exercise 2.16 See the Hugin network

Exercise 2.19 See the partially completed Hugin networks (it may be that Hugin Lite will not read the networks due to too many states).

Exercise 2.20 My computer did only allow four time slices

Exercise 2.21

See the network. As the probability of "Result= y" is positive, there are assignments of truth values making the expression true.
Insert A=n and B=n as evidence and propagate. As P(Result=y)>0, there assignments of the remaining variables making the expression true. If you insert Result=y and propagate you see that the assignments must be C=y, D=y, E=y and F=y.
Assume that E is in conjunctive normal form. Construct a Bayesian network by introducing a variable for each Boolean variable, a variable for each conjunct and a result variable. The parents of a conjunct variable are the Boolean variables in it, and the conjunct variables are related to the result variable through an "and"-relation. Give the Boolean variable a fifty-fifty prior. If the resulting probability of Result=y is positive, then there is at least one configuration over all variables – with Result=y – with a positive probability. Then this configuration is not inconsistent.
In iii) we have reduced the satisfiability problem to probability propagation in bayesian networks. The BN-representation is polynomial in length of the representation of the satisfiability problem.

Exercise 2.22

The system (Hugin) responds with an error message "Inconsistence or overflow"
The posterior probability distributions are P(Cold?| e) = (0.99, 0.01);

P(Angina?| e) = (0, 0.98, 0.02); P(e) = 3.60^.10^-6. Through entering of virtual evidence we calculate P(Angina?| Cold?, e). Multiplying this with P(Cold?| e) from above we get

...............Angina?

Cold?.....no......mild....severe

.....no-----0------0.97-----0.02

..... yes ---0------0.01------0

The posterior distributions after max-propagation are Max(Cold?| e) = (1, 0.01) and Max(Angina?| e) = (0, 1, 0.02). This shows that (Cold? = no, Angina? = mild) is the most probable posterior configuration.
Conf(e) = log₂( 0.82^.0.99^.0.002)-log₂(3.6^.10^-6) = log₂428.45 = 8.7. This indicates a strong conflict.
Let P(Angina? = mild, e) = x(s) = a s+b and P(e) = y(s) = as+b. Then P(Angina?| e) = x(s)/y(s).

To determine the constants we have from above: 0.01a+b = 3.60^.10^-6, and through entering the virtual evidence Angina?=mild we get 0..01a +b = 3.53^.10^-6.

We change the parameter s to 0.02, and we get 0.02a+b = 7.13^.10^-6 and 0.02a +b = 7.05^.10^-6. We now solve the two pairs of linear equations and get a=3.53^.10^-4, b=7^.10^-8, a =3.52^.10^-4, b =10^-8. We have P(Angina?=mild| e) = (3.52s + 0.0001)/(3.53s + 0.0007).

Chapter 3

Exercise 3.1. dist_Q(x, y) = å _ix_i(1-2y_i + å _jy_j²-1+2x_i - å _jx_j²) = å _ix_i(-2y_i + å _jy_j²+2x_i - å _jx_j²) = å _ix_i(-2y_i +2x_i) + å _ix_i( å _jy_j² - å _jx_j²)

As å _ix_i = 1, we get dist_Q(x, y) = å _jy_j² - å _jx_j²+ å _ix_i(-2y_i +2x_i) = å _iy_i² - å _ix_i²+ å _ix_i(-2y_i +2x_i) = å _i ( y_i² - x_i²- 2x_iy_i +2x_i² ) = å _i(x_i-y_i)²

Exercise 3.3. See the Hugin network for 12 candidate models. All other models are either too large or submodels of candidate models (if I have found them all)

Exercise 3.5

See the Hugin network
P(expert 1) = (0.2, 0.8). Use the network above
P(expert 1) = 0.02 and P(B=y| A=y) = 0.61. Use the network above.

Exercise 3.6

Sample size 22, P(B=y| A=y) = 15/22 = 0.68
Sample size 10.7, P(B=y| A=y) = 0.69

Exercise 3.7. By a mistake, a parameter too much is mentioned. Ignore the parameter u.

i. After adaptation, P(A) has the sample sizes (22½, 22½), and P(B| ~a) has the sample sizes (12, 8)

ii. The order of the cases is important. If for A the cases come a, ~a, a, ~a, …, the samples sizes will stay very close to (12½, 12½). If the cases come with first 10 a’s and then 10 ~a’s, the sample sizes will be (11.095, 13.905)

Exercise 3.8 See the Hugin network with parameters after the first step. For this situation we get P(c) = 0.84 and P(T=1| c) = 0.57. The formulas in Proposition 3.2 yield P(c) = -0.255t +1. We get P(a, c) = 0.36 and the formulas yield P(a| c) = -t+1. The partial derivative of P(a| c) with respect to t is 0.745/(1-0.255t)².

Chapter 4

Exercise 4.1

i) EU(Gd) = 14.7; EU(SB) = 15.5; EU(Dg) = 15.1

Let v_i denote the utility attached to the mark i. Then EU(Gd)-EU(Sb) = 0.1v₀-0.1v₃. Hence Gd can only be better than Sb if the mark 0 is given higher utility than the mark 3.

Exercise 4.2 See the Hugin network

Exercise 4.5

Myopic: See the Hugin network. Initilally we can read that the value of the initial situation is V(P(Pr)) = –12.35, P(BT) = (0.57, 0.43) and P(UT) = (0.65, 0.35). We can use the network to calculate the expected utility of the optimal action after each possible outcome of a test. We get V(P(Pr| BT=y) = -2.29, V(P(Pr| BT=n) = -25.7. Hence the expected value EV(BT) = -12.35. No change and it does not pay to perform a blood test. The same holds for the urine test. There is an easy way to see this: enter the possible test outcomes and see if the optimal action is changed. If the optimal action is the same no matter the outcome, there is no reason to perform the test.

Non-myopic: The best information basis will be to have both tests performed. If still the optimal action is the same no matter the outcomes of the two tests, then no test sequence can make me change the current optimal action (to wait). The best indication for repeating the insemination is a negative test result from both tests. Insert this to the model, and notice that the optimal action to wait.

Exercise 4.7 The optimal strategy is to choose a₃. If the observation is the upper branch then choose d₂, if the observation is the lower branch, then choose e₂. The expected utility of the strategy is 1.905.

Exercise 4.8

Take the probabilities from the Hugin network. The strategy is to test, and if the result is n then do not drill – else drill. The expected utility of the strategy is $ 22500.
Take the probabilities and values of the information situation from the Hugin network (the expected utility of the optimal action). The result is as for i).
See the Hugin network

Exercise 4.14 See the Hugin network

Exercise 4.15 See the Hugin network

Exercise 4.17 See the Hugin network

Exercise 4.18 See the Hugin frame-network

Exercise 4.20 See the Hugin network

Chapter 5

Exercise 5.4

3⁶ = 729
3⁴+3³+3³+3³+9 = 171
The elimination sequence F, D, E, B, C yields 4^.3³+9 = 117

Exercise 5.5 i) and ii) New links (fill-ins): (A₁, A₅), (A₂, A₄), (A₄, A₅).

Exercise 5.7 Elimination order: A₆, A₅, A₄, A₃, A₂, A₁

Exercise 5.9 ii) No. After elimination of H, G, F you are stuck.

Exercise 5.10

G and A
No. After eliminating G and A you are stuck

Exercise 5.13

The cliques are exactly the domains of the potentials.
(A₁, A₃, A₄) – (A₁, A₂, A₃) - (A₂, A₃, A₅) – (A₅, A₆)

Exercise 5.14

(A, B, C, D), (A, B, D, E), (B, C, D, F)
(A, B, D, E) - (A, B, C, D - (B, C, D, F)

Exercise 5.15

The cliques are (A, B, C, D), (D, F, H), (D, I), (D, J), (C, E), (E, G, K). This is exactly the domains of the potentials of the non-top variables Link (A, B, C, D) with (D, F, H), (D, I), (D, J) and (C, E). Link (C, E) with (E, G, K).

Exercise 5.17

Take the order by collecting to (A, B, C, D) and afterwards distributing from it. In the collect phase all messages are unit potentials (1) except: (D, I) transmits P(i| D) and (C, E) transmits P(e| C) to (A, B, C, D). In the distribute phase only (A, B, C, D) transmits non-constant potentials: {å _{A, B , C}P(D| A, B, C)P(A)P(B)P(C), P(i| D)} to (F, D, H) and to (D, J), å _{A, B , C}P(D| A, B, C)P(A)P(B)P(C) to (D, I) (which is actually pointless), and {å _{A, B , D}P(D| A, B, C)P(A)P(B)P(i| D), P(C)} to (C, E).

Exercise 5.18

Links: (A, B, C)–[ C]-(C, D, E), (C, D, E)-[D, E]-(D, E, G), (C, D, E)-[C, E]-(C, E, F), (C,E , F)-[E, F]-(E, F, H)

Exercise 5.20

Let R be the first node to receive all of its messages. Then all messages passed so far must have been sent in the direction of R (Suppose not, and assume that V has sent a message away from R. At that instance V must have received a message from the R-direction. When R has received all messages, it must have received a message from the V-direction. This means that prior to this, V has sent a message in the R-direction, and at that time V has received all of its messages). Hence the transmission until R has received all its messages corresponds to a CollectEvidence to R. The remaining messages correspond to a DistributeEvidence from R.

Exercise 5.21

Add a fill-in (A₁, A₄) or (A₂, A₆). The is first the result of the elimination sequence A₃, A₅, A₇, A₈, A₂, and the latter is the result of the sequence A₃, A₅, A₇, A₈, A₁, …

Exercise 5.22

.
The elimination sequence is F, J, B, D, A, E, C, …

Exercise 5.23 A better elimination sequence is A, C, H, I, J, D, E, …

Chapter 6

Exercise 6.3

As Hugin does not support variable propagation we have to perform the calculation differently. Without "thinking" you can as follows: add an offspring variable for each possible couple, propagate and read the probabilities. See the Hugin network.
Sum-propagation yields that the following horses have a larger probability of being carrier than pure: Ann, Dorothy, Gwen, Henry, Irene. Max-propagation yields a configuration with the following carriers: Brian, Dorothy, Eric, Henry, Irene
The probabilities are fractional expressions of polynomial over l in high degrees, and we discard the question
Note that the question has been changed: e_J: John is sick, e_A: Ann is pure, e_B: Brian is pure, e_C: Cecily is a carrier. We get conf(e) = 1.34, and further analysis show that e_J is in conflict with {e_A, e_B, e_C}. The probabilities for the variables L, Fred, Eric, Henry and Irene jump much, and L, Fred and Henry can "explain" the conflict. It takes more to conclude that a bad gene has been imported through L.

Exercise 6.4

Using max-prop we see that the most probable word transmitted is baaba.

Exercise 6.5 The parameter t can be modelled explicitly as shown in the Hugin network, where the two values 0 and 1 are given prior probabilities ½. By entering the evidence and 0 we get P(e, 0) = 0.00027059. As P(0) = ½ we have P(e)(0) = P(e| 0) = 0.000540118. We get P(Besthand=myhand| e)(0) = 0.276. By entering the evidence 1 we get P(e)(1) = 0.0055625 and P(Besthand=myhand| e)(1) = 0.9297.

Using P(e)(0) = b and P(e)(1) = a +b we determine P(e)(t) = 0.0050224t+0.000540118

Using P(Besthand=myhand| e)P(e) = P(Besthand=myhand, e) we also determine P(Besthand=myhand, e)(t) = 0.0010332t+0.00050211.

We solve the equation 0.67 = P(Besthand=myhand, e)(t)/P(e)(t) and get t = 0.128