I think this is doable. Say we assign a win rate W(S) to each board state S, and let W(S, A) denote the win rate after taking action A from state S. Since the transition is probabilistic, we can write:
max() is troublesome, but we can replace it with a >= sign:
W(S) >= (W(S, A), forall A in Actions)
And then we can expand W(S, A) and move W(S) in each of the inequalities to the left hand side. After all that we will have a bunch of linear inequalities which can be optimized with linear programming. I think the objective would just be:
Yep, this is what I ended up doing as well! With how the game generate boards, the player that goes first always have a ~5% advantage. Since players switch hands each around they should have 50% win rate if both play optimally.
In practice, playing against author's AI I barely get ~60% win rate (small caveat, I count ties as 0.5 to both players). What about yours?
I think you have an error in the equation defining V(s).
You have component n_c * V(s) for the 'nothing happened' case, but I don't think that's correct. If you rolled that nothing happens the turn still passes to your opponent, so I think it should be n_c * V'(s).
W(S, A) = P(good) * (1 - W(S_good)) + P(bad) * (1 - W(S_bad)) + (1 - P(good) - P(bad)) * (1 - W(S))
And obvisouly:
W(S) = max(W(S, A), foreach A in Actions)
max() is troublesome, but we can replace it with a >= sign:
W(S) >= (W(S, A), forall A in Actions)
And then we can expand W(S, A) and move W(S) in each of the inequalities to the left hand side. After all that we will have a bunch of linear inequalities which can be optimized with linear programming. I think the objective would just be:
<del>maximize</del> minimize W(empty_board)