Undoubtedly the release number will continue to increment as I find and fix little bugs in my work on the AI improvements branch. Either way, v0.4.2.x is now the stable branch, carrying with it two major features: variations in replays, and some additions to the stable of rules.
First, variations: the headline feature for v0.4.x, variations provide players the ability to play out alternate histories in replay mode. These alternate histories can be saved and loaded, as well as annotated (and, annotations now have a handy new in-game editor for ease of production).
Second, rules: I finally got around to implementing the last two rules preventing OpenTafl from playing almost all known tafl variants. Not coincidentally, OpenTafl now supports every rule supported by PlayTaflOnline.com, so the bot player there can compete on every front. Poorly.
That brings me to my last point for today: AI improvements. What’s on the list? Well, I have a few things on my list.
First thing’s first: I have to characterize OpenTafl’s particular failings. The ones I can most easily fix rest in the evaluation function. If the evaluation function provides inaccurate evaluations of board states, then all the rest—heuristics and fancy pruning alike—rests on an unsteady foundation. The analysis engine feature aids me in this quest. My workflow for this initial phase goes like this: I play a game against the AI, using the analysis engine to feed me the AI’s moves. When the AI makes a move which is obviously bad, I go to the replay, then tell the analysis engine to dump its evaluations for the states in question. By comparing those evaluations with evaluations of the move I would prefer, I can begin to see what the AI sees.
I’ve already made two interesting discoveries regarding AI weights and preferences. First, it places far too much importance on guarding or attacking the king, to the point that the attacker AI will happily sacrifice piece after piece if only it puts the king in ‘check’. Second, flowing out of the first item, it has a badly inaccurate view of what constitutes threatening an opposing piece. When it decides to threaten an enemy piece, it will happily do so by moving one of its own pieces into a position where it can immediately be recaptured. Oops.
So, once I’ve made changes to fix those mistakes, how do I verify that I’ve made a positive difference? I play the AI against old versions. Thankfully, building older versions of OpenTafl is trivial. (If you’ve been following development, you’ll no doubt have noticed that there’s a Mercurial tag for every release.) I have a little tool which is intended to do Elo ratings for chess clubs, but which will serve to do Elo ratings for OpenTafl versions just fine. This helps quantify not only whether a version is better than another version, but by how much.
Once I have the evaluation function a little more settled, I can move onto some extra heuristics. I have two in mind for the moment: the killer move heuristic, and the history heuristic (or some related heuristic). The killer move heuristic is the more promising one. It assumes that most moves don’t change the overall state of the board too much, and there are likely to be, at a given depth in the tree, only a few plausible moves to push the evaluation in your direction. Therefore, if any of those moves are possible at that depth, the AI should try them first.
The history heuristic is more complicated. There are three variations I’m considering. First, the straight history heuristic, which orders moves by how often they’ve caused a cutoff. This prefers, in the long run, making moves which have been good elsewhere in the tree. Straightforward, compared to the others.
Second, the butterfly heuristic, which orders moves by how often they occur anywhere in the tree. This prefers making moves which are frequently considered. This one is a little more subtle. Alpha-beta search, by its very nature, ends up searching only the most interesting moves, pruning away the rest. The butterfly heuristic, by tracking moves which turn up a lot, is essentially tracking which moves are more interesting.
Finally, the countermove heuristic, which tracks which moves are the best responses to other moves, and weights moves matching the known good countermove more heavily. This one requires very little explanation.
So, in the next month or two, I expect the strength of OpenTafl’s AI to improve considerably. Stay tuned!
