On tafl: time use strategy

I wrote last month on game clocks for tafl games, and settled on something like the go game clock: main time, plus replenishing overtimes. You can find my reasoning at that link, so I won’t go into it here, rather taking the opportunity to chat about how to use the time that game clock gives you.

In undertaking any previously un-undertaken task in AI development, this question usually has an instructive answer: “How does chess do it?” Unfortunately, this case is one of the ones where the answer is a little less instructive. A perfectly functional, if not altogether optimal, time usage strategy for a chess AI is to assume that, however many moves the game has taken so far, finishing it will take 20 to 40 more moves. The AI should therefore use between 1/20 and 1/40 of the time remaining. Simple, and surprisingly effective: a chess game gets less complex over time, as pieces come off of the board.

No such luck with tafl games, though. A tafl game starts at moderate complexity, then grows significantly in complexity into the middle game, before tailing off again as the action focuses on a corner and as captures are made1. So, for fixed-time games, we want to spend a moderate amount of time on the early game, the bulk of the time in the middle game, and a decreasing amount of time in the endgame, conserving it as long as possible. For fixed-time games, OpenTafl, in its current incarnation, assumes that it’ll want to make a certain number of opening game moves (3, 5, or 10, for board sizes 7, 9, and 11), a certain number of midgame moves (6, 10, or 20), and an indeterminate number of endgame moves after the midgame. It dedicates a fixed amount of time (at present, 5%, but probably likely to increase) to finish all of its opening moves, a much longer amount of time (presently 75%) to the midgame moves, and a constantly decreasing amount of time to its endgame moves, with the remaining 20%. This seems functional enough in play with me, but the numbers will obviously be tweaked once I finish automatic self-play. (A feature which will be in 0.2.x, I’m pretty sure, given how handy it is for AI developers.)

Four hundred words in, though, and I haven’t yet covered the topic I said I wanted to cover to begin with: how do we handle overtime timing? It turns out that it’s simpler than fixed time, even: take the number of midgame moves above, use the main time for them, and use overtimes for all remaining moves. This, too, may be non-ideal—I could see some benefit to using the fixed-time framework for early game and midgame, and falling back on the overtimes only for endgame moves2.

There are some extra features I hope to add at some point: a little bit of history awareness, so that, whenever OpenTafl sees a wild swing in expected value for the current state, it can dedicate extra time to a deeper search. Another useful addition would probably be some slightly more adaptive time usage generally, with the ability to understand when the game has moved from the early game to the midgame, so that it can adjust its thinking time based on the state of the board, rather than fixed numbers of moves. That said, it’s heartening to see that my system for main time plus overtime time usage is basically what AlphaGo used in its games against Lee Sedol3.

Finally, and a little tangentially, I want to talk about OpenTafl’s success against me on various time controls. Now, I am a poor to average tafl player at best, so you should not take this paragraph to mean that I think OpenTafl has much of a chance against anyone good. Keep that caveat in mind. OpenTafl generally loses to me, and loses pretty severely, on longer time controls: this is a result of its flawed evaluation function4, which, along with alpha-beta pruning, frequently gets rid of branches that are ultimately better than the one it chooses to explore most deeply. The matches are closer with blitz timing: a 60-second brandub game proved to be pretty intense, and I only won with six seconds left on the clock. The shorter time control plays to the computer’s strengths: exhaustive reading three or four plies out, and it gets much closer to me in skill (in part because I can’t read exhaustively in such a small amount of time).

Anyway. That’s your inside look at the most recent major OpenTafl feature. Stay tuned for more in the coming weeks.

1. I should note that by ‘complexity’ here I’m referring mainly to branching factor, the quantitative measure, mixed with a little bit of my sense for what proportion of moves are worth considering, a much more qualitative one, and, for that matter, a much more unreliable one.
2. Although it may strike you that 30 moves for an 11×11 game seems a bit light, consider that it’s actually 30 turns, or 60 moves total: for most 11×11 games, the game has either ended by then, or we’re well into the midgame or late game.
3. I got sleepy just thinking about being up that late. It is my sincere hope that when DeepMind announces its match against Ke Jie or whoever’s next, that they agree to play in London. Or maybe New York. Somewhere closer to Eastern Time, please.
4. A topic I intend to go into deeper detail on in a later post, when I’m working on some cleanup and improvement.

Leave a Reply