Tag Archives: coding

On tafl: OpenTafl Notation

I mentioned in a previous post that one of the practical difficulties involved in computer implementations of tafl is the lack of a support infrastructure: anyone desiring to write a tafl AI must also write a tafl engine. Admittedly, this probably isn’t a significant obstacle to anyone with a real desire to write a tafl AI, but it does yield a great deal of duplication of effort. This is why my next step with OpenTafl is to build it into an engine into which other AIs can be plugged, and my eventual aim, hopefully at the end of this year, is to host the world’s first computer tafl tournament, where all the AIs so far written can compete against one another for bragging rights1. The timing depends mainly on how quickly I can write the features I need, but I hope to run the 1st Many Words Computer Tafl Open in late December2.

To do this, though, three sorts of notation are required. First, a method for communicating positions; second, a method for communicating rules; and third, a method for communicating moves. (A method for communicating moves already exists, thanks to tafl historian Damian Walker, but I propose some enhancements to it later.) The full version of my proposal is included at the end of this post.

First off, a method for positions: I propose something akin to chess’s Forsyth-Edwards notation, which is easily machine-readable, but also human-readable and easy to compose by hand. Here’s an example, representing the brandub starting position:

/3t3/3t3/3T3/ttTKTtt/3T3/3t3/3t3/

Enclosed by slashes are rows. Lowercase letters represent pieces from the attacking side. Uppercase letters represent pieces from the defending side. (Besides t: taflman and k: king, my proposal also includes n: knight and c: commander, for Berserk-variant tafl.) Numbers represent that many empty spaces.
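The parsing rules above are simple enough to sketch directly. Here's a minimal, illustrative parser for OTN position strings (the class and method names are my own, not OpenTafl's), which expands each row into one character per space, with `.` standing in for an empty space:

```java
import java.util.ArrayList;
import java.util.List;

public class PositionNotation {
    // Parse an OTN position string like "/3t3/3t3/3T3/ttTKTtt/3T3/3t3/3t3/"
    // into a list of row strings: one character per occupied space, '.' for
    // each empty space. Lowercase = attacker, uppercase = defender.
    static List<String> parse(String otn) {
        List<String> rows = new ArrayList<>();
        for (String row : otn.split("/")) {
            if (row.isEmpty()) continue;   // skip the leading/trailing slashes
            StringBuilder sb = new StringBuilder();
            int i = 0;
            while (i < row.length()) {
                char c = row.charAt(i);
                if (Character.isDigit(c)) {
                    // Empty-space runs may be multi-digit on large boards (e.g. "11")
                    int j = i;
                    while (j < row.length() && Character.isDigit(row.charAt(j))) j++;
                    int empties = Integer.parseInt(row.substring(i, j));
                    for (int k = 0; k < empties; k++) sb.append('.');
                    i = j;
                } else {
                    sb.append(c);   // a taflman character, kept as-is
                    i++;
                }
            }
            rows.add(sb.toString());
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : parse("/3t3/3t3/3T3/ttTKTtt/3T3/3t3/3t3/"))
            System.out.println(row);
    }
}
```

Run against the brandub string above, this prints seven rows, with the middle row reading `ttTKTtt`.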

To communicate rules, there’s very little choice but to do so exhaustively. Clever use of defaults can somewhat alleviate the issue. Here is a proposed OTN rules string for Fetlar tafl:

dim:11 atkf:n start:/3ttttt3/5t5/11/t4T4t/t3TTT3t/tt1TTKTT1tt/t3TTT3t/t4T4t/11/5t5/3ttttt3/

It uses the dimension element (dim:11) and the start element (using an OTN position string) to mark the start and end of the rules; other elements come in between. The defaults suggested in the proposal, with the exception of atkf (attackers move first), are the rules of Fetlar, and other tafl games may be constructed as transformations from Fetlar. For instance, here’s brandub again:

dim:7 ks:n cenhe:n cenh: cenp:tcnkTCNK start:/3t3/3t3/3T3/ttTKTtt/3T3/3t3/3t3/

Brandub is Fetlar, except it’s played on a 7×7 board (dim:7), the king is not strong (ks:n), the center does not become hostile to the defenders when empty (cenhe:n), the center is not hostile to anybody (cenh:), and pieces of all types may pass over the center space (cenp:tcnkTCNK). Here’s Sea Battle 9×9:

dim:9 esc:e ka:n cenh: cenhe:n corh: cenp:tcnkTCNK corp:tcnkTCNK cens:tcnkTCNK cors:tcnkTCNK start:/3ttt3/4t4/4T4/t3T3t/ttTTKTTtt/t3T3t/4T4/4t4/3ttt3/

Sea Battle is Fetlar, except it’s played on a 9×9 board, escape is to the edge (esc:e), the king is not armed (ka:n), the center and corners are hostile to nobody, all pieces may move through the center and the corners, and all pieces may stop on the center and the corners.
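Since a rules string is just a run of whitespace-separated `element:value` pairs, reading one is a matter of splitting on spaces and then on the first colon, taking care that an element with an empty value (like `cenh:`) maps to the empty string. A rough sketch, with names of my own invention:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RulesNotation {
    // Split an OTN rules string into element -> value pairs. An element with
    // an empty value (e.g. "cenh:") maps to the empty string. Elements not
    // present fall back to the proposal's defaults (the rules of Fetlar).
    static Map<String, String> parse(String rules) {
        Map<String, String> elements = new LinkedHashMap<>();
        for (String token : rules.trim().split("\\s+")) {
            int colon = token.indexOf(':');
            if (colon < 0) continue;  // malformed token; skip it
            elements.put(token.substring(0, colon), token.substring(colon + 1));
        }
        return elements;
    }

    public static void main(String[] args) {
        Map<String, String> brandub = parse(
            "dim:7 ks:n cenhe:n cenh: cenp:tcnkTCNK start:/3t3/3t3/3T3/ttTKTtt/3T3/3t3/3t3/");
        System.out.println(brandub.get("dim"));           // the board dimension
        System.out.println(brandub.get("cenh").isEmpty()); // center hostile to nobody
    }
}
```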

Of the tafl variants of which I am aware, the options in the rules string are sufficient to support all but one very esoteric alea evangelii variant3, and OpenTafl can theoretically support any variant constructed using OTN rules strings4. Among its other notable capabilities, the notation can define certain spaces as attacker fortresses, in support of hostile attacker camps.

Finally, we move on to move notation. Building on Damian Walker’s aforementioned, excellent adaptation of algebraic chess notation to tafl rules (which, I should stress, is more than sufficient), I wanted to create a notation which requires a minimum of inference, and is therefore slightly better suited to inter-computer communication. In effect, I wanted a notation detailed enough that a computer, operating with only a starting position and a list of moves, could accurately replay the game without having to know any of the rules. Since it is a notation with a lot of options, I provide a deeply contrived sample move record here, which uses almost every option:

Ne4-e6xf6 Ne6^=e8xe7/e9/cf8/d8-

The defending knight (uppercase N) on space e4 moves to e6, capturing the piece on f6 (having no special letter, it’s just an ordinary taflman). This is berserk tafl, so the move is not yet over. The knight on e6 jumps to e8 as its berserk move (^: jump, =: berserk), capturing the taflmen on e7, e9, and d8, and the enemy commander on f8 (as a special piece, it gets a prefix; as an attacking piece, it’s lowercase). This opens up a path for the king to reach the edge (-: king has a way to the edge).
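For the simple (non-berserk) form of a move record, the pieces fall out mechanically: an optional piece-type prefix, an origin, a destination, and an `x`-prefixed, slash-separated capture list. Here's a rough sketch of that much (my own names throughout; the berserk markers `^` and `=` and the trailing king-escape flag would need further handling):

```java
public class MoveNotation {
    // Split a simple OTN move record like "Ne4-e6xf6" into its parts.
    // A coordinate starts with a file letter followed by a rank digit, so if
    // the record's second character is a digit, there is no piece-type prefix.
    // Returns {prefix, origin, destination, captures}.
    static String[] parse(String record) {
        int split = Character.isDigit(record.charAt(1)) ? 0 : 1;
        String prefix = record.substring(0, split);
        String[] moveAndCaptures = record.substring(split).split("x", 2);
        String[] fromTo = moveAndCaptures[0].split("-");
        String captures = moveAndCaptures.length > 1 ? moveAndCaptures[1] : "";
        return new String[]{prefix, fromTo[0], fromTo[1], captures};
    }

    public static void main(String[] args) {
        String[] parts = parse("Ne4-e6xf6");
        System.out.println("piece=" + parts[0] + " from=" + parts[1]
            + " to=" + parts[2] + " captures=" + parts[3]);
        // piece=N from=e4 to=e6 captures=f6
    }
}
```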

You can read more, and peruse more examples, in the full proposal here: OpenTafl Notation. Feel free to leave a comment with your, well, comments.

1. And maybe a $20 Amazon gift card or something.
2. I hope to finalize the protocol in the next few months, and finish implementing it by the end of summer.
3. The one with mercenaries, pieces which start on the defending side but change sides when captured.
4. I haven’t yet written the code to parse them and generate new games, so I can’t say for sure. Thinking through some oddball variations, though, none strike me as OpenTafl-breaking.

On tafl: what next?

In my last tafl post, I remarked that I was nearing the end of my current OpenTafl sprint. Now, it’s wrapped up. Here’s what I finished:

  1. Write about 70% of a new evaluation function. It’s designed to easily accommodate tweaks and modifications, which will undoubtedly be required as I learn more about the game and uncover positions that the current function evaluates incorrectly.
  2. Move OpenTafl into its own repository, so I can publish the source.
  3. Package and test OpenTafl for binary distribution.
  4. Rewrite exploration to generate and sort moves prior to generating states.

All of these are done, and OpenTafl downloads and source releases can be found at the Many Words Softworks site for OpenTafl. Play and enjoy. For questions, comments, or bug reports, please leave a comment, or drop me an email. (My contact information is in the About section at the main page.)

Here’s what’s on the list for the next sprint:

  1. Support for easy position entry. This will require creating some sort of short-form positional notation, akin to FEN from the chess world.
  2. External AI support, a la several chess engines. This will require defining a protocol by which OpenTafl can communicate with an external AI, including communicating things like the rules of the variant in question.
  3. Partial time control, so that OpenTafl can be limited to a certain number of seconds per turn.
  4. Some search improvements and tweaks. The prior three features are important stepping stones to this one; they’ll let me test OpenTafl against itself by quick-play tournaments, and therefore verify what sort of impact my changes are having.

As you can see, the next sprint is less about the nitty-gritty AI research and more about foundational work in developing OpenTafl as an engine for future AI research—my own and others. Ultimately, that’s a more important task.

On tafl: more optimizations

Having spent quite a lot of time on another code project, it’s high time I got back to a little tafl.

Remember where we left off: board states represented as arrays of small integers, and taflman objects inflated as needed during game tree search. It turns out that inflating and deflating taflmen is also a potential source of slowness, so I went ahead and removed that code from the game. (It also greatly simplifies the process of moving from turn to turn.) Wherever I would pass around the old taflman object, I now pass around a character data type, the sixteen bits I described in the previous tafl post.

This revealed two further slowdowns I had to tackle. The first, and most obvious, was the algorithm for detecting encirclements. For some time, I’ve known I would need to fix it, but I didn’t have any ideas until a few weeks ago. Let’s start with the old way:


for each potentially-surrounded taflman:
  get all possible destinations for taflman
  for each destination:
    if destination is edge, return true
return false

Simple, but slow. In the worst-case scenario, each taflman has to consider every other space on the board, and look briefly at every other space from each rank and file (where n is the number of taflmen and b is the board dimension):

O(n \times b^2 \times 2b) = O(nb^3)

I was able to cut this down some by running an imperfect fast check: if, on any rank or file, the allegedly-surrounded side has the taflman nearest either edge, that side is not surrounded. That had implications for a better algorithm, and sure enough, I hit on one.


list spaces-to-explore := get-all-edge-spaces
list spaces-considered := []
for space in spaces-to-explore:
  add space to spaces-considered
  get neighbors for space not in spaces-to-explore and not in spaces-considered
  for each neighbor:
    if neighbor is occupied:
      if occupier belongs to surrounded-side:
        return false
      else continue
    else:
      add neighbor to spaces-to-explore
return true

More complicated to express in its entirety, but no more difficult to explain conceptually: flow in from the outside of the board. Taflmen belonging to the surrounding side stop the flow. If the flow reaches any taflman belonging to the allegedly surrounded side, then the allegedly surrounded side is, indeed, not surrounded. This evaluation runs in a much better amount of time:

O(b^2)

Astute readers might note that, since the board dimension is never any larger than 19 at the worst, the difference between a square and a cube in the worst-case runtime isn’t all that vast. Remember, though, that we need to check to see whether every generated node is an encirclement victory, since an encirclement victory is ephemeral—the next move might change it. Running a slightly less efficient search several hundred thousand to several million times is not good for performance.
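The flood-fill pseudocode above translates almost directly into Java. Here's a minimal sketch (names and the cell encoding are my own, for illustration): cells are 0 for empty, 1 for the surrounding side, 2 for the allegedly surrounded side.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class Encirclement {
    // Flood fill inward from the board edges. Surrounding-side taflmen stop
    // the flood; if the flood ever reaches a surrounded-side taflman, there
    // is no encirclement. Each space is visited at most once: O(b^2).
    static boolean isSurrounded(int[][] board) {
        int dim = board.length;
        boolean[][] seen = new boolean[dim][dim];
        Deque<int[]> toExplore = new ArrayDeque<>();
        for (int i = 0; i < dim; i++) {        // seed with every edge space
            toExplore.push(new int[]{i, 0});
            toExplore.push(new int[]{i, dim - 1});
            toExplore.push(new int[]{0, i});
            toExplore.push(new int[]{dim - 1, i});
        }
        while (!toExplore.isEmpty()) {
            int[] s = toExplore.pop();
            int r = s[0], c = s[1];
            if (r < 0 || r >= dim || c < 0 || c >= dim || seen[r][c]) continue;
            seen[r][c] = true;
            if (board[r][c] == 2) return false;  // flood reached the surrounded side
            if (board[r][c] == 1) continue;      // a surrounding taflman stops the flood
            toExplore.push(new int[]{r + 1, c}); // empty space: flow onward
            toExplore.push(new int[]{r - 1, c});
            toExplore.push(new int[]{r, c + 1});
            toExplore.push(new int[]{r, c - 1});
        }
        return true;   // the flood never touched the surrounded side
    }
}
```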

I found another pain point in the new taflman representation. The new, array-based board representation is the canonical representation of a state, so to find a taflman in that state, the most obvious move is to look through the whole board for a taflman matching the right description. Same deal as with encirclement detection—a small difference in absolute speed makes for a huge difference in overall time over (in this case) a few tens of millions of invocations. I fixed this with a hash table, which maps the 16-bit representation of a taflman to its location on the board. The position table is updated as taflmen move, or are captured, and copied from state to state so that it need be created only once. Hashing unique numbers (as taflmen representations are) is altogether trivial, and Java’s standard HashMap type provides a method for accessing not only the value for a given key, but also a set of keys and a set of values. The set of keys is particularly useful. Since it’s a canonical list of the taflmen present in the game, it saves a good bit of time against looping over the whole of the board when I need to filter the list of taflmen in some way.
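In outline, the position table looks something like the following sketch (field and method names are illustrative, not OpenTafl's actual API); the key set serves double duty as the roster of surviving taflmen:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class TaflmanPositions {
    // Map each 16-bit taflman representation to its board index, so finding
    // a taflman doesn't require scanning the whole board. Updated as taflmen
    // move or are captured, and copied from state to state.
    private final Map<Character, Integer> positions = new HashMap<>();

    void place(char taflman, int space)  { positions.put(taflman, space); }
    void move(char taflman, int toSpace) { positions.put(taflman, toSpace); }
    void capture(char taflman)           { positions.remove(taflman); }
    Integer locate(char taflman)         { return positions.get(taflman); }

    // The key set doubles as the canonical list of taflmen still in the game,
    // saving a loop over the whole board when filtering taflmen.
    Set<Character> taflmen() { return positions.keySet(); }
}
```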

Those are all small potatoes, though, next to some major algorithmic improvements. When we last left OpenTafl, the search algorithm was straight minimax: that is, the attacking player is the ‘max’ player, looking for board states with positive evaluations, and the defending player is the ‘min’ player, looking for the opposite. Minimax searches the entire game tree.

Let’s look at an example, a tree of depth 3 with max to play and a branching factor of 3, for 9 leaf nodes. The leaf nodes have values of 7, 8, and 9; 4, 5, and 6; and 12, 2, and 3. We can express the minimax perspective on this tree mathematically:

max(min(7,8,9), min(4, 5, 6), min(12, 2, 3))

That is to say, max makes the move which gives min the least good best response. Remember, though, that we search the tree depth-first, which means that in the formula above, we move from left to right. This leaves us with some room for cleverness.

Consider this:

max(min(7,8,9), min(4, x, y), min(12, 2, z))

Does it matter what the numbers at x, y, and z are? No: by that time, we’ve already established that the subtrees in which they appear (with best possible values 4 and 2) are worse than the first subtree (with known value 7). Therefore, we can stop exploring those subtrees. This is known as ‘pruning’.

Pruning is good, because, when done correctly, it saves us work without dropping good moves. Alpha-beta pruning, the kind described above, is known to produce identical results to minimax, and, in the ideal case, can search twice as deep. (So far, it’s only given me one extra ply, though; the case isn’t quite ideal enough for more.) Other kinds of pruning may find their way in later, and will take some more careful thinking to avoid pruning away potentially good branches.
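The example tree above is small enough to run alpha-beta on directly. In this toy version (max at the root, min one level down, written by me for illustration), a min subtree is abandoned as soon as its running value falls to the best value max already has in hand:

```java
public class AlphaBeta {
    // Alpha-beta on the two-level tree from the text:
    // max(min(7,8,9), min(4,5,6), min(12,2,3)).
    // Once a min subtree's running value drops to alpha or below, the rest
    // of that subtree can't matter to max, so it is pruned.
    static int search(int[][] tree) {
        int alpha = Integer.MIN_VALUE;      // best value max has found so far
        for (int[] subtree : tree) {
            int value = Integer.MAX_VALUE;  // min's running best in this subtree
            for (int leaf : subtree) {
                value = Math.min(value, leaf);
                if (value <= alpha) break;  // prune: max already does better elsewhere
            }
            alpha = Math.max(alpha, value);
        }
        return alpha;
    }

    public static void main(String[] args) {
        int[][] tree = {{7, 8, 9}, {4, 5, 6}, {12, 2, 3}};
        System.out.println(search(tree));  // 7
    }
}
```

On this input, the second subtree is cut off after the 4 and the third after the 2, exactly the x, y, and z positions above, and the answer is still 7, the same as full minimax.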

What is the ideal case, then? We want to explore the best moves first. A cursory examination of the example case above provides the reasoning. If we explore the subtrees in opposite order, we can do no pruning, and we get no speedup. This presents us with a chicken-and-egg problem: if we could easily and programmatically determine which moves are good, we wouldn’t need to bother with variants on full-tree search at all.

Fortunately, the structure of a tree gives us an easy out. Most of the nodes in any given tree are at the bottom. Consider a simple example, with a branching factor of two. At the first level, there’s one node; at the next, two nodes; at the third, four nodes; and at the fourth, eight nodes. In every case, there are more nodes at the deepest level than at all the levels above combined. The difference becomes more pronounced with greater branching factors: a tree with a branching factor of 30 (which approximates brandub) has a distribution of 1/30/900/27,000. So, it follows that the large majority of the work in a game tree is expanding the game states at the target search depth, and evaluating those states; the work higher up the tree is almost trivial.

Take that and add it to this claim: the best move at any given depth is likely to have been one of the best moves at the immediately prior depth, too. We can use those facts together to figure out the best ordering for our moves at our target depth: search to the depth prior, keep track of the best moves at that depth, and search them first. This approach is called ‘iterative deepening’: generally, you start at depth 1, and use results from previous levels to order each node’s children. OpenTafl accomplishes this with a structure I’m calling a deepening table, which is generated for each search. The deepening table is a list of maps, one per depth. The maps relate a position hash to an evaluation done at its depth. On each deepening iteration, the exploration function generates successor states, then sorts them according to the data stored in the deepening table before exploring further. (There’s some easy performance left on the table here—since creating a full-on successor state is a non-trivial operation, I can probably buy myself some easy speedup by generating moves alone, sorting those based on the positional hashes which would result from making those moves, and only generating states immediately before exploring them. This would probably save a good bit of time by never generating states which would have otherwise been cut off by alpha-beta pruning.)
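Structurally, a deepening table amounts to something like this sketch (my own rendering of the description above, not OpenTafl's actual source): a list of hash-to-evaluation maps, plus an ordering step that puts the best-known children first and unseen children last.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeepeningTable {
    // One map per depth; each maps a position hash to the evaluation found
    // for it on the previous, shallower iteration.
    private final List<Map<Long, Integer>> tables = new ArrayList<>();

    Map<Long, Integer> atDepth(int depth) {
        while (tables.size() <= depth) tables.add(new HashMap<>());
        return tables.get(depth);
    }

    // Sort child position hashes so the best-known-so-far come first and
    // unseen positions sort last. The max player wants descending evaluations.
    void orderChildren(List<Long> childHashes, int depth, boolean maxPlayer) {
        Map<Long, Integer> known = atDepth(depth);
        Comparator<Long> byEval = Comparator.comparing(
            (Long h) -> known.getOrDefault(h, maxPlayer ? Integer.MIN_VALUE
                                                        : Integer.MAX_VALUE));
        childHashes.sort(maxPlayer ? byEval.reversed() : byEval);
    }
}
```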

Implementing the deepening tables put me on the scent of another major optimization, the transposition table. A transposition table stores values for previously-visited states, along with the depth each state was previously searched to. (If we’re searching six plies deep and previously encountered a state at depth four, that state was only searched to a depth of two. If we encounter the same state at depth two, then if we use the cached value, we’ve left two plies of depth on the table.) Hits in the transposition table early on in the search save us expanding whole subtrees, which yields many fewer node expansions required per search, and lets us reach deeper more easily.

The transposition table, since it persists through the entire game, must be limited in size by some mechanism. OpenTafl’s transposition table is an array of Entry objects, with a fixed size in megabytes. The array is indexed by positional hashes, modulo the size of the table. When indices collide, OpenTafl decides whether to replace the old value based on its age (older values are more readily discarded) and the depth to which it is searched (deeper searches are kept, if both are equally as fresh).
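The replacement policy described above can be sketched like so (a simplified illustration of the scheme, with invented names; OpenTafl's actual entries pack more information):

```java
public class TranspositionTable {
    // Fixed-size array indexed by position hash modulo table size. On a
    // collision, prefer fresher entries; among equally fresh entries, keep
    // the one searched more deeply.
    static class Entry {
        final long hash; final int evaluation; final int searchDepth; final int age;
        Entry(long hash, int evaluation, int searchDepth, int age) {
            this.hash = hash; this.evaluation = evaluation;
            this.searchDepth = searchDepth; this.age = age;
        }
    }

    private final Entry[] entries;
    TranspositionTable(int size) { entries = new Entry[size]; }

    private int index(long hash) { return (int) Math.floorMod(hash, (long) entries.length); }

    void put(long hash, int evaluation, int searchDepth, int age) {
        int i = index(hash);
        Entry old = entries[i];
        boolean replace = old == null
            || age > old.age                                       // discard older entries first
            || (age == old.age && searchDepth >= old.searchDepth); // else keep the deeper search
        if (replace) entries[i] = new Entry(hash, evaluation, searchDepth, age);
    }

    Entry get(long hash) {
        Entry e = entries[index(hash)];
        return (e != null && e.hash == hash) ? e : null;  // guard against index collisions
    }
}
```

Note the full-hash check in `get`: two different positions can share a table index, so the stored hash has to be verified before the cached evaluation is trusted.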

That brings me to the end of another marathon tafl post. Stay tuned for another one—probably a much shorter one—in the not-too-distant future, as I wrap up work on this sprint of OpenTafl, with an eye toward the first public code and source release.

Some anti-patterns

Another college pal, a fellow traveler in the world of the computer sciences, wrote up this list for me. Call him Shenmage, and encourage him to set up the contributor’s account I made him here, the lazy bum. -Fish

As a software engineer, I tend to come across code bases of various levels of quality. To make light of some of the oddities we encountered, some of my co-workers and I started writing up the logical conclusions of the code structures. Each of these was encountered to some degree or another (though not necessarily as anything beyond a single method). Enjoy, and feel free to add your own in the comments section.

Walmart Pattern

  • Has everything you could ever want
  • You can’t find anything
  • Sometimes have to get multiple of something when you only want one

Sam’s Club Pattern

  • See Walmart Pattern
  • Can only get things in bulk (no single entities)

Titanic Pattern

  • Everything is well structured and coded to the dot
  • Only has a manual process for recovering the system (with potentially catastrophic consequences)

Power of Attorney Pattern

  • Pass SQL commands directly to a web service to get executed

Lottery Pattern

  • Retrievals are randomly generated
  • Only occasionally get what you want

Tardis Pattern

  • Alters history of an object
  • Does more actions than requested
  • Doesn’t do anything exactly as you request

Leaning Tower of Pisa Pattern

  • Perfectly structured but tightly coupled to outdated tech

Starbucks Pattern

  • More identical web services than you need
  • Expensive computationally

Monopoly Pattern

  • Multithreaded, but one thread eventually eats up all available resources

Glass House Pattern

  • Security on a system was completely ignored on critical components

Magic 8 Ball Pattern

  • Calls return only boolean values
  • ‘Maybe’ is included as a boolean value
  • Multiple ways to say each value

Speakeasy Pattern

  • Web service is completely undocumented, so you have to know precisely what to send where to use it

Narcissus Pattern

  • Class depends upon itself and uses itself to accomplish tasks

Lazy Inspector Pattern

  • Methods that should do tasks instead return true

Scorched Earth Pattern

  • Update method drops and recreates table

“What’s in the box?!” Pattern

  • Large, untyped, container objects are the only objects passed around the system

OCD ORM Pattern

  • Verifies all retrievals by retrieving again

On tafl: game tree search optimizations

As you’ll recall from the tafl piece about complexity, tafl has a high branching factor compared to, say, chess1. That means that a tafl game tree is broad: it has a lot of nodes relative to its depth. Since searching deep into a broad tree is difficult, getting good lookahead out of a tafl engine is going to be difficult. And, to be clear on where this post is going, it started off being extremely difficult.

Before I get there, though, I want to talk about the state of the art in chess AI as of now, or at least a few years ago. (Recent sources are probably easy to find if I were to look in the right places, but I forgot I still have a .edu email address with which I can probably sneak into a bunch of online article repositories.) Chess engines, today, on a moderately powerful desktop computer, are able to generate about four million game states per second. They also have highly advanced pruning2: not only the standard sort I’ll go into in a later post, but also a more qualitative sort, discarding moves that pass the plausibility test but fail the usefulness test.

Four million states per second is an extremely high bar to hit. Chess engines have been in development since the dawn of digital computing, just about, and fifty to sixty years of optimization really shows3. I told myself I’d be happy with somewhere between ten and twenty-five percent of chess’s state of the art.

How far did I have to go? Well, I coded up my state explorer, put a bit of benchmarking code in, and set it exploring three ply deep in brandub. It chugged along for a while, and then, almost forty seconds later, finished. The number? 1,000 states per second, which comes to between four hundred and one thousand times too slow.

Ouch.

I quickly discovered a few obvious things I could fix. First, I was doing an expensive operation (getting the allowable moves for a piece) for every piece on each side, to check if either side had no allowable moves remaining. I replaced that with a short-circuit check that returns true as soon as it discovers any allowable move for either side, which is a much quicker proposition. Second, I was checking for edge forts at the end of every turn, whether or not the game allowed for edge forts. I put a stop to that. Third, I was checking for edge forts in a vastly more expensive way than I needed to, to allow for a situation that does not occur in any known tafl variant4.

Confidently, I ran the benchmark again: 4,000 states per second.

Well, I’d been hoping for more than that.

I did a little reading, and decided some of my problem was with the programming language I’d selected. My language of choice for most of my side projects is a scripting language called Groovy. It’s Java’s little brother, sort of; it provides a bunch of shortcuts Java does not for displaying things to the terminal, handling user inputs, playing around with files, and iterating over sets of objects. I suspected from the start that Groovy might cause me headaches, and profiling suggested that was correct—my code was spending most of its execution time in Groovy’s internals. Rewriting it in Java was quick, as such things go, and with that squared away, and me optimistic as to my chances, I set up another benchmark.

15,000 states per second.

Three times faster is great, but I still had a few orders of magnitude to go.

I decided that the board representation I had started with was clever, but inefficient. My profiler suggested I was using literally tens of millions of objects: each space was a first-class object, in the first reckoning. This would not do; objects in Java may be small, and instantiating them may be fast, but there’s a limit to what a Java virtual machine can be expected to put up with. I removed the class of object which represented spaces, and set about rewriting the code to reference an array of small integers to represent a board.

Moving to an integer representation meant that I had an opportunity to come up with an encoding I can eventually use to rid myself of helper objects altogether. (As I write this, I’m still using a class of objects to represent taflmen—they’re inflated from the integer representation when needed, and compressed when the state is no longer active during a game tree search.) The first six bits of the integer are used for an ID number—each taflman-representation has a unique ID within its side. Six bits lets me define up to 64 taflmen per side, which is sufficient for every known variant, and leaves some room for busier alea evangelii variants, if desired. The next three bits define the piece’s type. I’m aware of five potential types of taflman: a standard taflman, a knight or a commander from berserker, a mercenary (a defending piece which switches sides when captured, seen in one proposed alea evangelii variant), and a king. Finally, the next bit defines whether the piece is a besieger or a defender.
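Packing and unpacking that encoding is a handful of shifts and masks. A sketch (the exact bit positions and constant names here are my assumptions for illustration; only the field widths come from the description above):

```java
public class TaflmanCodec {
    // Pack a taflman into a 16-bit char: the low six bits are the ID, the
    // next three the piece type, the next the side.
    static final int TYPE_TAFLMAN = 0, TYPE_COMMANDER = 1, TYPE_KNIGHT = 2,
                     TYPE_MERCENARY = 3, TYPE_KING = 4;
    static final int SIDE_DEFENDER = 0, SIDE_BESIEGER = 1;

    static char encode(int id, int type, int side) {
        return (char) ((id & 0x3F) | ((type & 0x7) << 6) | ((side & 0x1) << 9));
    }
    static int id(char taflman)   { return taflman & 0x3F; }        // bits 0-5
    static int type(char taflman) { return (taflman >> 6) & 0x7; }  // bits 6-8
    static int side(char taflman) { return (taflman >> 9) & 0x1; }  // bit 9
}
```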

I was a little put out to realize that my efforts did not actually buy me very much—I was still at about 15,000 states per second.

My next task was to observe the behavior of my code, and find out where it was performing poorly. An unexpected but obvious optimization presented itself. When you run a Java program, it defaults to using a small amount of memory. You can provide command line switches to allow it to use a much larger amount of memory. Doing that, I found I was up to 50,000 states per second—that’s getting somewhere.
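For the record, the switches in question look like this (the jar name is a placeholder, and the right sizes depend on your machine):

```shell
# -Xms sets the JVM's initial heap size, -Xmx its maximum heap size.
java -Xms512m -Xmx4g -jar OpenTafl.jar
```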

In fact, that’s what I’d call having gotten somewhere. 50,000 states per second is sufficient to play a game of brandub to a search depth of 3, without undue waiting. (It takes about 40,000 states explored to get to depth 3, and about 1.2 million to depth 4.) I whipped up a quick evaluation function (which I’ll describe in a subsequent post), hooked it up, and quickly found that it played only slightly more poorly than I do5.

With that heartening development, I also stumbled upon a happy fact. Java is, at first, an interpreted language, but it compiles sections of programs to faster, native code when it discovers that it runs into them a lot. After the computer took its first turn, I saw the benchmark report that it had explored 200,000 states per second. 200,000 states per second I can live with—my laptop is several years old and a laptop, and in testing, my desktop was about five times faster in an earlier version of the code. I intend to test it again soon, and see what a powerful machine can do with it.

I made one final optimization after that. Since the AI code used a thick representation of game states, it was extremely memory-hungry—brandub to depth 3 took over four gigabytes of memory, and other games were impossible altogether. I solved that one with two changes: first, I now represent the explored nodes of a tree as a list of moves required to reach each node from the root. Second, I trigger garbage collection manually.

Now, this may seem like a bad idea, to those of you with some Java development experience, but in this case, it’s called for. The default Java garbage collection scheme is two-leveled: objects just created go in a group called the Eden space. The Eden space contains objects whose longevity is not known, and which will therefore likely be quickly removed. If an object survives garbage collection in the Eden space, it moves into the Survivor space, where middle-longevity objects live. The Survivor space is touched less frequently by garbage collection. Finally, if an object lives in the Survivor space through a couple of garbage collections, it moves into the Tenured space, where it is troubled by garbage collections least frequently of all.

A lot of my objects end up in the Tenured generation—the objects which represent boards, particularly. They’re long-lived and durable, while almost every other object in the code comes and goes somewhat quickly. That means that Board objects, which are rather large, and which also contain references to other, shorter-lived objects, stay in memory more often than I’d like, and keep their sub-objects in memory, too. Triggering a garbage collection manually garbage collects everything, regardless of the space in which it lives, so the Board objects are cleaned up, freeing space. It does slow the code down somewhat: I get about 120,000 states per second on my laptop now, under ideal conditions.

Like I said, I can live with that. It’s not where I want it to be, but it’ll do for the time being. Coming up in the not-too-distant future: writing on the evaluation function, although I have some research to do before I’m done with that.

1. Go starts out branchier than 11×11 tafl variants, obviously, since you can play a stone at any point on the board. Thinking in terms of smart opening moves, I suspect the order still goes go, 11×11 tafl, chess. Large (that is, 19×19) tafl variants probably branch more than go at almost every stage of the game; the board is the same size.
2. That is, trimming unproductive branches from the game tree before expending the effort to explore them.
3. Here’s one for you. Computer memory is organized into bytes: eight slots to hold either 0 or 1 (called ‘bits’). Some data types are eight bytes long, or 64 bits. The chess board is 64 spaces, so someone had the idea to represent chess boards in a single eight-byte long integer. You need a few of these bitboards to represent a whole chess game state, obviously: minimally, two to represent which side controls each space, and six to represent which kind of piece occupies each space. Even so, it’s very compact, and bitwise operations on 64 bits at a time are blazingly quick on today’s 64-bit CPUs.
Unfortunately, this does not generalize well to tafl; most of the boards we concern ourselves with are larger than 64 spaces.
4. Namely, allowing black, using Berserker-style pieces, to jump into a Copenhagen edge fort to break it.
5. This says a lot more about me than about the AI.

On tafl: AI basics

My last tafl post was deeply theoretical. Very shortly, this one is going to take a turn for the deeply practical. First off, though, there’s some theory housekeeping I need to handle. Because the Copenhagen and Fetlar variants are both difficult AI problems, and because Copenhagen has some additional rules which are computationally hard to check, I decided to start with brandub, a much simpler tafl variant. Brandub comes from Ireland, and is played on a 7×7 board. The king’s side has five pieces, and the besieging side has eight, arranged in a cross. The king participates in captures, and is captured by two besieging pieces instead of four. He escapes to the corners. Any piece may cross the center square, but only the king stops on it.

Computationally, brandub is in a much easier class than the 11×11 variants I’ve discussed. The upper bound on its state space complexity is a mere 4.4 × 10^15, using the estimation technique I described in the previous post; for comparison, checkers1 is on the order of 10^20. Its game tree complexity is at least 1.3 × 10^36, which is also not overwhelmingly distant from checkers’ complexity (though harder—tafl’s state space complexity is frequently lower than comparable games, but its game tree complexity is frequently higher). This makes it a good place to start—for one, it means that it doesn’t matter how slow my code runs.

So, to kick things off, I started with what the business calls GOFAI—good old fashioned AI—techniques. The simple version is this: you play every possible move from your current position to a few moves out, and pick the best outcome. It ends up being a little more complex than that, though. If you’re not curious, you can skip the next few paragraphs.

The particular GOFAI technique I’m using right now is a straight implementation of the minimax algorithm. After generating a game tree from the current state to a given search depth, the algorithm evaluates the tree’s leaf nodes. An evaluation is a single number, describing whether the state is good for the besieging side (nearer positive infinity) or good for the defending side (nearer negative infinity). The details aren’t important just yet, although I’ll get into them later. Anyway, the algorithm is predicated on the assumption that both players are perfect—they’ll always make the best move they can make, given the situation. It follows, therefore, that the best move at a certain depth is the one which leads to the best state for the player to move: the most negative for the defender, the most positive for the besieger.

As an example, with the besieger to move and a search depth of three, there are four levels to the tree2:

1. A position with the besieger to move.
2. All of the positions, defender to move, resulting from every one of the besieger’s possible moves.
3. All of the positions, besieger to move, resulting from the defender’s possible responses to those moves.
4. All of the positions, defender to move, resulting from the besieger’s responses to the defender’s responses.

The algorithm starts at the tree’s leaves, the nodes at the fourth level above, and runs the evaluation function. It then moves up a level. The besieger is to move, and each of his possible moves leads to a child state with an evaluation attached. The besieger picks the move which yields the best outcome for him, which is also the worst outcome for the defender. (Although the tafl board is asymmetric, the game, in the theoretical sense, is not—a good state for the defending side is a bad state for the besieging side, and vice versa.) The value of each node at the third level becomes the value of its best child (for the besieger). The algorithm then moves up to the second level. The defender is to move, and he has a set of moves and child states with evaluations attached, as we just derived. He picks the best outcome for himself, which is also the worst outcome for the besieger. The value of each node at the second level becomes the value of its best child (for the defender). The algorithm then moves up a level, and, at the first level, the besieger has a list of moves and child states with evaluations attached to them. He chooses the best move out of his options, and that’s the move he makes.
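Sketched in code, the whole backup procedure is only a few lines. This is a Python illustration, not OpenTafl's actual implementation, and the state interface (moves(), apply(), is_terminal(), evaluate()) is my own invention for the sketch:

```python
def minimax(state, depth, besieger_to_move):
    """Plain minimax: evaluate `state` by searching `depth` plies ahead.

    The besieger maximizes, the defender minimizes. `state` is assumed
    to offer moves(), apply(move), is_terminal(), and evaluate(); these
    names are illustrative only.
    """
    if depth == 0 or state.is_terminal():
        return state.evaluate()
    children = (minimax(state.apply(m), depth - 1, not besieger_to_move)
                for m in state.moves())
    return max(children) if besieger_to_move else min(children)

class TreeState:
    """A toy state for demonstration: a nested list whose leaves are
    evaluations, standing in for a real game position."""
    def __init__(self, tree):
        self.tree = tree
    def is_terminal(self):
        return not isinstance(self.tree, list)
    def evaluate(self):
        return self.tree
    def moves(self):
        return range(len(self.tree))
    def apply(self, move):
        return TreeState(self.tree[move])
```

With the besieger to move on the toy tree [[3, 5], [2, 9]], the defender answers each besieger move with the smaller reply, so the besieger's best line is worth max(min(3, 5), min(2, 9)) = 3—exactly the leaf-to-root backup described above.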

As you might have guessed, if you had a computer of infinite power, you could just explore the game tree until you found a win, loss, or draw for every sequence of moves from your starting state. In fact, we call games for which this is possible3 solved. The evaluation function for an infinitely powerful computer is extremely simple:

1. Has the besieger won in this state? Return positive infinity.
2. Has the defender won in this state? Return negative infinity.
3. Has neither of the above happened? Return zero.
4. Ignore draws for the purposes of this example, or declare them a win for one side, or something.

The algorithm will explore until every branch reaches an end state, then work its way up through the tree. Since it assumes perfect play, it will provide a sequence of moves which yields a victory for one side from the current position, and there’s nothing the other side can do about it—the whole game becomes inevitable. Perhaps fortunately, this is extremely hard for most games; tic-tac-toe is somewhat less thrilling now that it’s solved.
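The three-case evaluation above is trivial to write down. In this sketch, besieger_won and defender_won are hypothetical stand-ins for whatever terminal checks a real engine provides:

```python
import math

def perfect_evaluation(state):
    """The infinitely-powerful computer's evaluation: only finished
    games score. `besieger_won`/`defender_won` are hypothetical
    predicates, not OpenTafl's real interface."""
    if state.besieger_won():
        return math.inf   # besieger victory
    if state.defender_won():
        return -math.inf  # defender victory
    return 0              # game still in progress (draws hand-waved away)
```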

It also reveals a weakness in, or at least a caveat with, the minimax algorithm: since we do not have computers of infinite power (in fact, my current implementation is able to look ahead only one ply further than the example I gave above), its playing ability depends either on achieving deep lookahead to compete with humans, who can look ahead much more selectively than computers can, or on having a good evaluation function. I haven’t really attained either goal yet.

I see that I’ve already written about a thousand words, and I haven’t even made it out of the first point on my supposed outline for this post. I suppose I’ll just have to write a couple of ’em. Next time I’ll get into the practical considerations in implementing an AI algorithm like this. Probably the time after that, I’ll write about evaluation functions, and perhaps after that I’ll write about some improved algorithms. The latter requires that I actually implement them, though, so we’ll see how the timing works out.

1. 8×8 draughts, if you aren’t an American.
2. The term of art is ‘plies’, singular ‘ply’—since a response to a move is called a reply, clearly a single move must just be a ply. (Math and computer people, am I right?)
3. Wikipedia has some interesting reading on what it actually means for a game to be solved—the definition I’ve given is what’s referred to as a ‘strong’ solution, where a computer has brute-forced the entire game tree. There are two other kinds of solved games: those solved ‘ultra-weakly’, which is to say, by abstract mathematical reasoning, and those solved ‘weakly’, by some algorithm which does not explore the entire tree.

On tafl: state space complexity

One of the most basic ways to describe the computational complexity of a game is its state space complexity—that is, the number of unique positions possible for a game—and that’s where I’ve started in my analysis of computer tafl. For now, and probably for every post I write on tafl and mathematics, I’m going to stick with Fetlar rules tafl or Copenhagen rules tafl1, on an 11×11 board.

To start off, we can make an extremely, extremely rough estimate. The board has 121 spaces, and each space can be empty, occupied by a besieger, occupied by a defender, or occupied by the king, so to get an absolute upper bound, we can just do this:

4^{121} \approx 7\times10^{72}

7 x 10^72 is surprisingly close to the rough estimate for chess (1.9 x 10^71; chess has thirteen states per space and 64 spaces). We know the actual state space complexity of chess to be in the vicinity of 10^47, however, so we can clearly do better. And, if you think about it, we’re not even trying all that hard. Our current estimate permits some flagrantly illegal positions, like a board entirely filled with kings, or the board with no taflmen.
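Both rough bounds are one-liners to check:

```python
# Rough upper bounds: states-per-space raised to the number of spaces.
tafl_bound = 4 ** 121    # empty, attacker, defender, or king on 121 squares
chess_bound = 13 ** 64   # six piece types per side, plus empty, on 64 squares

print(f"{tafl_bound:.1e}")   # about 7 x 10^72
print(f"{chess_bound:.1e}")  # about 1.9 x 10^71
```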

There are plenty of obvious constraints we can use to tighten up our guess. For one, any reachable tafl position has to have at least three taflmen: one side must have two taflmen, to have captured the second-to-last piece from the other side, and the other side must have at least one taflman left, or else the game is over. For another, there are only 37 taflmen to place.

An intermediate question presents itself: in how many different ways can three to 37 taflmen be placed on 121 spaces? Well, that’s a question for combinatorics. For any number n of taflmen, the number of unique arrangements of those taflmen into 121 spaces is:

\binom{121}{n}

And, since I’m a programmer, and loops come naturally to me…

\displaystyle\sum_{i=3}^{37} \binom{121}{i} \approx 3\times10^{31}
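Since the loop is already written mathematically, transcribing it into code to check the arithmetic is nearly mechanical:

```python
from math import comb

# Ways to place 3 to 37 indistinct taflmen on 121 squares.
arrangements = sum(comb(121, i) for i in range(3, 38))
print(f"{arrangements:.1e}")  # about 3 x 10^31
```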

So, we have 3 x 10^31 ways to arrange our taflmen onto our board. Going forward, I’ll multiply it by one quarter2, although it hardly matters at this scale3. This picture is also incomplete—it’s not an upper bound, because it doesn’t account for taflmen belonging to the besieging or defending sides. For every arrangement of n taflmen on the board, there are a bunch of distributions of the taflmen between the besieging and defending sides, and a bunch of options for the position of the king within the defending side.

So, since I am still a programmer, time to break out the nested sums. For every number of taflmen i between three and 37, consider the distribution of taflmen between the sides. For valid positions, let k be the number of pieces on the defending side, between one and i – 1. We need only consider one side. From a logic perspective, it’s obvious why: answering the question, “How many ways are there to assign k taflmen to the defending side?” contains in its answer the answer to the question, “How many ways are there to assign i – k taflmen to the besieging side?” Since the two sides share the same board, choosing the defenders’ taflmen also fixes the besiegers’. (There’s a mathematical argument in a footnote4, if you don’t buy this.)

Anyway, that insight gets us most of the way there. We have one more thing to account for: the position of the king. One of the defending taflmen must be the king for this position to be valid, and the king is distinct from every other defending taflman. Any one of the k defending taflmen could be the king: that is, there are always exactly k distinct arrangements of the defending taflmen for any distribution of taflmen between the sides, so we multiply the two numbers together.

\displaystyle\frac{1}{4}\sum_{i=3}^{37} \Bigg(\binom{121}{i} \times \bigg(\sum_{k=1}^{i-1} \binom{i}{k} \times k\bigg)\Bigg) \\[1em] \approx 1.4\times10^{43}

There we have it: the number of possible arrangements of taflmen on the board for every valid number of taflmen, multiplied by the number of ways to split a certain number i of taflmen between each side, is a much better bound on the number of possible positions. It’s still not perfect: mainly, this model permits some positions where pieces have illegally stopped on restricted spaces, and I’m probably missing some other things, too, but it’s a close enough bound to make an interesting point.
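Transcribed into code (with the one-quarter symmetry factor from earlier applied), the full expression confirms the order of magnitude:

```python
from math import comb

# i taflmen on 121 squares, k of them defenders, any of the k the king,
# quartered for the board's rotational symmetry.
total = sum(
    comb(121, i) * sum(comb(i, k) * k for k in range(1, i))
    for i in range(3, 38)
) // 4
print(f"{total:.1e}")  # about 1.4 x 10^43
```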

11×11 tafl games have an upper bound on state space complexity on the order of 10^43. Chess, as I mentioned earlier, has a state space complexity on the order of 10^47. This might seem like evidence against my theory that tafl games are more computationally complex than chess, but I’m still convinced. Ultimately, state space complexity is a poor measurement of computational difficulty. Consider backgammon: its state space complexity is on the order of 10^20, but backgammon AI is a more difficult problem than chess AI.

There is another metric to introduce: that of game tree complexity. State space complexity measures the number of possible positions for a given game. Game tree complexity measures, in a sense, the number of ways to get there. Consider a tree of every possible game of tafl. You start, obviously, with the starting position, then create a child node for every possible move by the first player. At each of those nodes, create child nodes for every possible response by the other player, and so on until every possible combination of moves has resulted in a game-ending state. (We’ll assume there are no ways for games to go on forever.) The number of leaves at the bottom of the tree is the game tree complexity.

This number, Wikipedia remarks in its typical laconic style, is “hard even to estimate.” This is true. Wikipedia also remarks that a reasonable lower bound can be established by raising the branching factor, the average number of moves possible at any given step, to the power of the game length in plies (a ply being a single move by one player). This is where things start to get interesting.

The accepted figure for chess’s branching factor is 35. From the starting position, each player can make 20 distinct moves. (Pawns forward one or two squares, or knights to the inside or outside.) Chess games last, on average, about 40 turns, or 80 plies. The game tree complexity of chess is at least 10^123. Backgammon’s branching factor is 250, and its average game length is about 60 plies, so its game tree complexity is at least 10^144.

I don’t have a figure for tafl’s branching factor. Aage Nielsen’s tafl site has a tournament running right now, using the Copenhagen tafl rules. Once it’s finished, I’ll look at some positions from the midgame and the endgame and try to work out a good average. In the interests of providing a number, I tweaked OpenTafl’s debug script to print out a count of the number of possible moves on the opening turn. The besiegers can make 116 moves, and the defenders can make 60, for an average of 88. High-level tafl games seem to go on for about 40 turns or 80 plies, from my observation so far. This yields a lower bound of about 10^155, which rather blows chess out of the water.
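These lower bounds are just the branching factor raised to the game length, so they are easy to sanity-check by taking base-ten logarithms:

```python
from math import log10

def tree_lower_bound_log10(branching, plies):
    """log10 of the branching^plies lower bound on game tree complexity."""
    return plies * log10(branching)

print(round(tree_lower_bound_log10(35, 80), 1))   # chess: 123.5
print(round(tree_lower_bound_log10(250, 60), 1))  # backgammon: 143.9
print(round(tree_lower_bound_log10(88, 80), 1))   # 11x11 tafl: 155.6
```

The usual citations round the first two to 10^123 and 10^144; the tafl figure lands between 10^155 and 10^156.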

Let me try to contextualize that number. Relatively simple 11×11 tafl games are, as far as I can tell, more computationally difficult than any historical abstract strategy game besides go. Not only that, but larger tafl games (15×15 or 19×19 variants) seem likely to me to be more difficult than go.

If you’re not well-read on the subject of board game artificial intelligences, you may have missed what a crazy thought that is. Go is the benchmark for hard AI problems. Computers were playing backgammon at world-champion level by the end of the 1970s, and chess at grandmaster level a decade later. Computers still can’t beat professional go players in even games, and the hard tafl variants are probably harder than that.

This is a seriously cool game.

1. Here are rules for the Fetlar and Copenhagen variants. If you don’t care to read them, here’s the summary: they’re more or less identical for my purposes here, played on an 11×11 board with 12 defenders, one king, and 24 attackers.
2. Think about it: any arrangement of taflmen on the board is functionally identical if you were to rotate the board 90 degrees beneath them in either direction, or 180 degrees.
3. Remember, with scientific notation, 8 x 10^5 divided by four is 2 x 10^5. Or, indeed, 8 x 10^0 divided by four is 2 x 10^0.
4. Let i be the number of taflmen on the board, k be the number of defending taflmen, and i – k be the number of besieging taflmen. The number of distributions of taflmen for the defenders is:

\binom{i}{k}

Having chosen those, k taflmen are already accounted for—the besieging side can’t choose them, because they’ve been chosen as defending taflmen. We do not choose from i; we choose besieging taflmen from i – k, which yields:

\binom{i - k}{i - k}

Which is one, and can be simplified out.

On tafl

While I attempt to dream up suitable procurement challenges for Parvusimperator, who is much harder to challenge than I am, I have some side projects in flight. One of those is a computer implementation of tafl.

Tafl is an Old Norse board game. The name means ‘table’, and the game and its variants are also known by some others: hnefatafl (King’s Table, to my best guess), brandubh, tablut, alea evangelii, ard ri, and likely some I’m forgetting. It is mainly notable for being the best-developed asymmetric abstract strategy game of which I am aware. Common across all the variants is the central goal: one side, the king’s side, starts in the center of the board. Its goal is to get its king to escape. The other side, the besieging side, plays to prevent the king from escaping. The king escapes on edge spaces in some variants, and on corner spaces in others.

Taflmen are moved like the rook in chess, orthogonally, any number of spaces. Captures are made by surrounding opposing taflmen on two sides, though some variants require that the king be surrounded by four opposing taflmen to be captured, and others do not allow the king to take part in captures.
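The basic two-sided capture rule sketches out very simply. This is my own illustration, under the plainest rule set—every piece, king included, is captured by two opponents—and it ignores hostile squares, four-sided king capture, and the shieldwall-style variants below:

```python
def custodial_captures(board, dest, friendly):
    """Squares captured when a piece of `friendly`'s side arrives at `dest`.

    `board` maps (row, col) -> side id. A piece is captured when it is
    sandwiched orthogonally between the arriving piece and another
    friendly piece. Sketch only: no special squares or king rules.
    """
    captured = []
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        victim = (dest[0] + dr, dest[1] + dc)
        beyond = (dest[0] + 2 * dr, dest[1] + 2 * dc)
        if board.get(victim) not in (None, friendly) and board.get(beyond) == friendly:
            captured.append(victim)
    return captured
```

So a besieger landing at (3, 3), with a defender at (3, 4) and a friendly besieger at (3, 5), captures the defender; with no friendly piece on the far side, nothing happens.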

Some modern variants introduce innovations aimed at a more balanced game, or one less likely to stalemate: pieces in a line on a board edge can be captured by lining up your own pieces opposite them and surrounding them to the sides, or the king can escape by means of a similar formation, if he is able to move inside the surrounding formation. Other modern variants go even further from the original rules, introducing pieces which can jump to move or jump to capture, and the ‘berserker rule’: a piece which makes a capture can move repeatedly, so long as each subsequent move also makes a capture. (That one is going to be a huge pain to implement. Just saying.)

Anyway, besides its obscurity and its Viking flavor, I find two other things to like about tafl games: first, the unchartedness of the territory. Tafl is still an enthusiast community, and although it seems to be growing, research into the game is still in its infancy. It’s a chance to break some new ground in human understanding, a compelling reason to work on it.

The second reason is the probable computational complexity of tafl games. Though the rules are simpler than those of chess, it’s my suspicion that, in terms of raw possible games and average branching factor, tafl is a harder game to model than chess. Consider: concretely, taflmen move as freely as the second tier of chess pieces (the rooks and bishops), and the tafl board is bigger (attested variants range from 7×7 to 17×17 or 19×19, depending on your interpretation of the rules). Qualitatively, it feels to me like tafl is a busier game: the capture rules mean that it’s more difficult to make captures while keeping strong positions, and the escape/surround duality in objectives means that material advantage is, to a certain point, less important. Pinning a piece in place so that a piece behind it can’t deliver an attack is fundamental to chess tactics, but two taflmen in the same vicinity build upon each other, and their interchangeability means that sacrificing one taflman so that another can move into a better position requires much less care than chess tactics do.

I suppose, speaking in the most general terms, chess between evenly-matched players is a game of materiel before a game of position: gains in material advantage are easily parlayed into gains in positional advantage, because it’s easier than in tafl to fix an opponent’s powerful pieces in place. Tafl between evenly-matched players is a game of position before materiel. It isn’t uncommon to see high-level tafl players decline to take ‘freebie captures’—when the opponent places a piece into a position where it can be captured without retaliation—because a small material gain is not worth losing a turn in the race for position elsewhere on the board.

Anyway, that’s all I have for today. As I get OpenTafl more ready for a release, I intend to go into more specifics about its variations, its strategy, and my implementation of its more curious features and work toward a reasonable AI. I’ll see you then. (Or probably before then, when I tell you which things I’ve chosen for Parvusimperator’s unseemly gauntlet-throwing.)