Tag Archives: gaming

Fishbreath Plays: SimplePlanes

I’m a fan of sandboxes.

Many of my favorite games are sandboxes, or have a sandbox element: StarMade is altogether a sandbox, Rule the Waves gives you plenty of latitude to build your own navy and your own history, Falcon 4 lets you play with someone else’s castle or kick it down as you please, and Command Ops, though less open than the rest of this list, still gives you the chance to do largely as you please inside whatever scenarios you can find or make.

So, when I saw that SimplePlanes, an aeronautics sandbox by Jundroo, the studio behind one of my favorite physics puzzle games on Android, was now on Steam, I had to give it a whirl. We’ll get the bad out of the way first: it’s a port of a mobile game, and so the interface is not as powerful as, say, Kerbal Space Program’s (which is the natural comparison), and the parts list isn’t quite as lengthy as I’d like. There also isn’t a lot to do; as far as I can tell, there isn’t a way to add new maps or new challenges, which is a shame, since either one would add a ton of longevity to the game. Finally, the combat bits could be expanded upon a little—damage is very binary right now, and hitting a plane with anything will usually pop it. That said, the flight modeling is excellent for a wee building game like this, and as with any building game, there are some superb examples out there.

With that out of the way, let’s talk about the good. I’m going to do this by discussing some of the things I have built; namely, the aircraft carried by the zeppelin Inconstant, from Skypirates: the Kestrel, Falcon, Vulture, Albatross, and Gorcrow. All are based on real-world designs. The Kestrel is a riff on the XP-55 Ascender, the Falcon is based on any number of (generally French) twin-boom pusher designs of the immediate prewar and postwar periods, the Vulture is a recreation of the Sh-Tandem, a Russian ground-attack design, the Albatross is a Blohm & Voss-inspired asymmetric design, and the Gorcrow is more or less every medium bomber between 1930 and 1945. (Note that I made a few modifications to fit my zeppelin-borne aircraft requirements and restrictions, which you’ll find at the end of this post.)

The Kestrel is one of my favorites, owing to its handling characteristics. The twin coaxial engines, with a total of 1,500 horsepower for only 6,000 pounds full load, push it to speeds in excess of 400 miles per hour. It fields an excellent anti-aircraft punch, and has superb maneuverability at high speeds. Its weakness comes mainly in its low-speed handling: its vertical stabilizers are small, to limit the drag they add, but this creates a pronounced yaw instability at landing speeds. As such, it’s a design that’s likely very prone to landing mishaps, and requires a steady hand on the stick and active feet on the pedals to put onto the skyhook. Though the design is unusual, it flies very well, responding smoothly with little adverse yaw or other undesirable handling characteristics. At the edges of its envelope, it can sometimes get the pilot into trouble; unrecoverable flat spins are a possibility.

In design, the Falcon is much more conservative: it treads on no unusual aeronautical ground. The twin-boom design provides some added damage resistance; losing the back of one boom isn’t immediately fatal. It’s powered by a 1,250-horsepower engine, about the largest single engine we can expect to find in the world of Skypirates, and has a maximum takeoff weight of about 9,000 pounds. (The version posted is overweight, and needs to be slimmed down.) With a rather lower power-to-weight ratio, it only reaches about 320 miles per hour, significantly slower than the Kestrel. Although its gun armament is less heavy than the Kestrel’s, it makes up for that loss in firepower by mounting several racks for air-to-air and air-to-ground rockets. Its flight characteristics befit its character: rugged and dependable, with very few surprises, although it does have a tendency to stall the lower wing in tight, low-speed turns.

The Vulture is probably the one whose looks most closely match its intended purpose. A light bomber and ground-attack plane, the Vulture is the usual aircraft of choice when Inconstant needs to make a show of force. Its unusual design gives it a great deal of lift for its footprint, and permits all of its hardpoints to be placed along the same axis as its center of mass: dropping weapons doesn’t change its balance at all, making it a forgiving platform when carrying large weapons. The centerline mount supports an aerial torpedo, but only when the plane is air-launched—aerial torpedoes are too long otherwise. (Note that Inconstant doesn’t carry Vultures equipped with landing gear.) To my surprise, the Vulture’s handling is docile in the extreme, even when fully loaded, and turns downright peppy when empty, even though it only sports a 1,000-horsepower engine. I ran into no surprises anywhere in the envelope.

The Gorcrow, powered by a pair of 700-horsepower engines, is a conventional medium bomber, with all that implies. Its handling is ponderous, but it can sling a heavy load of bombs or rockets, or three aerial torpedoes, making it Inconstant’s heaviest hitter by far. Three gun positions, one at the back of each engine nacelle, and one atop the fuselage, round out its weapon fit. Again, an unsurprising performer—not sprightly, and predictable in its handling. Unlike the other aircraft on the list so far, its bringback weight is somewhat less than its empty weight plus a full load of fuel. Inconstant being fairly light on avgas stores, her Gorcrows are generally only launched when absolutely necessary, to avoid having to dump fuel overboard before landing. The in-universe version has a glazed nose, but I haven’t figured that out yet.

The Albatross, powered by two 800-horsepower engines, is a long-range transport aircraft, and also one of my favorites for its sheer unlikeliness. Although Herren Blohm und Voss built similar aircraft for the Luftwaffe during the Second World War, I was a little concerned that the flight engine wouldn’t handle it well, given the presumably-complicated aerodynamics at play. To my surprise, it worked fine, and isn’t even particularly touchy. Anyway, the 1,600 combined horsepower pushes her to a good turn of speed when empty, nearly as fast as the Falcon, and supports a total cargo capacity of just over four tons. The asymmetry does mean she has some slight balance concerns, but in-universe, it’s easily trimmable. Low-speed handling is good, thanks to the fat wings. Even with the asymmetric nature of the pitching and yawing forces, owing to the offset position of the empennage, it has surprising maneuverability when empty. Same remark about the glazed nose.

Now, I didn’t even get into the built-in challenges, or into serious modding. I was just messing around, and in the course of learning how to build airplanes, building these, and coming up with my flight reports, I got more than my $10 of fun. I also got at least $10 of storytelling value out of it: I now have a better sense of each of these planes’ quirks and flight characteristics than I did before, and I can work that into stories.

If you’re looking for a plane construction sandbox, look no further.

Fishbreath’s Zeppelin-Borne Aircraft Construction Rules for SimplePlanes

  1. Airframes should range between about 3 tons and 12.5 tons full load.
  2. Aircraft must be shorter than 70 feet and have a wingspan less than 110 feet.
  3. No single engine may develop more than 1,250 horsepower.
  4. Aircraft must have a bringback-weight stall speed of 110 mph or less. (The other 20-30 mph needed to get down to zeppelin speed is assumed to come from flaps.)

On tafl: what next?

In my last tafl post, I remarked that I’m nearing the end of my current OpenTafl sprint. Now, it’s wrapped up. Here’s what I finished:

  1. Write about 70% of a new evaluation function. It’s designed to easily accommodate tweaks and modifications, which will undoubtedly be required as I learn more about the game and uncover positions that the current function evaluates incorrectly. (There’s a sketch of what I mean just after this list.)
  2. Move OpenTafl into its own repository, so I can publish the source.
  3. Package and test OpenTafl for binary distribution.
  4. Rewrite exploration to generate and sort moves prior to generating states.
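
Here’s the sort of shape I have in mind for ‘easily accommodate tweaks’, as a purely hypothetical sketch; the names, weights, and terms below are mine, not OpenTafl’s:

// A tweak-friendly evaluation: every term's weight lives in one place,
// so adjusting it as bad evaluations turn up is a one-line change.
public class EvaluationSketch {
    static final int MATERIAL_WEIGHT = 100;
    static final int KING_FREEDOM_WEIGHT = 40;

    // Positive favors the besieger, negative the defender.
    static int evaluate(char[] board) {
        return MATERIAL_WEIGHT * materialBalance(board)
                + KING_FREEDOM_WEIGHT * kingFreedom(board);
    }

    static int materialBalance(char[] board) {
        int balance = 0;
        for (char taflman : board) {
            if (taflman == 0) continue;                // empty space
            balance += isBesieger(taflman) ? 1 : -1;
        }
        return balance;
    }

    // Stub: would count the king's open ranks and files.
    static int kingFreedom(char[] board) { return 0; }

    // Assumes the side bit sits at bit 9 of the packed taflman format
    // from the optimization posts below; illustrative only.
    static boolean isBesieger(char taflman) { return ((taflman >> 9) & 1) == 1; }
}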

All four of these are done, and OpenTafl downloads and source releases can be found at the Many Words Softworks site for OpenTafl. Play and enjoy. For questions, comments, or bug reports, please leave a comment, or drop me an email. (My contact information is in the About section at the main page.)

Here’s what’s on the list for the next sprint:

  1. Support for easy position entry. This will require creating some sort of short-form positional notation, akin to FEN from the chess world. (There’s an illustrative example after this list.)
  2. External AI support, à la several chess engines. This will require defining a protocol by which OpenTafl can communicate with an external AI, including communicating things like the rules of the variant in question.
  3. Partial time control, so that OpenTafl can be limited to a certain number of seconds per turn.
  4. Some search improvements and tweaks. The prior three features are important stepping stones to this one; they’ll let me test OpenTafl against itself by quick-play tournaments, and therefore verify what sort of impact my changes are having.
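
To give item 1 some flavor: a FEN-style record might read rank by rank, with t for a besieging taflman, T for a defender, K for the king, and digits for runs of empty spaces. The brandub starting position, in a notation like that, could look like the line below. This is purely illustrative, not necessarily what OpenTafl will adopt:

/3t3/3t3/3T3/ttTKTtt/3T3/3t3/3t3/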

As you can see, the next sprint is less about the nitty-gritty AI research and more about foundational work in developing OpenTafl as an engine for future AI research—my own and others. Ultimately, that’s a more important task.

On tafl: more optimizations

Having spent quite a lot of time on another code project, it’s high time I got back to a little tafl.

Remember where we left off: board states represented as arrays of small integers, and taflman objects inflated as needed during game tree search. It turns out that inflating and deflating taflmen is also a potential source of slowness, so I went ahead and removed that code from the game. (It also greatly simplifies the process of moving from turn to turn.) Wherever I would pass around the old taflman object, I now pass around a character data type, the sixteen bits I described in the previous tafl post.

This revealed two further slowdowns I had to tackle. The first, and most obvious, was the algorithm for detecting encirclements. For some time, I’ve known I would need to fix it, but I didn’t have any ideas until a few weeks ago. Let’s start with the old way:


for each potentially-surrounded taflman:
  get all possible destinations for taflman
  for each destination:
    if destination is edge, return true
return false

Simple, but slow. In the worst case, each taflman has to consider every other space on the board as a destination, and checking each destination means glancing at every other space on its rank and file (where n is the number of taflmen and b is the board dimension):

O(n × b² × 2b) = O(nb³)

I was able to cut this down some by running an imperfect fast check: if, on any rank or file, the allegedly-surrounded side has the taflman nearest either edge, that side is not surrounded. That had implications for a better algorithm, and sure enough, I hit on one.


list spaces-to-explore := get-all-edge-spaces
list spaces-considered := []
for space in spaces-to-explore:
  add space to spaces-considered
  get neighbors for space not in spaces-to-explore and not in spaces-considered
  for each neighbor:
    if neighbor is occupied:
      if occupier belongs to surrounded-side:
        return false
      else continue
    else:
      add neighbor to spaces-to-explore
return true

More complicated to express in its entirety, but no more difficult to explain conceptually: flow in from the outside of the board. Taflmen belonging to the surrounding side stop the flow. If the flow reaches any taflman belonging to the allegedly surrounded side, then the allegedly surrounded side is, indeed, not surrounded. This evaluation runs in a much better amount of time:

O(b²)

Astute readers might note that, since the board dimension is never any larger than 19 at the worst, the difference between a square and a cube in the worst-case runtime isn’t all that vast. Remember, though, that we need to check to see whether every generated node is an encirclement victory, since an encirclement victory is ephemeral—the next move might change it. Running a slightly less efficient search several hundred thousand to several million times is not good for performance.
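
For reference, here’s the flow-in check as a minimal Java sketch, over a bare occupancy grid rather than OpenTafl’s real board type; the encoding (0 = empty, 1 = besieger, 2 = defender) is illustrative:

import java.util.ArrayDeque;
import java.util.Deque;

public class EncirclementSketch {
    static boolean isSurrounded(int[][] board) {
        int b = board.length;
        boolean[][] seen = new boolean[b][b];
        Deque<int[]> toExplore = new ArrayDeque<>();

        // Seed the flow with every edge space.
        for (int i = 0; i < b; i++) {
            if (!visit(board, seen, toExplore, 0, i)) return false;
            if (!visit(board, seen, toExplore, b - 1, i)) return false;
            if (!visit(board, seen, toExplore, i, 0)) return false;
            if (!visit(board, seen, toExplore, i, b - 1)) return false;
        }

        int[][] deltas = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!toExplore.isEmpty()) {
            int[] space = toExplore.pop();
            for (int[] d : deltas) {
                int r = space[0] + d[0], c = space[1] + d[1];
                if (r < 0 || r >= b || c < 0 || c >= b) continue;
                if (!visit(board, seen, toExplore, r, c)) return false;
            }
        }
        return true;  // the flow never reached a defender: surrounded
    }

    // Marks a space seen, and queues it for further flow if it's empty.
    // Returns false if the flow just reached a defending taflman.
    static boolean visit(int[][] board, boolean[][] seen, Deque<int[]> queue,
                         int r, int c) {
        if (seen[r][c]) return true;
        seen[r][c] = true;
        if (board[r][c] == 2) return false;             // defender: not surrounded
        if (board[r][c] == 0) queue.push(new int[]{r, c});
        return true;                                     // a besieger stops the flow
    }
}

Each space is visited at most once, which is where the O(b²) bound comes from.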

I found another pain point in the new taflman representation. The new, array-based board representation is the canonical representation of a state, so to find a taflman in that state, the most obvious move is to look through the whole board for a taflman matching the right description. Same deal as with encirclement detection—a small difference in absolute speed makes for a huge difference in overall time over (in this case) a few tens of millions of invocations. I fixed this with a hash table, which maps the 16-bit representation of a taflman to its location on the board. The position table is updated as taflmen move, or are captured, and copied from state to state so that it need be created only once. Hashing unique numbers (as taflmen representations are) is altogether trivial, and Java’s standard HashMap type provides a method for accessing not only the value for a given key, but also a set of keys and a set of values. The set of keys is particularly useful. Since it’s a canonical list of the taflmen present in the game, it saves a good bit of time against looping over the whole of the board when I need to filter the list of taflmen in some way.
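
A sketch of that position table, with illustrative names:

import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class PositionTableSketch {
    // Packed 16-bit taflman -> board index.
    private final Map<Character, Integer> positions = new HashMap<>();

    void place(char taflman, int index) { positions.put(taflman, index); }
    void capture(char taflman)          { positions.remove(taflman); }
    Integer locationOf(char taflman)    { return positions.get(taflman); }

    // The key set doubles as the canonical list of taflmen still in play,
    // which beats scanning the whole board when filtering taflmen.
    Set<Character> taflmen()            { return positions.keySet(); }
}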

Those are all small potatoes, though, next to some major algorithmic improvements. When we last left OpenTafl, the search algorithm was straight minimax: that is, the attacking player is the ‘max’ player, looking for board states with positive evaluations, and the defending player is the ‘min’ player, looking for the opposite. Minimax searches the entire game tree.

Let’s look at an example, a tree of depth 3 with max to play and a branching factor of 3, for 9 leaf nodes. The leaf nodes have values of 7, 8, and 9; 4, 5, and 6; and 12, 2, and 3. We can express the minimax perspective on this tree mathematically:

max(min(7, 8, 9), min(4, 5, 6), min(12, 2, 3))

That is to say, max makes the move which gives min the least good best response. Remember, though, that we search the tree depth-first, which means that in the formula above, we move from left to right. This leaves us with some room for cleverness.

Consider this:

max(min(7, 8, 9), min(4, x, y), min(12, 2, z))

Does it matter what the numbers at x, y, and z are? No: by that time, we’ve already established that the subtrees in which they appear (with best possible values 4 and 2) are worse than the first subtree (with known value 7). Therefore, we can stop exploring those subtrees. This is known as ‘pruning’.

Pruning is good, because, when done correctly, it saves us work without dropping good moves. Alpha-beta pruning, the kind described above, is known to produce identical results to minimax, and, in the ideal case, can search twice as deep. (So far, it’s only given me one extra ply, though; the case isn’t quite ideal enough for more.) Other kinds of pruning may find their way in later, and will take some more careful thinking to avoid pruning away potentially good branches.
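
For the curious, here’s alpha-beta as a minimal sketch; the State type and its methods are stand-ins for OpenTafl’s real machinery:

import java.util.List;

public class AlphaBetaSketch {
    interface State {
        boolean isTerminal();
        int evaluate();            // positive favors the besieger
        List<State> successors();
    }

    static int alphaBeta(State s, int depth, int alpha, int beta, boolean maxToMove) {
        if (depth == 0 || s.isTerminal()) return s.evaluate();
        if (maxToMove) {
            int best = Integer.MIN_VALUE;
            for (State child : s.successors()) {
                best = Math.max(best, alphaBeta(child, depth - 1, alpha, beta, false));
                alpha = Math.max(alpha, best);
                if (alpha >= beta) break;  // min has a better option elsewhere: prune
            }
            return best;
        } else {
            int best = Integer.MAX_VALUE;
            for (State child : s.successors()) {
                best = Math.min(best, alphaBeta(child, depth - 1, alpha, beta, true));
                beta = Math.min(beta, best);
                if (beta <= alpha) break;  // max has a better option elsewhere: prune
            }
            return best;
        }
    }
}

The initial call passes alpha = Integer.MIN_VALUE and beta = Integer.MAX_VALUE; if no cutoffs are ever taken, it degenerates into plain minimax.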

What is the ideal case, then? We want to explore the best moves first. A cursory examination of the example case above provides the reasoning. If we explore the subtrees in opposite order, we can do no pruning, and we get no speedup. This presents us with a chicken-and-egg problem: if we could easily and programmatically determine which moves are good, we wouldn’t need to bother with variants on full-tree search at all.

Fortunately, the structure of a tree gives us an easy out. Most of the nodes in any given tree are at the bottom. Consider a simple example, with a branching factor of two. At the first level, there’s one node; at the next, two nodes; at the third, four nodes; and at the fourth, eight nodes. In every case, there are more nodes at the deepest level than at all the levels above combined. The difference becomes more pronounced with greater branching factors: a tree with a branching factor of 30 (which approximates brandub) has a distribution of 1/30/900/27,000. So, it follows that the large majority of the work in a game tree is expanding the game states at the target search depth, and evaluating those states; the work higher up the tree is almost trivial.

Take that and add it to this claim: the best move at any given depth is likely to have been one of the best moves at the immediately prior depth, too. We can use those facts together to figure out the best ordering for our moves at our target depth: search to the depth prior, keep track of the best moves at that depth, and search them first. This approach is called ‘iterative deepening’: generally, you start at depth 1, and use results from previous levels to order each node’s children. OpenTafl accomplishes this with a structure I’m calling a deepening table, which is generated anew for each search. The deepening table is a list of maps, one per depth. The maps relate a position hash to an evaluation done at its depth. On each deepening iteration, the exploration function generates successor states, then sorts them according to the data stored in the deepening table before exploring further. (There’s some easy performance on the table here—since creating a full-on successor state is a non-trivial operation, I can probably buy myself some easy speedup by generating moves alone, sorting those based on the positional hashes which would result from making those moves, and only generating states immediately before exploring them. This would probably save a good bit of time by never generating states which would have otherwise been cut off by alpha-beta pruning.)
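
Here’s the deepening-table idea in sketch form; the State interface and every name below are stand-ins, not OpenTafl’s real types:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeepeningSketch {
    interface State {
        long positionHash();
        boolean isTerminal();
        int evaluate();            // positive favors the besieger
        List<State> successors();
    }

    // One map per depth: position hash -> evaluation recorded at that depth.
    private final List<Map<Long, Integer>> table = new ArrayList<>();

    int iterativeDeepening(State root, int targetDepth) {
        int value = 0;
        for (int d = 1; d <= targetDepth; d++) {
            while (table.size() <= d) table.add(new HashMap<>());
            value = search(root, d, 0, true);
        }
        return value;
    }

    private int search(State s, int remaining, int depth, boolean maxToMove) {
        if (remaining == 0 || s.isTerminal()) {
            int value = s.evaluate();
            table.get(depth).put(s.positionHash(), value);
            return value;
        }
        // Sort children by the evaluations recorded for their depth on the
        // previous, shallower pass; unseen children sort to the back.
        List<State> children = new ArrayList<>(s.successors());
        Map<Long, Integer> prior = table.get(depth + 1);
        children.sort(Comparator.comparingInt((State c) ->
                prior.getOrDefault(c.positionHash(),
                        maxToMove ? Integer.MIN_VALUE : Integer.MAX_VALUE)));
        if (maxToMove) Collections.reverse(children);  // best-first for max

        int best = maxToMove ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (State child : children) {
            int value = search(child, remaining - 1, depth + 1, !maxToMove);
            best = maxToMove ? Math.max(best, value) : Math.min(best, value);
        }
        table.get(depth).put(s.positionHash(), best);
        return best;
    }
}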

Implementing the deepening tables put me on the scent of another major optimization, the transposition table. A transposition table stores values for previously-visited states, along with the depth each state was previously searched to. (If we’re searching six plies deep and previously encountered a state at depth four, that state was only searched to a depth of two. If we encounter the same state at depth two, then if we use the cached value, we’ve left two plies of depth on the table.) Hits in the transposition table early on in the search save us expanding whole subtrees, which yields many fewer node expansions required per search, and lets us reach deeper more easily.

The transposition table, since it persists through the entire game, must be limited in size by some mechanism. OpenTafl’s transposition table is an array of Entry objects, with a fixed size in megabytes. The array is indexed by positional hashes, modulo the size of the table. When indices collide, OpenTafl decides whether to replace the old value based on its age (older values are more readily discarded) and the depth to which it is searched (deeper searches are kept, if both are equally as fresh).
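
In sketch form, with illustrative names and sizing:

public class TranspositionSketch {
    static class Entry {
        long hash;   // the full position hash, to detect index collisions
        int value;   // cached evaluation
        int depth;   // depth to which this position was searched
        int age;     // which search stored this entry
    }

    private final Entry[] entries;

    TranspositionSketch(int sizeInMb) {
        entries = new Entry[sizeInMb * 1024 * 1024 / 32];  // assume ~32 bytes per entry
    }

    void put(long hash, int value, int depth, int age) {
        int index = (int) Math.floorMod(hash, (long) entries.length);
        Entry old = entries[index];
        // Keep the old entry only if it's at least as fresh and searched deeper.
        if (old != null && old.age >= age && old.depth > depth) return;
        Entry e = new Entry();
        e.hash = hash; e.value = value; e.depth = depth; e.age = age;
        entries[index] = e;
    }

    Entry get(long hash) {
        Entry e = entries[(int) Math.floorMod(hash, (long) entries.length)];
        return (e != null && e.hash == hash) ? e : null;
    }
}

On a probe, the stored depth tells you whether the cached value covers enough plies to stand in for a fresh search of the subtree.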

That brings me to the end of another marathon tafl post. Stay tuned for another one—probably a much shorter one—in the not-too-distant future, as I wrap up work on this sprint of OpenTafl, with an eye toward the first public code and source release.

On tafl: practical difficulties

In the past, I’ve written about the problem of tafl AI mainly in terms of its theoretical challenges. Those challenges remain, but can be mitigated with clever algorithms. (That work is in progress.) The theoretical challenges, however, are only half of the issue—and, I’ve been thinking, they may not even be the big half.

Consider go AI. Go is widely regarded as computationally intractable for traditional AI techniques. As such, go AI fits one of two patterns: probabilistic and machine-learning approaches, and knowledge-based approaches. The former is interesting in its own right, but doesn’t bear on our problem. OpenTafl is not currently built with an eye toward probabilistic or machine-learning techniques1. Rather, we’ll look at the more traditional school of go AI. The knowledge-based approach takes expert knowledge about go and applies it to traditional AI techniques. Rather than exhaustively searching the game tree—something which becomes intractable very quickly, given the branching factor in go—a knowledge-based AI applies heuristics and rules of thumb at each board state to prune uninteresting branches, and reduce the branching factor sufficiently so that the pruned tree can be explored to a useful depth.

How does one get this knowledge? One plays go. Knowledge-based AIs are generally written by good go players who are also programmers. The player/programmers use their domain knowledge to design sets of guidelines for deep exploration which work well, and to decide which sorts of starting states each guideline can be applied to. Though chess AIs deal with a much more tractable tree-search problem, they also use some knowledge-based augmentations to increase depth around interesting series of moves. Chess knowledge is a little different than go knowledge, however; there are formal answers to a large number of chess questions. Piece values are well-known, as are the pawn-equivalent values of certain important board control measures like pawn structure and rooks on open files.

Both of the best-studied games in AI, then, use some domain knowledge to push the search horizon deeper, whether by encoding the sorts of heuristics a certain player uses, or by encoding formal knowledge about the game itself. Chess goes further by packing in databases of solved endgames and good openings, which increases the effective search depth at interesting parts of the game2. Let’s think about applying these tactics to tafl.

Can we work like we might in go, using expert player knowledge to inform an AI? Yes, technically, but there’s a huge obstacle: tafl games are a tiny niche in the global board game market. My rough estimate is that there are perhaps a few thousand serious players. I wouldn’t be very surprised if fewer than ten thousand people have ever played a full tafl game, whereas I’d be highly surprised if more than a hundred thousand have. Writing about tafl tactics is therefore in its infancy5. There’s very little in the way of reading material that an average player like me could use to develop more in-depth knowledge, and, as far as I know, there are no other tafl players with a computer science background who are working on the problem of tafl AI.

Can we work like we might in chess, then? The answer here is simple: a categorical no. The amount of formal knowledge about tafl games is even smaller than the amount of non-formal knowledge about them6. Endgame databases are infeasible7, and nobody has carried out a study of good openings for various taflman arrangements on boards of various sizes. Beyond that, we can’t formally answer questions about even the most elementary pieces of tafl strategy. In chess, I can say with some certainty that a rook is worth about five pawns. Given that favorable board position and pawn structure are only worth about a pawn or a pawn and a half, if I’m going to trade my rook for good board position, I should try to get at least a minor piece (a knight or bishop) and a pawn out of the bargain, too. Tafl is a much more positional game, which makes questions like this harder to pose, but an analogue might be, “Is it worth it to move my king from c3 to make a capture and set up another, or should I hold the good position?”

Here’s another question we don’t have a good answer to: what is the value, in terms of win percentage, of an armed king captured on four sides (a strong king) against an armed king captured on two (a weak king)? Certainly, we know that an armed king is an extremely powerful piece in material terms. The king’s side has a nearly 2-1 disadvantage in material, but in many common 11×11 variants with a strong armed king, he wins at about a 2-1 rate in games between expert players8. On the other hand, 9×9 Sea Battle, a variant with an unarmed strong king escaping to the edge and no hostile central square, is still biased about 2-1 in wins toward the king’s side. On the other other hand, 9×9 Tablut, the variant on which all other variants are based9, features an armed strong king with a corner escape and a hostile central square, and is approximately as evenly balanced as chess. We can’t even answer questions about a king’s worth, or about the shift in balance provided by various rules tweaks.

So, the formal approach, for now, is out, and we’re left with knowledge-based approaches. This presents me with a variant of the chicken-and-egg problem. I have one chicken (a tafl engine), but I lack a rooster (a strong AI) with which to produce a viable egg (answers to interesting questions). The task before me is this: use my mad scientist’s toolkit (my brain, my passing familiarity with tafl games) to cobble together some sort of Franken-rooster that gets my chicken to lay eggs, which will in turn produce less monstrous roosters.

Leaving aside tortured metaphors, it’s relatively easy to get an AI to a moderate level of play. All you have to do is apply some well-known algorithms. On the other hand, getting it past that moderate level of play does require knowledge, and the knowledge we have is not yet sufficient to do that. Hopefully my work on this topic will push us further down that road.

1. I’ve been writing OpenTafl with the intention that it be relatively easy to swap in different kinds of AI, both so that I can continue my research into tafl AI, and so that others can contribute. I suppose it may also find a place as a teaching tool, but that seems more unlikely.
2. Consider: at the start of the game, you can pick a common opening line, assume the other player will follow it to the end, and search from there, until the other player goes off-book. At the end of the game, you can search down to positions with a number of pieces—chess programs usually store a database of all endgames with five pieces, these days3—from which point you’re assured of perfect play. In either case, your effective search depth is your actual search depth plus the amount of information in the database.
3. Storage is the limiting factor. A five-piece endgame database in the widely-accepted, heavily-compressed standard format is about 7 gb. There’s a chess program out there called Shredder which manages to pack it into 150 mb, two percent of the standard format, by removing any information besides whether the position is eventually won or eventually lost, and playing along those lines4. A six-piece database takes 1.2 tb in the standard form, which would be 24 gb in the Shredder format. The seven-piece tables come to between 100 tb and 150 tb, the range I’ve seen across a few sources, which would take between two and three terabytes in the Shredder format. Not really the sort of thing you’ll find in Chessmaster 2016.
4. It seems to me that without a native move ordering, this method may take more moves to reach a winning position, but I suppose you can impose some order by adding a ‘how long until won or lost’ figure to the stored evaluation, without taking up much more space.
5. I’m aware of only one place with any real detail, Tim Millar’s site.
6. Which isn’t a value judgement, mind. Go knowledge is nonformal, and in many ways seems largely aesthetic, but does eventually lead to wins, which makes it valuable, if hard to use from a game theory perspective.
7. Calculating the chess endgames with 7 pieces on an 8×8 board took a significant amount of time on a supercomputer in Moscow in 2012. On a 9×9 or 11×11 board, a tafl endgame with 7 taflmen is trivial—games usually end well before the amount of material on the board gets down to the amounts where enumeration of endgames is possible.
8. I’m reminded of my introduction of tafl to my residence hall my senior year of college. The initial feeling was that the besieging side had a massive advantage, but as we developed as players, we realized the advantage was much more on the king’s side.
9. Linnaeus, on his expedition to Lapland, observed the game being played and recorded the rules, thus saving tafl games for the ages.

Fishbreath Plays: MechWarrior Online

The Battletech Kickstarter kicked me back into playing some MechWarrior Online. I ought to note that, on the whole, I like it. I don’t think it quite hits the heights that previous MechWarrior games hit, but it’s a perfectly serviceable stompy robots game in terms of gameplay. That isn’t what this post is about, though.

No, this post is a rant.

Let’s talk about the grind. Oh, the grind. That necessary evil in free-to-play games, right? Yes, but this one is particularly egregious. Fans of the game may try and tell you it’s less grindy than World of Tanks, say, or World of Warships. They are lying to you. Let us be perfectly clear. There is precisely one type of player for whom MechWarrior Online is less grindy than the Wargaming.net games: high-level competitive players. World of [insert vehicle] does indeed require more out of you than MechWarrior Online does if you want to experience what counts as the endgame. I would posit that these players, though, make up such a tiny proportion of the playerbase that their opinions are basically meaningless. Let’s get down to brass tacks with a comparison.

I’m a fairly casual player in World of Warships, and a fairly casual player in MechWarrior Online. As such, my goal in the former is to have a couple of historical Second World War vessels, which occupy (generally) the top half of the tech trees. I need to get to about tiers 4-6. I average about a thousand experience per game, or more if I’m playing primarily for first-win-of-the-day. To go from tier 5 to tier 6 costs about 30,000 experience, which comes to about 30 games. (Going from tier 1 to tier 5 is about that difficult too, if I recall correctly.) All told, it takes me about 60 games to reach a new tier-6 ship—certainly no more than 75 on a bad streak, and no more than 100 on a horrid streak. So far, I’ve never had to grind for money; in the course of getting through my 60-100 games to hit the next tier, I’ve made enough to buy my way up.

In MechWarrior Online, as a fairly casual player, my goal is to build up a stable of mechs of various sizes and roles I can switch between as the mood strikes me. There are two obstacles here. First, earnings: I make about 80,000-100,000 c-bills per match in MechWarrior Online. A single light mech chassis costs about 2,000,000 c-bills. (A little less for 20-ton mechs, a little more for 35-ton mechs, a whole lot more for mechs which can mount ECM—note that ECM capability usually makes a given mech variant the most desirable of its chassis.) A medium mech costs about 3,750,000 c-bills. You’ll spend between 5,000,000 and 6,000,000 for a heavy, and 7,500,000 to 10,000,000 for an assault.

You begin to see the magnitude of the problem. Buying a light, a medium, a heavy, and an assault chassis takes nearly 20 million c-bills and a shade under 200 games. Outfitting a mech can sometimes double that cost, especially if you don’t have a bunch of weapons and equipment sitting around your mech lab, if you’re a new player. We’re already up to about a 400-game grind to buy and outfit four individual mechs. That’s a big time sink.

It also isn’t the end of it. When you buy a mech, you unlock skill trees for that mech. Consider this: I earn about 800 xp per match. If I don’t sink about 30,000 experience into each variant, that variant is between 10% and 50% worse in a variety of extremely important performance measures (speed, heat capacity, turn rate, and more) than the same variant in the hands of someone who has done the grind. That’s a 40-game grind in each mech after you’ve acquired it. I’ll grant you, that comes out to less than you need to acquire the mechs (a mere 160 games), but that isn’t the whole story.

You see, the mech skill trees come in three tiers. To unlock each tier, you need to finish the tier beneath it… on three separate variants of a given chassis. So, you don’t need 400 games and 40,000,000 c-bills to buy and outfit four mechs. You need 120,000,000 c-bills and 1200 games to grind out the requisite twelve variants to avoid being a gigantic drag on your team. More than 400 of those games will be played in mechs that are arbitrarily handicapped in an attempt to get you to buy premium time.

So no, Wargaming.net’s games are not more grindy than MechWarrior Online for the average player. Basically, if you don’t get summers off, you’re going to have to spend money if you want to fill out your mech stable, and a lot of money, at that. I leave gawking at the prices as a trivial exercise for the reader.

Thirty Minutes Over Toseong

Come, let us reminisce.

It’s the mid-2000s, and war has broken out over the Korean Peninsula. I climb into my F-16C Block 52, get her powered up, and listen to the radio chatter as the inertial navigation system aligns. The time is just after noon, and the tower welcomes a few flights back. Good. The war only started this morning, but losses of planes and pilots both have been heavy.

The INS is aligned, and the Data Entry Display, the little green screen beneath and to the right of my HUD, informs me that my takeoff time is in five minutes. I radio the tower, and they clear me to taxi to runway 20R at our airbase, Seosan, a little ways southwest of Seoul. In between lining up another flight for landing on runway 20L, the tower gives my little two-ship clearance for takeoff.

We’re loaded light today, four AMRAAMs and two Sidewinders, so I take off at military power to save on fuel. We turn east-northeast and climb. We’re due at our rendezvous point in about ten minutes. A few miles behind us, the ground-attack flight we’re escorting slots in. I check in with Chalice 4, the AWACS flight covering our sector of the front, and he advises me of bandits to my west, heading this way. I dither for a moment: should I push west and deal with the bad guys, or stick close to my flock?

On the assumption that close escort is and always has been dumb, I turn west. I match the bullseye position on my radar MFD page to the position AWACS gave me, and sure enough, I see two contacts headed my way. I hand one off to my wingman, bug the other, get target confirmation from AWACS, and send the first AMRAAM on its way. It hits, and my wingman’s does too. AWACS calls picture clear, and I turn my flight back toward our charges.

Of course, this is a full-scale war. There’s no way our sector will keep quiet for a full half-hour. Sure enough, AWACS reports contacts about 60 nautical miles distant, inbound from the north. We have identifications: one or two MiG-23s in the middle of a big group of Il-28s. Neither is particularly fearsome, but the MiGs should go first on principle—the Ilyushins are ancient jet bombers, dating from the late 1940s or 1950s, while the MiGs have some 1970s-vintage missiles that might pose a threat to us if they get close enough.

This is where things start to get a little hairy.

It isn’t that engaging the MiGs and Ilyushins doesn’t work—it works just great. The problem is that we still have ten minutes left on station. AWACS calls out a close threat—so close they give us a bearing and range, instead of a bullseye call—and my wingman goes low to deal with the MiG-19 that apparently managed to get in underneath us. Having pushed a little further north than I wanted while prosecuting the bombers, I turn south to extend, and take stock of my situation. I’ve fired all four of my AMRAAMs, and I have a pair of Sidewinders left on the rails. My wingman splashes the MiG-19 down low, and I call him for his status. “Winchester,” he says, “fuel state 2000.”

Well, that’s no good. He’s out of everything, with just enough fuel to make it back to Seosan. I send him home, and call AWACS to let them know I’m effectively empty.

“Lobo 9-1, Chalice 4, your window of vulnerability is still open. Can you hold out?”

Well, crap. “Wilco,” I reply, on the assumption that an F-16 with no long-range missiles at angels 25 looks, on radar, to be very similar to an F-16 with long-range missiles at angels 25.

Then the RWR chirps in a way I’m too familiar with. The new threat is a MiG-29. I call AWACS again. “Nearest contact bullseye 050, 60 nautical miles.” I fiddle with the horizontal situation display MFD page, and verify that those are indeed the Fulcrums, sixty or seventy miles out. I ask if I can go home again, and they insist I try to hold out a little longer. There isn’t really much holding out to be done, though. I can’t take on a pair of Fulcrums on my own.

I can stick around a little longer, pretending to be armed, though. I have one eye on the clock, one eye on my RWR, and both ears listening to AWACS updates. It’s agonizing—the MiGs are closer to a missile shot every minute, and I don’t have the fuel to run away at afterburner for very long. Finally, the clock ticks past the edge of my station time. I call AWACS and tell them I’m out of Dodge, and I put the MiGs on my six and run for home.

So ends this tale of a rarity in gaming: a moment when I was not only happy to escape, but actually planning on running away.

The games I’ve enjoyed the most over the years have a common theme: they make war stories. Whether it’s the one time in PlanetSide where I used a Galaxy to scrape the enemy off of a hilltop position, or that time I was stuck on-station with MiGs bearing down on me, it’s the sort of immersion that resonates with me the best. That’s the best thing I can say in favor of Falcon 4 BMS.

On tafl: game tree search optimizations

As you’ll recall from the tafl piece about complexity, tafl has a high branching factor compared to, say, chess1. That means that a tafl game tree is broad: it has a lot of nodes relative to its depth. Since searching deep into a broad tree is difficult, getting good lookahead out of a tafl engine is going to be difficult. And, to be clear on where this post is going, it started off being extremely difficult.

Before I get there, though, I want to talk about the state of the art in chess AI as of now, or at least a few years ago. (Recent sources are probably easy to find if I were to look in the right places—I keep forgetting I still have a .edu email address with which I can probably sneak into a bunch of online article repositories.) Chess engines, today, on a moderately powerful desktop computer, are able to generate about four million game states per second. They also have highly advanced pruning2: not only the standard sort I’ll go into in a later post, but also a more qualitative sort, discarding moves that pass the plausibility test but fail the usefulness test.

Four million states per second is an extremely high bar to hit. Chess engines have been in development since the dawn of digital computing, just about, and fifty to sixty years of optimization really shows3. I told myself I’d be happy with somewhere between ten and twenty-five percent of chess’s state of the art.

How far did I have to go? Well, I coded up my state explorer, put a bit of benchmarking code in, and set it exploring three ply deep in brandub. It chugged along for a while, and then, almost forty seconds later, finished. The number? 1,000 states per second, which comes to between four hundred and one thousand times too slow.

Ouch.

I quickly discovered a few obvious things I could fix. First, I was doing an expensive operation (getting the allowable moves for a piece) for every piece on each side, to check if either side had no allowable moves remaining. I replaced that with a short-circuit check that returns true as soon as it discovers any allowable move for either side, which is a much quicker proposition. Second, I was checking for edge forts at the end of every turn, whether or not the game allowed for edge forts. I put a stop to that. Third, I was checking for edge forts in a vastly more expensive way than I needed to, to allow for a situation that does not occur in any known tafl variant4.

Confidently, I ran the benchmark again: 4,000 states per second.

Well, I’d been hoping for more than that.

I did a little reading, and decided some of my problem was with the programming language I’d selected. My language of choice for most of my side projects is a scripting language called Groovy. It’s Java’s little brother, sort of; it provides a bunch of shortcuts Java does not for displaying things to the terminal, handling user inputs, playing around with files, and iterating over sets of objects. I suspected from the start that Groovy might cause me headaches, and profiling suggested that was correct—my code was spending most of its execution time in Groovy’s internals. Rewriting it in Java was quick, as such things go, and with that squared away, and me optimistic as to my chances, I set up another benchmark.

15,000 states per second.

Three times faster is great, but I still had a few orders of magnitude to go.

I decided that the board representation I had started with was clever, but inefficient. My profiler suggested I was using literally tens of millions of objects: each space was a first-class object, in the first reckoning. This would not do; objects in Java may be small, and instantiating them may be fast, but there’s a limit to what a Java virtual machine can be expected to put up with. I removed the class of object which represented spaces, and set about rewriting the code to reference an array of small integers to represent a board.

Moving to an integer representation meant that I had an opportunity to come up with an encoding I can eventually use to rid myself of helper objects altogether. (As I write this, I’m still using a class of objects to represent taflmen—they’re inflated from the integer representation when needed, and compressed when the state is no longer active during a game tree search.) The first six bits of the integer are used for an ID number—each taflman-representation has a unique ID within its side. Six bits lets me define up to 64 taflmen per side, which is sufficient for every known variant, and leaves some room for busier alea evangelii variants, if desired. The next three bits define the piece’s type. I’m aware of five potential types of taflman: a standard taflman, a knight or a commander from berserker, a mercenary (a defending piece which switches sides when captured, seen in one proposed alea evangelii variant), and a king. Finally, the next bit defines whether the piece is a besieger or a defender.
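
Packing and unpacking look something like this; the exact bit positions are my reading of the scheme above, not necessarily OpenTafl’s layout:

public class TaflmanBits {
    // Type values are illustrative.
    static final int TYPE_TAFLMAN   = 0;
    static final int TYPE_COMMANDER = 1;
    static final int TYPE_KNIGHT    = 2;
    static final int TYPE_MERCENARY = 3;
    static final int TYPE_KING      = 4;

    // Bits 0-5: ID; bits 6-8: type; bit 9: side.
    static char pack(int id, int type, boolean besieger) {
        return (char) ((id & 0x3F) | ((type & 0x7) << 6) | ((besieger ? 1 : 0) << 9));
    }

    static int id(char t)           { return t & 0x3F; }
    static int type(char t)         { return (t >> 6) & 0x7; }
    static boolean besieger(char t) { return ((t >> 9) & 1) == 1; }
}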

I was a little put out to realize that my efforts did not actually buy me very much—I was still at about 15,000 states per second.

My next task was to observe the behavior of my code, and find out where it was performing poorly. An unexpected but obvious optimization presented itself. When you run a Java program, it defaults to using a small amount of memory. You can provide command line switches to allow it to use a much larger amount of memory. Doing that, I found I was up to 50,000 states per second—that’s getting somewhere.
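
The switches in question are the JVM’s standard heap options. Something like this (the jar name is a stand-in, not necessarily the real artifact) tells Java it may grow its heap to four gigabytes:

java -Xms512m -Xmx4g -jar OpenTafl.jar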

In fact, that’s what I’d call having gotten somewhere. 50,000 states per second is sufficient to play a game of brandub to a search depth of 3, without undue waiting. (It takes about 40,000 states explored to get to depth 3, and about 1.2 million to depth 4.) I whipped up a quick evaluation function (which I’ll describe in a subsequent post), hooked it up, and quickly found that it played only slightly more poorly than I do5.

With that heartening development, I also stumbled upon a happy fact. Java is, at first, an interpreted language, but it compiles sections of programs to faster, native code when it discovers that it runs into them a lot. After the computer took its first turn, I saw the benchmark report that it had explored 200,000 states per second. 200,000 states per second I can live with—my laptop is several years old and a laptop, and in testing, my desktop was about five times faster in an earlier version of the code. I intend to test it again soon, and see what a powerful machine can do with it.

I made one final optimization after that. Since the AI code used a thick representation of game states, it was extremely memory-hungry—brandub to depth 3 took over four gigabytes of memory, and other games were impossible altogether. I solved that one with two changes: first, I now represent the explored nodes of a tree as a list of moves required to reach each node from the root. Second, I trigger garbage collection manually.

Now, this may seem like a bad idea, to those of you with some Java development experience, but in this case, it’s called for. The default Java garbage collection scheme is three-leveled: objects just created go in a group called the Eden space. The Eden space contains objects whose longevity is not yet known, most of which will die young and be quickly removed. If an object survives garbage collection in the Eden space, it moves into the Survivor space, where middle-longevity objects live. The Survivor space is touched less frequently by garbage collection. Finally, if an object lives in the Survivor space through a couple of garbage collections, it moves into the Tenured space, where it is troubled by garbage collections least frequently of all.

A lot of my objects end up in the Tenured generation—the objects which represent boards, particularly. They’re long-lived and durable, while almost every other object in the code comes and goes somewhat quickly. That means that Board objects, which are rather large, and which also contain references to other, shorter-lived objects, stay in memory more often than I’d like, and keep their sub-objects in memory, too. Triggering a garbage collection manually garbage collects everything, regardless of the space in which it lives, so the Board objects are cleaned up, freeing space. It does slow the code down somewhat: I get about 120,000 states per second on my laptop now, under ideal conditions.

Like I said, I can live with that. It’s not where I want it to be, but it’ll do for the time being. Coming up in the not-too-distant future: writing on the evaluation function, although I have some research to do before I’m done with that.

1. Go starts out branchier than 11×11 tafl variants, obviously, since you can play a stone at any point on the board. Thinking in terms of smart opening moves, I suspect the order still goes go, 11×11 tafl, chess. Large (that is, 19×19) tafl variants probably branch more than go at almost every stage of the game; the board is the same size.
2. That is, trimming unproductive branches from the game tree before expending the effort to explore them.
3. Here’s one for you. Computer memory is organized into bytes: eight slots to hold either 0 or 1 (called ‘bits’). Some data types are eight bytes long, or 64 bits. The chess board is 64 spaces, so someone had the idea to represent chess boards in a single eight-byte long integer. You need a few of these bitboards to represent a whole chess game state, obviously: minimally, two to represent which side controls each space, and six to represent which kind of piece occupies each space. Even so, it’s very compact, and bitwise operations on 64 bits at a time are blazingly quick on today’s 64-bit CPUs.
Unfortunately, this does not generalize well to tafl; most of the boards we concern ourselves with are larger than 64 spaces.
4. Namely, allowing black, using Berserker-style pieces, to jump into a Copenhagen edge fort to break it.
5. This says a lot more about me than about the AI.

On tafl: AI basics

My last tafl post was deeply theoretical. Very shortly, this one is going to take a turn for the deeply practical. First off, though, there’s some theory housekeeping I need to handle. Because the Copenhagen and Fetlar variants are both difficult AI problems, and because Copenhagen has some additional rules which are computationally hard to check, I decided to start with brandub, a much simpler tafl variant. Brandub comes from Ireland, and is played on a 7×7 board. The king’s side has five pieces, and the besieging side has eight, arranged in a cross. The king participates in captures, and is captured by two besieging pieces instead of four. He escapes to the corners. Any piece may cross the center square, but only the king stops on it.

Computationally, brandub is in a much easier class than the 11×11 variants I’ve discussed. The upper bound on its state space complexity is a mere 4.4 × 10^15, using the estimation technique I described in the previous post; for comparison, checkers1 is on the order of 10^20. Its game tree complexity is at least 1.3 × 10^36, which is also not overwhelmingly distant from checkers’ complexity (though harder—tafl’s state space complexity is frequently lower than comparable games, but its game tree complexity is frequently higher). This makes it a good place to start—for one, it means that it doesn’t matter how slow my code runs.

So, to kick things off, I started with what the business calls GOFAI—good old fashioned AI—techniques. The simple version is this: you play every possible move from your current position to a few moves out, and pick the best outcome. It ends up being a little more complex than that, though. If you’re not curious, you can skip the next few paragraphs.

The particular GOFAI technique I’m using right now is a straight implementation of the minimax algorithm. After generating a game tree from the current state to a given search depth, the algorithm evaluates the tree’s leaf nodes. An evaluation is a single number, describing whether the state is good for the besieging side (nearer positive infinity) or good for the defending side (nearer negative infinity). The details aren’t important just yet, although I’ll get into them later. Anyway, the algorithm is predicated on the assumption that both players are perfect—they’ll always make the best move they can make, given the situation. It follows, therefore, that the best move for a player at a certain depth is the one which leads to the best state for that player: the most negative for the defender, the most positive for the besieger.

As an example, with the besieger to move and a search depth of three, there are four levels to the tree2:

1. A position with the besieger to move.
2. All of the positions, defender to move, resulting from every one of the besieger’s possible moves.
3. All of the positions, besieger to move, resulting from the defender’s possible responses to those moves.
4. All of the positions, defender to move, resulting from the besieger’s responses to the defender’s responses.

The algorithm starts at the tree’s leaves, the nodes at the fourth level above, and runs the evaluation function. It then moves up a level. The besieger is to move, and all of his possible moves have a child state which has an evaluation attached to them. The besieger picks the move which yields the best outcome for him, which is also the worst outcome for the defender. (Although the tafl board is asymmetric, the game, in the theoretical sense, is not—a good state for the defending side is a bad state for the besieging side, and vice versa.) The value of each node at the third level becomes the value of its best child (for the besieger). The algorithm then moves up to the second level. The defender is to move, and he has a set of moves and child states with evaluations attached, as we just derived. He picks the best outcome for himself, which is also the worst outcome for the besieger. The value of each node at the second level becomes the value of its best child (for the defender). The algorithm then moves up a level, and, at the first level, the besieger has a list of moves and child states with evaluations attached to them. He chooses the best move out of his options, and that’s the move he makes.
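
For concreteness, here’s the procedure above as a minimal Java sketch; the State type and its methods are stand-ins for a real engine’s machinery:

import java.util.List;

public class MinimaxSketch {
    interface State {
        boolean isTerminal();
        int evaluate();            // positive: good for the besieger
        List<State> successors();
    }

    static int minimax(State s, int depth, boolean besiegerToMove) {
        if (depth == 0 || s.isTerminal()) return s.evaluate();  // a leaf: run the evaluation
        int best = besiegerToMove ? Integer.MIN_VALUE : Integer.MAX_VALUE;
        for (State child : s.successors()) {
            int value = minimax(child, depth - 1, !besiegerToMove);
            // Each node's value becomes the value of its best child.
            best = besiegerToMove ? Math.max(best, value) : Math.min(best, value);
        }
        return best;
    }
}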

As you might have guessed, if you had a computer of infinite power, you could just explore the game tree until you found a win, loss, or draw for every sequence of moves from your starting state. In fact, we call games for which this is possible3 solved. The evaluation function for an infinitely powerful computer is extremely simple:

1. Has the besieger won in this state? Return positive infinity.
2. Has the defender won in this state? Return negative infinity.
3. Has neither of the above happened? Return zero.
4. Ignore draws for the purposes of this example, or declare them a win for one side, or something.

The algorithm will explore until every branch reaches an end state, then work its way up through the tree. Since it assumes perfect play, it will provide a sequence of moves which yields a victory for one side from the current position, and there’s nothing the other side can do about it—the whole game becomes inevitable. Perhaps fortunately, this is extremely hard for most games; tic-tac-toe is somewhat less thrilling now that it’s solved.

It also reveals a weakness in, or at least a caveat with, the minimax algorithm: since we do not have computers of infinite power (in fact, my current implementation is able to look ahead only one ply further than the example I gave above), its playing ability depends either on achieving deep lookahead to compete with humans, who can look ahead much more selectively than computers can, or on having a good evaluation function. I haven’t really attained either goal yet.

I see that I’ve already written about a thousand words, and I haven’t even made it out of the first point on my supposed outline for this post. I suppose I’ll just have to write a couple of ’em. Next time I’ll get into the practical considerations in implementing an AI algorithm like this. Probably the time after that, I’ll write about evaluation functions, and perhaps after that I’ll write about some improved algorithms. The latter requires that I actually implement them, though, so we’ll see how the timing works out.

1. 8×8 draughts, if you aren’t an American.
2. The term of art is ‘plies’, singular ‘ply’—since a response to a move is called a reply, clearly a single move must just be a ply. (Math and computer people, am I right?)
3. Wikipedia has some interesting reading on what it actually means for a game to be solved—the definition I’ve given is what’s referred to as a ‘strong’ solution, where a computer has brute-forced the entire game tree. There are two other kinds of solved games: those solved ‘ultra-weakly’, which is to say, by abstract mathematical reasoning, and those solved ‘weakly’, by some algorithm which does not explore the entire tree.

Pacific War and the recipe for compelling wargames

Pacific War

This is Pacific War, a 1992 release by Gary Grigsby. I’ll come back to that.

I’m on something of a Pacific Theater kick, for an unusual reason: over at the Something Awful LP Archive, I’ve been reading an AAR by a guy called Grey Hunter. The game is War in the Pacific: Admiral’s Edition, the Grigsby/Matrix magnum opus, covering the whole war in excruciating detail. It’s the kind of thing I would love, if I wanted to drop the $80 on the price of entry, and the several hundred hours it would take me to actually get through the war. Fortunately, I can read the 1,320 entries in Grey Hunter’s AAR much more quickly than I can play the game myself, and so get a feel for the flow of things, rather than the day-to-day minutiae of supplying the twelve-man garrison on Rarotonga. I’ll come back to that.

One of my favorite wargames, as you’ll know if you follow Many Words Main in the slightest, is the Command Ops series. Its conceit is that, as the player, you have to deal with all the handicaps real field commanders had to deal with. Your orders have to percolate down to your subordinate commanders, and hours pass as they dawdle (from your perspective) to plan a simple attack on a defensive position. You tear out your hair, watching the map as they miss the enemy battalion just behind the hill and happily claim success after wiping out a poor company of engineers. You celebrate your own cunning when your men finish setting up just as an enemy attack lands. It is, as the title says, a compelling wargame. But what does that mean, really? ‘Compelling’ is one of those review words1 that doesn’t mean anything absent a better definition. Let me explain what I’m trying to convey.

Ian W. Toll, one of my favorite naval historians2, wrote a book about the first six months of the Pacific War. I’ve always regarded Toll’s biggest talent as immersion. He writes vividly, with a knack for putting the reader into the mind of the people who were there. Near the beginning, he quotes Chester Nimitz, on his experience in the first few months of the war: “From the time the Japanese dropped those bombs on December 7th until at least two months later, hardly a day passed that the situation did not get more chaotic and confused and appear more hopeless.” That gets at the crux of it, I think. I’m still in the first few turns (weeks) of my game of Pacific War, and I don’t think I’ve ever felt more despairing while playing a game. The deck is stacked so heavily against the Allies in the opening weeks: the Japanese are everywhere, and you categorically lack the planes, men, and ships to do anything about it. It has a very strong sense of place, and that is compelling. Command Ops has it, too: in the AARs in the archives at Many Words Main where I follow along in a history book, I find myself worrying about the same sorts of things as commanders on the field did. I find myself feeling exactly how CINCPAC Nimitz did.

I mentioned I’d be coming back to War in the Pacific. As I hinted, I don’t own the game, and I’ve never tried it. It’s possible I might find it compelling in the same way, but reading through Grey Hunter’s AAR, I think it’s not a settled thing by any means. In the course of spending forty-five minutes to an hour working through a single day, I think I might lose sight of the bigger picture, and games about the Pacific War are ultimately games about an entire theater. The sense of place flows out of watching the long arc of progress, and at a certain scale, that’s hard.

So, for me, compelling wargames capture a sense of place. That doesn’t preclude them from being grognardy, but it does preclude them from swamping the feel with detail. Pacific War is just about right, I think—accurate enough to yield plausible results, broad enough in scope to engender the emotional investment I crave.

Pacific War is freeware, courtesy of Matrix Games. You can find it at their website, or at my mirror.

  1. See this video, starting at about 3:00, or 4:30.
  2. Objection, your honor. Relevance3.
  3. Overruled. I want to see where this is going. Counsel, make your point quickly.