
USPSA Elo Ratings: how do they work?

In USPSA circles these days, I’m known in roughly equal measure for being ‘that East Coast revolver guy’ and ‘that Instagram Elo ratings guy’. This post is about the second topic. In particular, it’s a simple, mostly non-mathematical primer about Elo ratings in general, and Elo ratings as I have applied them to USPSA with the tool I call USPSA Analyst1.

This article is up to date as of September 2023. It may not be up to date if you’re reading it later on, and it may not be fully complete, either—this is not meant to be full documentation of Analyst’s guts (that’s the source code), but rather a high-level overview.

Elo basics

The Elo2 rating system was first developed to measure the skill of chess players. Chess is what we call a two-player, zero-sum game: in tournaments, each chess game is worth one point. If there’s a draw, the two players split it. Otherwise, the winner gets it. Elo operates by predicting how many points a player ought to win per game, and adjusting ratings based on how far the actual results deviate from its predictions.

Let’s consider two players, the first with a rating of 1000, and the second with a rating of 1200. The second player is favored by dint of the higher rating: Elo expects him to win about 0.75 points per game—say, by winning one and drawing the next, if there are two to be played. If player 2 wins a first game, his actual result (1 point) deviates from his expected result (0.75), so he takes a small number of rating points from player 1. Player 2’s rating rises, and player 1’s rating falls. Because the deviation is relatively small, the change in ratings is relatively small: we already expected player 2 to win, so the fact that he did is not strong evidence that his rating is too low. Both players’ ratings change by 10: player 2’s rating goes up to 1210, and player 1’s rating goes down to 990.

On the other hand, if player 1 wins, his actual result (1 point) deviates from his expected result (0.25) by quite a lot. Player 1 therefore gets to take a lot of points from player 2. We didn’t expect player 1 to win, so his win is evidence that the ratings need to move substantially to reflect reality. Both players’ ratings change by 30, in this case: player 2 drops to 1170, and player 1 rises to 1030.

The size of the rating change depends on how far the actual result deviates from the expected one, which in turn depends on the rating gap between the players. In Elo terms, we call the maximum possible rating change K, or the development coefficient3. In the examples above, we used K = 40, multiplied by the difference between expected result and actual result4: 40 \times 0.25 makes 10, for the rating change when player 2 wins, and 40 \times 0.75 makes 30, for when player 1 wins.

The expected score is, as the name suggests, an expected result, but in the chess case it also expresses a win probability (again, ignoring draws): if comparing two ratings yields an expected score of 0.75, it means Elo thinks the better player has a 75% chance of winning.
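
Concretely, standard Elo computes the expected score as E = \frac{1}{1 + 10^{(R_B - R_A)/400}}, then nudges each rating by K times the prediction error. Here’s a minimal Python sketch of the examples above (the text’s 0.75 is this curve’s 0.76, rounded):

def expected_score(rating_a, rating_b):
    # Standard Elo logistic curve: a 400-point gap means roughly a
    # 10-to-1 expected-score ratio in favor of the higher rating.
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

def update(rating, expected, actual, k=40):
    # Move the rating by K times the deviation of actual from expected.
    return rating + k * (actual - expected)

r1, r2 = 1000, 1200
e1, e2 = expected_score(r1, r2), expected_score(r2, r1)  # ~0.24, ~0.76

# Player 1 upsets player 2: both ratings move by about 30 points.
print(update(r1, e1, 1.0), update(r2, e2, 0.0))  # ~1030.4, ~1169.6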

There are a few things to notice here:

  • Standard Elo, like chess, is zero-sum: when one player gains rating points, another player must lose them.
  • Standard Elo is two-player: ratings change based on comparisons between exactly two players.
  • Elo adjusts by predicting the expected result a player will attain, and multiplying the difference between his actual result and expected result by a number K.
  • When comparing two ratings, Elo outputs an expected result in the form of a win probability.

Elo for USPSA

Practical shooting differs from chess in almost every way5. Beyond the facially obvious ones, there are two that relate to scoring, and that thereby bear on Elo. First, USPSA is not a two-player game. On a stage or at a match, you compete against everyone at once. Second, USPSA is not zero-sum. There is not a fixed number of match points available on a given stage: if you win a 100-point stage and another shooter comes in at 50%, there are 150 points on the stage between the two of you. If you win and the other shooter comes in at 90%, there are 190 points.

The first issue is simple to solve, conceptually: simply compare each shooter’s rating to every one of his competitors’ ratings to determine his expected score6. The non-zero-sum problem is thornier, but boils down to score distribution: how should a shooter’s actual score be calculated? The article I followed to develop the initial version of the multiplayer Elo rating engine in Analyst offers several suggestions; the method I settled on has a few components.
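
Before digging into those components, here’s the expected-score half of the generalization in sketch form, following footnote 6’s normalization. It’s illustrative, not Analyst’s actual code:

from itertools import combinations

def expected_score(a, b):
    # The same standard Elo curve from the sketch in the first section.
    return 1 / (1 + 10 ** ((b - a) / 400))

def multiplayer_expected(ratings):
    # Sum each shooter's pairwise expectations, then normalize so the
    # expected scores sum to 1 across all competitors (footnote 6).
    n = len(ratings)
    totals = [0.0] * n
    for i, j in combinations(range(n), 2):
        e = expected_score(ratings[i], ratings[j])
        totals[i] += e
        totals[j] += 1 - e
    pairs = n * (n - 1) / 2
    return [t / pairs for t in totals]

print(multiplayer_expected([1300, 1100, 1000, 900]))
# The highest-rated shooter carries the largest expected share.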

First is match blending. In stage-by-stage mode (Analyst’s default), the algorithm blends in a proportion of your match performance, to even out some stage-by-stage variation7. If you finished first at a match and third on a stage, and match blend is set to 0.3, your calculated place on that stage is 1 \times 0.3 + 3 \times 0.7 = 2.4, rewarding you on the stage for the match win.

Second is percentages. For a given rating event8, the Elo algorithm in Analyst calculates a portion of actual score based on your percentage finish. This is easy to justify: as far as rating changes go, coming in 2nd place at 99.5% is essentially a tie with the winner, and should count very differently from coming in 2nd place at 80%. The percentage component of an actual score is determined by dividing the percent finish on the stage by the sum of all percent finishes on the stage. For instance, in the case of a stage with three shooters finishing 100%, 95%, and 60%, the 95% finisher’s percentage contribution to actual score is 95 / (100 + 95 + 60) = 0.372.

The winning shooter gets some extra credit based on the gap to second: for his actual score, we treat his percentage as P_{1} \div P_{2}, where P_{1} and P_{2} are the percent finishes of the first- and second-place shooters. Treating percentages as fractions of 1, the example above becomes \frac{100/95}{(100/95) + 0.95 + 0.60}, for an actual score of about 0.404 against 0.392 done the other way around9.

Analyst also calculates a place score according to a method in the article I linked above: the number of shooters minus the actual place finish, scaled to be in the range 0 to 1. Place and percent are both important. The math sometimes expects people who didn’t win to finish above 100%, which isn’t possible in our scoring but is a difficult constraint to encode in Elo’s expected score function. (Remember, the expected score is a simple probability, or at least the descendant of a simple probability.) Granting points for place finish allows shooters who aren’t necessarily contesting stage and match wins to gain Elo even in the cases where percentage finish breaks down. On the other hand, percentage finish serves as a brake on shooters who win most events they enter10. If you always win, eventually you need to start winning by more and more (assuming your competition isn’t getting better too) to keep pushing your rating upward.

Percent and place are blended similarly to the match/stage blending above, each part scaled by a weight parameter.
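
Putting the actual-score pieces together in one sketch: the percentage part with the winner’s gap credit, the place part, and the percent/place blend. The weight and the exact place scaling are illustrative guesses, not Analyst’s real parameters; note that places can arrive fractional thanks to match blending:

def actual_scores(percents, places, place_weight=0.5):
    # percents: fractions of the stage winner's score (winner = 1.0)
    # places: finish order, possibly fractional after match blending
    n = len(percents)
    order = sorted(range(n), key=lambda i: -percents[i])
    winner, runner_up = order[0], order[1]

    adjusted = list(percents)
    adjusted[winner] = percents[winner] / percents[runner_up]  # gap credit
    pct_part = [a / sum(adjusted) for a in adjusted]

    raw = [n - p for p in places]   # more shooters beaten, more credit
    place_part = [r / sum(raw) for r in raw]

    return [place_weight * pl + (1 - place_weight) * pc
            for pl, pc in zip(place_part, pct_part)]

print(actual_scores([1.00, 0.95, 0.60], [1, 2, 3]))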

That’s the system in a nutshell (admittedly a large nutshell). There are some additional tweaks I’ve added onto it, but first…

The Elo assumptions I break

Although the system I’ve developed is Elo-based, it no longer follows the strict Elo pattern. A bevy of minor standard-Elo assumptions no longer hold, and neither do two really big ones.

Number one: winning is no longer a guaranteed Elo gain, and losing is no longer a guaranteed Elo loss. If you’re a nationally competitive GM and you only put 5% on the A-class heat at a small local, you’re losing Elo because the comparison between your ratings says you should win by more than that. On the flip side, to manufacture a less likely situation, if you’re 10th of 10 shooters but 92% of someone who usually smokes you, you’ll probably still gain Elo by beating your percentage prediction.

Number two: it’s no longer strictly zero-sum. This arises from a few factors in score calculation, but the end result is that there isn’t a fixed pool of Elo in the system based on the number of competitors. (This is true to a small degree with the system described above, but true to a much larger degree once we get to the next section.)

Other factors I consider

The system I describe above works pretty well. To get it to where it is now, I consider a few other factors, sketched in code after the list. Everything listed below operates by adjusting K, the development coefficient, for given stages (and sometimes even individual shooters), increasing or decreasing the rate of rating change when factors that don’t otherwise affect the rating system suggest it’s too slow or too fast.

  • Initial placement. For your first 10 stages, you get a much higher K than usual, decreasing to the normal value as stages accumulate. This is particularly useful when new shooters enter mature datasets, allowing them to quickly approach their true rating. Related to initial placement is initial rating: shooters enter the dataset at a rating based on their classification, between 800 for D and 1300 for GM.
  • Match strength. For matches with a lot of highly-classified shooters (to some degree As, mostly Ms and GMs), K goes up. For matches heavier on the other side of the scale, K goes down.
  • Match level. Level II matches have a higher K than Level I, and Level III matches have a higher K than Level II.
  • DQs and DNFs. When a shooter DNFs a stage (a DNF, in Analyst, is defined as zero time and zero hits), that performance is ignored for both changing his own rating, and contributing to rating changes for other shooters. If match blend is on in stage-by-stage mode, it is ignored for DQed shooters. In by-match mode, DQed shooters are ignored altogether.
  • Pubstomps. If the winning shooter at a match is an M or GM, two or more classes above the second place finisher, and the winner by 25% or more, his K is dramatically reduced, on the theory that wins against significantly weaker competition don’t provide as much rating-relevant information as tighter finishes. This mostly comes into play in lightly-populated divisions.
  • Zero scores. Elo depends on comparing the relative performances of shooters. One zero score is no different from any other, no matter the skill difference between the shooters, so the algorithm can’t make any assumptions about ratings based on zero scores. If more than 10% of shooters record a zero on a stage, K is reduced.
  • Network density. Elo works best when operating on densely-connected networks of competitors. The network density modifier (called ‘connectivity’ or ‘connectedness’ most places in the app) increases K when lots of shooters at a given match have recently shot against a lot of other shooters, and decreases K when they haven’t. In a sense, connectivity is another measure of rating reliability: someone with low connectivity might be an artifact of an isolated rating island, shooting against only a few people without exposure to shooters in the broader rating set.
  • Error. Rating error is a measure of how closely a shooter’s actual scores and expected scores have matched recently. If the algorithm’s predictions have been good, error gets smaller, and K also gets smaller. If the algorithm’s predictions have been bad, error gets bigger, and so does K, to help the rating move toward its correct value more quickly. This is an optional setting, but is currently enabled by default.
  • Direction. Direction is a measure of how positive or negative a shooter’s recent rating history is: positive direction means that recent events have generally led to increases in rating, with 100 meaning that every recent change is a positive change. Direction awareness increases K when a shooter has highly positive or negative direction and is moving in that direction, and decreases K when a shooter moves opposite a strong direction trend11. Direction awareness is an option, currently disabled by default.
  • Bomb protection. Bomb protection reduces the impact of single bad stages for highly-rated shooters. Because people with above-average ratings lose much more on the back of a single bad performance than they gain because of a single good performance, ratings can unfairly nosedive in the case of, say, a malfunction or a squib. Bomb protection attempts to detect these isolated occurrences, and reduces K significantly for them. Repeated bad performances lose protection, and allow the rating to begin moving freely again. Bomb protection is an option, currently disabled by default.
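
As promised, a sketch of how these adjustments might combine. The multiplicative combination and every constant below are invented for illustration; the real modifiers live in Analyst’s source:

def adjusted_k(base_k, stages_seen, match_strength=1.0, match_level=1.0,
               zero_fraction=0.0, connectivity=1.0, error=1.0,
               pubstomp=False):
    # All constants here are invented for the sake of the example.
    k = base_k
    if stages_seen < 10:
        k *= 2.5 - 0.15 * stages_seen  # initial placement: starts high, tapers
    k *= match_strength                # >1 for M/GM-heavy matches
    k *= match_level                   # >1 for Level II, higher for Level III
    k *= connectivity                  # network density raises or lowers K
    k *= error                         # bad predictions raise K, good ones lower it
    if zero_fraction > 0.10:           # too many zeroes: little rating signal
        k *= 0.5
    if pubstomp:                       # lopsided win over weaker competition
        k *= 0.25
    return k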

In 2024, bomb protection and direction awareness will be enabled for the main ratings I do. Notably, these options make the ratings slightly less accurate at predicting match results; the effect is small enough that it may not be statistically significant, and according to the experts who have seen the output, they substantially improve the leaderboards. At the same time, the predictions will continue to use the 2023 settings, since (again) those are slightly but consistently more accurate.

In any case, that’s the quick (well, quick-ish) tour of Elo ratings for USPSA. Drop me a comment here, or a DM on Instagram (@jay_of_mars) if you have questions or comments.


  1. The name may change at some point, if USPSA asks, to avoid stepping on their trademark. At the time of writing, I’m not affiliated with USPSA, and they have not publicly sanctioned my work (in either the positive or negative sense of ‘sanction’). 
  2. Not ELO—the system is named after its inventor, Hungarian-American physics professor Arpad Elo. It’s pronounced as a word (“ee-low”), not as an initialism (“ee-el-oh”).
  3. You’ll find ‘development coefficient’ in other sources, mostly. I always call it ‘K’. 
  4. I will probably refer to these as ‘expected score’ and ‘actual score’ elsewhere in the article. They’re the same thing. 
  5. “You don’t say!” 
  6. Expected scores and actual scores are scaled so that they sum to 1 across all competitors. This is mainly for convenience and ease of reasoning. 
  7. Some, but not all. Stage-by-stage mode works better in my experience: the more comparisons an individual shooter has, the better the output, even if winning stages isn’t quite what we’re measuring in USPSA. 
  8. A rating event is a stage, in stage-by-stage mode, or a match in match-by-match mode. 
  9. Writing this article, I realize I should probably be making the adjustment for every shooter, but that’ll have to wait until the 2024 preseason12
  10. I’m a good example of the latter: Analyst isn’t willing to bump my rating by very much at most revolver matches, because it’s expecting me to put large percentages on the field. 
  11. The reasoning here is that a shooter whose rating matches his current level of skill should have a direction near 0, which is to say a 50-50 mix of positive and negative changes. 
  12. There’s probably a more correct way to generate percentage-based scores in general, but I haven’t set upon it yet, even if I have a few ideas of where I’m not quite on track. 

OpenTafl 2020: New Tree Search Horizons

That’s right, I’m getting back to a project I’ve left alone for a while: OpenTafl, my hnefatafl engine project. The only one of its kind, it is both a host for other, yet-unwritten tafl bots, and a tafl bot in itself.

Why the return after all this time? Well, two reasons. First, I had to move the whole project to Github recently, because Bitbucket is shutting down its Mercurial repositories. Second, I had a bit of a brain blast on how to do a particular kind of AI project I’ve been wanting to try for a while.

OpenTafl’s current AI uses an alpha-beta pruning search, which is only a hop, skip, and jump away from the original, mathematician’s tree search algorithm, minimax. In simple terms, alpha-beta pruning minimax plays out all possible variations to a certain depth, assuming each player plays optimally, and picks the move which leads to the best outcome. The pruning bit skips searching nodes which are provably less optimal.

Of course, knowing what the optimal move is depends on one of two things: searching the game tree all the way to the end, or searching to a lesser depth and evaluating the leaf nodes in the tree. The former is impossible for reasons of computational power, the latter is logically impossible1. Evaluation functions, as we call them, are always imperfect, and require an awful lot of domain knowledge to do well.
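
For reference, depth-limited alpha-beta minimax looks like this in sketch form. This is the textbook algorithm, not OpenTafl’s actual Java; state.children(), state.is_terminal(), and evaluate() are assumed interfaces:

def alphabeta(state, depth, alpha, beta, maximizing):
    if depth == 0 or state.is_terminal():
        return evaluate(state)  # the imperfect part
    if maximizing:
        best = float('-inf')
        for child in state.children():
            best = max(best, alphabeta(child, depth - 1, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:   # provably no better than a line we
                break           # already have: prune the rest
        return best
    else:
        best = float('inf')
        for child in state.children():
            best = min(best, alphabeta(child, depth - 1, alpha, beta, True))
            beta = min(beta, best)
            if alpha >= beta:
                break
        return best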

Because tafl is a poorly-studied game, there isn’t a lot of domain knowledge to encode into an evaluation function, which has always limited OpenTafl’s potential somewhat. There are further downsides to the alpha-beta search it uses, too, in particular that it can’t readily be multi-threaded2. So, what’s the answer?

Well, at least potentially, Monte Carlo tree search. Popular among go AIs (and used as the framework for DeepMind’s efforts in computer players for board games), the secret to MCTS is a bit of randomness and a preference for exploring interesting lines of play. Start at the root of the game tree, navigate through the nodes you’ve already seen. When you find a leaf node (that is, one with no children), you generate its children, then play a random game until someone wins or loses. At each tree node, track the win/loss ratio for the tree, and use Mathematics™ to guide your root-to-leaf exploration in future iterations.
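
In outline, one iteration looks something like the sketch below. The node interface is assumed for illustration, and the Mathematics™ is typically the UCB1 formula, though nothing stops you from using something fancier:

import math, random

def mcts_iteration(root):
    # Selection: walk the known tree, preferring interesting children.
    node, path = root, [root]
    while node.children:
        parent = node
        node = max(parent.children,
                   key=lambda c: c.wins / (c.visits + 1e-9)
                   + math.sqrt(2 * math.log(parent.visits + 1)
                               / (c.visits + 1e-9)))
        path.append(node)
    # Expansion: generate the leaf's children, pick one at random.
    node.expand()
    if node.children:
        node = random.choice(node.children)
        path.append(node)
    # Playout and backpropagation (side-to-move sign handling omitted).
    result = node.random_playout()  # 1 for a win, 0 for a loss
    for n in path:
        n.visits += 1
        n.wins += result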

Simple! Of course, tafl poses some unique obstacles to MCTS, as Tuireann of PlayTaflOnline.com discovered. The biggest issue is that random moves in tafl are very, very unlikely to do anything of interest—tafl branching factors are higher than branching factors in, say, chess, and there’s no space pressure like there is in go. (That is to say, if you keep making random moves in go, the game eventually ends.)

Tafl MCTS artificial intelligences need some way to guide the playout process (the process of playing to the end of a game). The modern approach for this is to either train a neural network on a large corpus of existing high-level games (tafl doesn’t have one), or train a neural network by playing it against itself for a long time (which I don’t have the budget for). Given those constraints, I set about inventing a measure which would permit me to make random-ish moves which nevertheless move the game forward.

I’m calling the measure King Distance to Victory, or KDV; it’s inspired by part of the evaluation function from J.A.R.L., one of the entrants to the 2016 OpenTafl Tafl Open (which is on the calendar again for 2020, AI programmers!). For a given game state, the KDV is the shortest straight-line distance from the king’s current position to a victory space, counting spaces occupied by allies as 2 and spaces occupied by enemies as 3 or 4. The defender’s goal is to bring KDV to zero. The attacker’s goal is to maximize the average KDV and minimize the variation in the KDV values3.
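
Here’s a sketch of the edge-escape case, with an assumed board representation and an assumed cost of 1 for empty spaces; the corner-escape case chains two legs per path, as the footnote describes:

def kdv_edge_escape(board, king_pos):
    # One KDV per straight-line ray from the king to each board edge.
    cost = {'empty': 1, 'ally': 2, 'enemy': 3}  # the post allows 3 or 4 for enemies
    size = len(board)
    r, c = king_pos
    distances = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        total, rr, cc = 0, r + dr, c + dc
        while 0 <= rr < size and 0 <= cc < size:
            total += cost[board[rr][cc]]
            rr, cc = rr + dr, cc + dc
        distances.append(total)
    return distances  # defenders drive the minimum toward zero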

It isn’t a perfect measure, but it isn’t meant to be—rather, it’s a measure to push playouts toward states which end the game. On that ground, I think it’ll be successful. I also hope to write a quick classical evaluation function which uses the KDV measure exclusively, to see how it plays on its own, without the MCTS magic behind it.

More news to come as it is made.


  1. This proof is left as a trivial exercise for the reader. 
  2. To prove which nodes can be ignored, alpha-beta search has to evaluate them in order, for a given value of ‘order’. 
  3. This helps encapsulate the idea that the attackers should aim to surround the king and close the noose, rather than simply get in the way of his best move. Note as well the implication that each state has multiple KDVs: an edge-escape game has up to four (the king moves straight from his current position to each edge), and a corner-escape game has up to eight (the king moves straight from his position to an edge space, then from the edge space to one of two corners). 

Gravity, Graviton, Pendulum: a wireless hydrometer for homebrewing

I’ve been working hard on this project over the past week or two, and I put another week or two into it at the end of last year. Finally, though, it’s just about ready to show.

Gravity, Graviton, and Pendulum are the three components of an end-to-end wireless hydrometer system for homebrewers.

Graviton is a Golang server which manages batches and hydrometers. Gravity is a vue.js front end for Graviton. Pendulum is an ESP8266-based floating tilt hydrometer, with built-in calibration and automatic temperature compensation. Put them all together, and you get something a little like this:

[dataflow diagram]

For each batch you have in progress, you get a dashboard with a chart showing measured gravity and temperature over time, along with apparent attenuation and current calculated alcohol by volume: everything you need to know about a batch of beer in progress.
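
Graviton’s own math isn’t shown here, but apparent attenuation and ABV follow from the gravity readings by the standard homebrew approximations:

def apparent_attenuation(og, sg):
    # Fraction of the original extract that has apparently fermented out.
    return (og - sg) / (og - 1.0)

def abv(og, sg):
    # Common homebrew approximation, in percent alcohol by volume.
    return (og - sg) * 131.25

print(apparent_attenuation(1.050, 1.010))  # 0.80, i.e. 80%
print(abv(1.050, 1.010))                   # 5.25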

Backstory

Obviously, I homebrew, or else I wouldn’t have started on this project at all. I homebrew with a friend, however, which means that wherever we brew, at least one of us is going to be remote. The ability to check on a beer remotely is therefore valuable to us.

There are other existing systems: the open-source iSpindle, which inspired this project; Tilt, a commercial floating hydrometer of the same sort as Pendulum and iSpindle; and BrewBuddy, a commercial product which solves the long-term power issue by replacing your carboy bung and dangling a sensor-only torpedo into the wort.

The commercial products are out because we homebrew in part because it’s cheaper than buying good beer, and a do-it-yourself solution is way cheaper (if you, like me, value your labor at near-zero). Why not an iSpindle, then? Because we brew in glass carboys, whose necks are a mere 29.5mm across at their narrowest points, and iSpindle uses an enclosing cylinder which won’t fit.

So, because no product out there fits our needs, and because Go, vue.js, and some very light electrical engineering are all useful skills, I decided to roll my own.

How it Works

Like all tilt hydrometers, Pendulum uses the interrelation between density and buoyancy to figure out the density of the medium it’s floating in. A cylinder with a weight at the bottom naturally floats at an angle. If the liquid is denser, the angle between the hydrometer and the vertical increases. If it’s less dense, it decreases.

Pendulum is calibrated by preparing a series of sugar-water solutions of known density, recording its measured tilts in those solutions, recording specific gravity readings from a calibrated hydrometer, and providing tilt-gravity pairs to Pendulum’s configuration interface. It does the required calculations internally.
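
The internal calculation amounts to fitting a curve to the calibration points. A least-squares polynomial, sketched here with invented numbers, is the sort of thing that works; I make no promises that it matches Pendulum’s code exactly:

import numpy as np

# Invented example data: tilt angles in degrees, and specific gravities
# read off a calibrated hydrometer in the same solutions.
tilts = [25.4, 33.0, 41.9, 49.2, 56.1]
gravities = [1.000, 1.020, 1.040, 1.060, 1.080]

coeffs = np.polyfit(tilts, gravities, deg=2)  # least-squares quadratic

def tilt_to_gravity(tilt):
    return float(np.polyval(coeffs, tilt))

print(round(tilt_to_gravity(45.0), 3))  # ~1.048 with this data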

[calibration interface screenshot]

Technical Details

Pendulum uses an ESP8266 microcontroller, a GY521 MPU6050 accelerometer/gyroscope breakout board, and a lithium-ion 18650 battery. Which precise ESP8266 board I use depends on how well it fits in roughly 27mm tubes; I have several coming which will help me answer that question. The board I’m using for development has some nice features, like built-in USB battery charging and discharge protection, and if possible I’d like to stick with it.

As far as cylinders go, I have two options: a 27mm outside diameter jobber with a narrower screw cap, which will have to be hacksawed off, and a 27mm inside diameter tube used to hold collectible coins. The former may be too small on the inside, unless I detach the battery caddy from the development board, and the latter may be too big on the outside to fit into our carboy.

To-Do

In addition to the hardware task above, I have some work to do on the software side, too; some fixes to hopefully make the ESP8266 wifi connection slightly more reliable, and some changes and improvements to the web app and back end to allow for management of users and permissions.

Most of the hard work is already done. It’ll take about a month for the various enclosures to arrive from China, and a week or two to work out the remaining hardware issues and perhaps add a transistor to the voltage measurement circuit, so it can be fully turned off. By mid- to late summer, I should have something release-ready, with enough documentation and photography that anyone handy with a soldering iron should be able to assemble their own Pendulum. Until then!

OpenTafl v0.4.6.2b released

OpenTafl has seen some major development work for the first time in a while, culminating in the recent release of v0.4.6.2b.

This version adds some major features which have been on my to-do list for some time. First, the in-game and replay board views now support keyboard input, a major usability enhancement. No longer will mistaken command entries torpedo your games!

That feature, however, was merely a happy piece of good fortune, falling out of the true headline feature for the v0.4.6.x releases: a variant editor. That’s right. OpenTafl is now the best tool available for designing tafl rules, bar none. Not only can you individually tweak every rule described in the OpenTafl notation specification, you can also edit the board layout to look any way you like.

If you’re interested in playing a few games with the new UI or experimenting with rules variants all your own, you can, as always, get the latest version from the OpenTafl website. I haven’t promoted v0.4.6.x to the stable release yet, but I expect to do so soon.

With these features done, I turn my attention next to a few network-centric things for v0.5.x. OpenTafl’s network play server has not, to date, seen much use; now that PlayTaflOnline.com is getting close to its new architecture, I hope to write a PlayTaflOnline front end for OpenTafl, so you can use OpenTafl to play games at PlayTaflOnline, with all the rich support for replays, commentary, and analysis from OpenTafl. OpenTafl’s network server mode and self-contained network play will continue to be a supported mechanism for remote games, but won’t see new features. v0.5.x will also contain an automatic updater, to simplify the end-user updating process.

Looking further into the future, I’m running out of OpenTafl features I want to do. With luck, 2017 will see a v1.0 release.

How-To: Two USB Mics, One Computer, JACK, and Audacity

This post is largely superseded by a new post on the same topic. The new post includes more step-by-step setup instructions for an audio-focused Linux distribution, tips for how to use digital audio workstation tools to manage multiple microphones, and hardware recommendations for podcasting use.

The Crossbox Podcast is going upmarket: I now have two USB microphones, and for the March episode, parvusimperator and I will each have one directly in front of us. This is a wonderful advance for audio quality, but it does pose some problems:

  1. Audacity, our usual recording tool of choice (and probably yours, if you ended up here), only supports recording from one source at once.
  2. Though other tools support recording from multiple sources, the minor variations in internal clocks between two USB microphones mean that each microphone has a sample rate which varies in a slightly different fashion, and that longer recordings will therefore be out of sync.

Modern Linux, fortunately, can help us out here. We have need of several components. First, obviously, we need two microphones. I have a Blue Snowball and a CAD Audio U37, with which I’ve tested this procedure1. Second, we need a computer with at least two USB ports. Third, we need the snd-aloop kernel module. (If your Linux has ALSA, you probably already have this.) Fourth, we need JACK, the Linux low-latency audio server. Fifth, we need the QJackCtl program.

Before I describe what we’re going to do, I ought to provide a quick refresher in Linux audio infrastructure. If you use Ubuntu or Mint, or most other common distributions, there are two layers to your system’s audio. Closest to the hardware is ALSA, the kernel-level Advanced Linux Sound Architecture. It handles interacting with your sound card, and provides an API to user-level applications. The most common user-level application is the PulseAudio server, which provides many of the capabilities you think of as part of your sound system, such as volume per application and the ‘sound’ control panel in your Linux flavor of choice. (Unless you don’t use Pulse.)

JACK is a low-latency audio server; that is, a user-level application in the same vein as Pulse. It has fewer easily accessible features, but allows us to do some fancy footwork in how we connect inputs to outputs.

Now that you have the background, here’s what we’re going to do to connect two mono USB microphones to one computer, then send them to one two-channel ALSA device, then record in Audacity. These instructions should work for any modern Linux flavor. Depending on the particulars of your system, you may even be able to set up real-time monitoring.

  1. Create an ALSA loopback device using the snd-aloop kernel module.
  2. Install JACK.
  3. Build QJackCtl, a little application used to control JACK. (This step is optional, but makes things much easier; I won’t be providing the how-to for using the command line.)
  4. Use JACK’s alsa_in and alsa_out clients to give JACK access to the microphones and the loopback device.
  5. Use QJackCtl to connect the devices so that we can record both microphones at once.

We’ll also look at some extended and improved uses, including some potential fixes for real-time monitoring.

Create an ALSA loopback device
The ALSA loopback device is a feature of the kernel module snd-aloop. All you need to do is # modprobe snd-aloop and you’re good to go. Verify that the loopback device is present by checking for it in the output of aplay -l.

The loopback device is very straightforward: any input to a certain loopback device will be available as output on a different loopback device. ALSA devices are named by a type string (such as ‘hw’), followed by a colon, then a name or number identifying the audio card, a comma, and the device number inside the card. Optionally, there may be another comma and a subdevice number. Let’s take a look at some examples.

  • hw:1,0: a hardware device, card ID 1, device ID 0.
  • hw:Loopback,1,3: a hardware device, card name Loopback, device ID 1, sub-device ID 3.

For the loopback device, anything input to device ID 1 and a given sub-device ID n (that is, hw:Loopback,1,n) will be available as output on hw:Loopback,0,n, and vice versa. This will be important later.

Install JACK
You should be able to find JACK in your package manager2, along with Jack Rack. In Ubuntu and derivatives, the package names are ‘jackd’ and ‘jack-rack’.

Build QJackCtl
QJackCtl is a Qt5 application. To build it, you’ll need qt5 and some assorted libraries and header packages. I run Linux Mint; this is the set I had to install.

  • qt5-qmake
  • qt5-default
  • qtbase5-dev
  • libjack-jack2-dev
  • libqt5x11extras5-dev
  • qttools5-dev-tools

Once you’ve installed those, unpack the QJackCtl archive in its own directory, and run ./configure and make in that directory. The output to configure will tell you if you can’t continue, and should offer some guidance on what you’re missing. Once you’ve successfully built the application, run make install as root.

Run QJackCtl
Run qjackctl from a terminal. We should take note of one feature in particular in the status window. With JACK stopped, you’ll notice a green zero, followed by another zero in parentheses, beneath the ‘Stopped’ label. This is the XRUN counter, which counts up whenever JACK doesn’t have time to finish a task inside its latency settings.

Speaking of, open the settings window. Front and center, you’ll see three settings: sample rate, frames per period, and periods per buffer. Taken together, these settings control latency. You’ll probably want to set the sample rate to 48000, 48 kHz; that’s the standard for USB microphones, and saves some CPU time. For the moment, set frames per period to 4096 and periods per buffer to 2. These are safe settings, in my experience. We’ll start there and (maybe) reduce latency later.

Close the settings window and press the ‘Start’ button in QJackCtl. After a moment or two, JACK will start. Verify that it’s running without generating any XRUN notifications. If it is generating XRUNs, skip down to here and try some of the latency-reduction tips, then come back when you’re done.

Use JACK’s alsa_in and alsa_out clients to let JACK access devices
Now we begin to put everything together. As you’ll recall, our goal is to take our two (mono) microphones and link them together into one ALSA device. We’ll first use the alsa_in client to create JACK devices for our two microphones. The alsa_in client solves problem #2 for us: its whole raison d’être is to allow us to use several ALSA devices at once which may differ in sample rate or clock drift.

Now, it’s time to plug in your microphones. Do so, and run arecord -l. You’ll see output something like this.

$ arecord -l
**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC295 Analog [ALC295 Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: Audio [CAD Audio], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 2: Snowball [Blue Snowball], device 0: USB Audio [USB Audio]
  Subdevices: 0/1
  Subdevice #0: subdevice #0

This lists all the currently available capture hardware devices plugged into your system. Besides the first entry, the integrated microphone on my laptop, I have hw:1 or hw:Audio, the CAD Audio U37, and hw:2 or hw:Snowball, the Blue Snowball.

Next, set up alsa_in clients so JACK can access the microphones.

$ alsa_in -j snow -d hw:Snowball -c 1 -p 4096 -n 2 &
$ alsa_in -j cad -d hw:Audio -c 1 -p 4096 -n 2 &

Let’s go through the options. -j defines the label JACK will use for the microphone; make it something descriptive. -d declares which ALSA device JACK will open. -c declares the number of channels JACK will attempt to open.

On to the last two options: like the JACK settings above, -p defines the number of frames per period, and -n defines the number of periods per buffer. The documentation for alsa_in suggests that the total frames per buffer (frames per period multiplied by periods per buffer) should be greater than or equal to JACK’s total frames per buffer.

Next, set up an alsa_out client for the ALSA loopback device.

$ alsa_out -j loop -d hw:Loopback,1,0 -p 4096 -n 2 &

The arguments here are the same as the arguments above.

Use QJackCtl to hook everything up
Now, we’re almost done. Go back to QJackCtl and open the Connect window. You should see a list of inputs on the left and a list of outputs on the right. Your inputs should include your two microphones, with the names you provided in your -j arguments. Your outputs should include system, which is linked to your system’s audio output, and an output named ‘loop’, the ALSA loopback device.

Assuming you have mono microphones, what you want to do is this: expand a microphone and highlight its input channel. Then, highlight the system output and hit ‘connect’ at the bottom of the window. This will connect the input channel to the left and right channels of your system audio output. At this point, you should be able to hear the microphone input through your system audio output. (I would recommend headphones.) The latency will be very high, but we’ll see about correcting that later.

If the audio output contains unexpected buzzing or clicking, your computer can’t keep up with the latency settings you have selected3. Skip ahead to the latency reduction settings. That said, your system should be able to keep up with the 4096/2 settings; they’re something of a worst-case scenario.

If the audio output is good, disconnect the microphones from the system output. Then, connect one microphone’s input to loop’s left channel, and one microphone input to loop’s right channel. Open Audacity, set the recording input to Loopback,0, and start recording4. You should see audio from your microphones coming in on the left and right channel. Once you’re finished recording, you can split the stereo track into two mono tracks for individual editing, and there you have it: two USB microphones plugged directly into your computer, recording as one.

Recording more than two channels
Using Jack Rack, you can record as many channels as your hardware allows. Open Jack Rack, and using the ‘Channels’ menu item under the ‘Rack’ menu, set the number of channels you would like to record. In QJackCtl’s connections window, there will be a jackrack device with the appropriate number of I/O channels.

In Audacity, you can change the recording mode from ALSA to JACK, then select the jackrack device, setting the channel count to the correct number. When you record, you will record that many channels.

Jack Rack is, as the name suggests, an effects rack. You can download LADSPA plugins to apply various effects to your inputs and outputs. An amplifier, for instance, would give you volume control per input, which is useful in multi-microphone situations.

Reduce frames per period
If you’re satisfied with recording-only, or if you have some other means of monitoring, you can stop reading here. If, like me, you want to monitor through your new Linux digital audio workstation, read on.

The first step is to start reducing the frames per period setting in JACK, and correspondingly in the alsa_in and alsa_out devices. If you can get down to 512 frames/2 periods without JACK xruns, you can probably call it a day. Note that Linux is a little casual with IRQ assignments and other latency-impacting decisions; what works one day may not work the next.

You can also try using lower frames per period settings, and higher periods per buffer settings, like 256/3 or 512/3. This may work for you, but didn’t work for me.

If you come to an acceptable monitoring latency, congratulations! You’re finished. If not, read on.


Fixing latency problems
Below, I provide three potential latency-reducing tactics, in increasing order of difficulty. At the bottom of the article, just above the footnotes, is an all-in-one solution which sacrifices a bit of convenience for a great deal of ease of use. My recommendation, if you’ve made it this far, is that you skip the three potential tactics and go to the one which definitely will work.

Further latency reduction: run JACK in realtime mode
If JACK is installed, run sudo dpkg-reconfigure -p high jackd (or dpkg-reconfigure as root).

Verify that this created or updated the /etc/security/limits.d/audio.conf file. It should have lines granting the audio group (@audio) permission to run programs at real-time priorities up to 95, and lock an unlimited amount of memory. Reboot, set JACK to use realtime mode in QJackCtl’s setup panel, and start JACK. Try reducing your latency settings again, and see what happens.

Further latency reduction: enable threaded IRQs
Threaded IRQs are a Linux kernel feature which help deliver interrupt requests5 more quickly. This may help reduce latency. Open /etc/default/grub. Inside the quotation marks at the end of the line which starts with GRUB_CMDLINE_LINUX_DEFAULT, add threadirqs, and reboot.

Further latency reduction: run a low-latency or real-time kernel
If none of these help, you might try compiling or installing a low-latency or real-time kernel; the Ubuntu Studio project provides them, and there are packages available for Debian. If you’ve come this far, though, I’d recommend the…

All-of-the-above solution: run an audio-focused Linux distribution
AV Linux is a Linux distribution focused on audio and video production. As such, it already employs the three tactics given above. It also includes a large amount of preinstalled, free, open-source AV software. It isn’t a daily driver distribution; rather, its foremost purpose is to be an audio or video workstation. It worked perfectly for me out of the box, and met my real-time monitoring and audio playback requirements for The Crossbox Podcast6. I recommend it wholeheartedly.

Given that my laptop is not primarily a podcast production device, I decided to carve a little 32 GB partition out of the space at the end of my Windows partition, and installed AV Linux there. It records to my main Linux data partition instead of to the partition to which it is installed, and seems perfectly happy with this arrangement.

So am I. Anyway, thanks for reading. I hope it helped!


  1. Two identical microphones actually makes it slightly (though not insurmountably) harder, since they end up with the same ALSA name. 
  2. If you don’t have a package manager, you ought to be smart enough to know where to look. 
  3. This is most likely not because of your CPU, but rather because your Linux kernel does not have sufficient low-latency features to manage moving audio at the speeds we need it to move. 
  4. Remember, since we’re outputting to Loopback,1, that audio will be available for recording on Loopback,0. 
  5. Interrupt requests, or IRQs, are mechanisms by which hardware can interrupt a running program to run a special program known as an interrupt handler. Hardware sends interrupt requests to indicate that something has happened. Running them on independent threads improves the throughput, since more than one can happen at once, and, since they can be run on CPU cores not currently occupied, they interrupt other programs (like JACK) less frequently. 
  6. Expect us to hit our news countdown time cues a little more exactly, going forward. 

OpenTafl AI roundup: bugs and features

This post will cover OpenTafl AI changes since the last post I wrote on the topic, further back in the v0.4.x releases. First, bugs!

Let’s do some quick recap. OpenTafl’s AI is a standard, deterministic1 tree-searching AI, using iterative deepening. That means that OpenTafl searches the whole tree2 to depth 1, then to depth 2, then depth 3, and so on, until it no longer has enough time to finish a full search3.
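
In outline, with search() as a stand-in for the real thing:

import time

def iterative_deepening(root, budget_seconds):
    # search() either completes a full search to 'depth' and returns a
    # move, or returns None when it hits the deadline partway through.
    deadline = time.monotonic() + budget_seconds
    best_move, depth = None, 1
    while True:
        move = search(root, depth, deadline)
        if move is None:      # unfinished tree: discard it
            return best_move  # keep the deepest finished search's move
        best_move = move
        depth += 1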

You may have noticed, if you’ve used OpenTafl, that searching to depth n+1 takes considerably longer than searching to depth n, and that it’s altogether possible that the process above might leave OpenTafl with a lot of time left over. I did not fail to notice this, and I implemented a couple of additional searches (extension searches, as they’re known) to help fill in this time.

The first, I refer to as continuation search. Continuation search takes the existing tree and starts a new search at the root node, using the work already done and searching to depth n+1. Obviously, continuation search doesn’t expect to finish that search, but it will reach some new nodes and provide us some new information. After continuation search, OpenTafl does what I call a horizon search: it finds the leaf nodes corresponding to the current best-known children of the root node, then runs normal searches starting with the leaf nodes, to verify that there aren’t terrible consequences to a certain move lurking just behind the search horizon.

These are fairly easy concepts to understand, my poor explanations notwithstanding. The bugs I referred to in the title are more insidious. They come down to a much more complicated concept: what conditions must the children of a node meet for that node’s evaluation to be valid?

In the iterative deepening phase of the search, the answer doesn’t matter. Remember, OpenTafl discards any tree it doesn’t finish. When we’re doing extension searches, though, we don’t have that luxury. OpenTafl must be able to discern when a certain node has not been fully searched. I added a flag value to the search to note that a certain node has been left intentionally unvalued, which gets set whenever we have to abandon a search because we’ve run out of time. If a node did not exist in the tree prior to the extension search, and it has unvalued children, then it is also unvalued. If a node did exist in the tree prior to its extension search and it has unvalued children, this is okay! We ignore the unvalued children and use the information we’ve gained4. If an unvalued node is left in the tree after those steps, we ignore its value. Any unvalued node is misleading, and we should avoid using its value when deciding how to play.
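
Paraphrasing those rules as code, with invented names (UNVALUED, combine(), the value field) standing in for OpenTafl’s real machinery:

UNVALUED = object()  # flag set when a search runs out of time mid-node

def effective_value(node, existed_before_extension):
    unvalued = [c for c in node.children if c.value is UNVALUED]
    if not unvalued:
        return node.value
    if existed_before_extension:
        # Pre-existing node: ignore the unvalued children and keep the
        # information gained from the fully-searched ones. combine() is
        # a stand-in for the usual min/max over child values.
        return combine(c.value for c in node.children
                       if c.value is not UNVALUED)
    # A new node with unvalued children is itself unvalued, and move
    # selection must ignore its value.
    return UNVALUED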

This issue led to poor play, as both horizon and continuation search had a chance to introduce bad data into the tree. I finally tracked it down and fixed it in v0.4.4.6b.

After that, I came across another few bugs, lesser in severity but still quite bad for OpenTafl’s play: when evaluating a draw position for the attackers, OpenTafl would incorrectly view it as more defender-favorable than it should have been5. OpenTafl also had some trouble with repetitions, incorrectly failing to increment the repetitions table in some search situations. That’s one of the more important gains over v0.4.4.7b—v0.4.5.0b is absolutely incisive in playing out repetitions, as some of the players at playtaflonline.com discovered after the update.

Finally, a few minor time usage bugs are no longer present, although there are some headscratchers where the AI seems to lose half a second or so to some task I cannot locate, and which it does not count when doing its time use accounting.

That about wraps up bugs. Features, as usual, are more fun.

First, OpenTafl now is willing to play for a draw in rare circumstances. If its evaluation tilts overwhelmingly toward the other side, and it sees a draw in its search tree, it evaluates the draw poorly, but better than a loss.

That depends on the second feature, which is an improved evaluation function. Rather than guess, I decided to be scientific about it: I built four OpenTafl variants, each with a certain evaluation coefficient raised above the rest. Those variants played each other in a battle royale, and based on the outcome, I picked new coefficients. The coefficients differ by board size: 7×7 boards consider material more heavily, while larger boards prefer to play positionally6.

Positional play comes from the last and most important feature: piece square tables. Credit for the idea goes to Andreas Persson (on Twitter @apgamesdev), who linked me to the chessprogramming wiki article, and also provided a first pass at the tables.

I should back up a bit first, though. Piece square tables are descriptively-named tables which assign a value to having a certain piece type on a certain space. For instance, the space diagonally adjacent to a corner space in a corner-escape game is very important for the besiegers. That space gets a high positive coefficient. On the other hand, the spaces directly adjacent to the corners are quite bad for the attackers, and get a moderately large negative coefficient. OpenTafl’s evaluator handles the exact values.

The benefits of this approach are manifold: not only does OpenTafl know when the opponent is building a good shape, it now has a sense for position in a global sense. (It previously had some sense of position relative to other pieces, but that was not sufficient.) Because of this, it is much better now at picking moves which serve more than one purpose. If it can strengthen its shape by making a capture, it’ll do so. If it can weaken its opponent’s shape, so much the better. The code which generates the piece square tables can be found here7.
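
To make the idea concrete, here’s a toy fragment of an attackers’ table for an 11×11 corner-escape board. The signs follow the description above; the magnitudes are invented:

N = 11
attacker_pst = [[0.0] * N for _ in range(N)]
for cr, cc in ((0, 0), (0, N - 1), (N - 1, 0), (N - 1, N - 1)):
    dr = 1 if cr == 0 else -1
    dc = 1 if cc == 0 else -1
    attacker_pst[cr + dr][cc + dc] = 0.15  # diagonal off the corner: good shape
    attacker_pst[cr][cc + dc] = -0.10      # directly adjacent to the corner:
    attacker_pst[cr + dr][cc] = -0.10      # bad for the attackers

def pst_score(piece_positions, table):
    # The evaluator sums a table lookup for every piece of the type.
    return sum(table[r][c] for r, c in piece_positions)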

The outcome is most pleasing. I can no longer easily beat OpenTafl on 11×11 corner escape boards, and games in that family are presently the most popular in online play. Equal-time matches are all but a lost cause, and I have to engage my brain in a way I never did before if I allow myself more thinking time. Now, I am not all that good a player, and those who are better than me still handle OpenTafl pretty roughly, but it now plays at a low-intermediate level. Given that it barely even existed twelve months ago, I’d say that’s good progress.


  1. Mostly. 
  2. Kind of. 
  3. More or less. For being mostly deterministic, AI stuff is remarkably fuzzy. On an unrelated note, look! We have new footnotes, with links and everything! 
  4. You can construct a game tree where this assumption—that not searching all of the outcomes when exploring an already-valued node is acceptable—causes trouble, but the fault for that lies more with the evaluation function than with the search. In such cases, the evaluation function must be so incorrect as to evaluate a node which leads to a loss just over the horizon as better than a node which does not lead to an imminent loss to a slightly deeper depth. They are quite rare, though, and I haven’t come across one yet in testing. 
  5. It was supposed to be a plus sign, but it was a minus sign instead. Oops. 
  6. On the to-do list is changing coefficients over the course of a game—brandub is more or less an endgame study, and at some point, the evaluator should prefer material over position in endgames even on larger boards. 
  7. I chose to generate the tables because it’s easier to maintain and update. Work for corner-escape 11×11 boards generalizes to most corner escape variants; the same is true for edges. The only boards which really, truly take special cases are 7×7, since the corners are such a vast majority of the board, and moves which might be considered neutral or iffy on larger boards ought to be given a chance—there aren’t many options in the first place. 

2016 Tafl Efforts: Results and Roundup

First off: the inaugural OpenTafl Computer Tafl Open has come to a close. It was a bit of an anticlimax, I must admit, but fun times nevertheless.

To recap, only one entry (J.A.R.L) made it in on time. On January 2nd, I had the AIs run their matches, and it was all over inside of 20 minutes, with a bit of technical difficulty time to boot. You can find the game records here.

To move one layer deeper into the recap, both AIs won one game each out of the match. J.A.R.L won in 22 moves, OpenTafl won in 18, giving the victory to OpenTafl. Disappointingly for me, OpenTafl played quite poorly in its stint as the attackers, allowing J.A.R.L to quickly set up a strong structure funneling its king to the bottom right of the board. Disappointingly for Jono, J.A.R.L seemed to go off the rails when it played the attacking side, leaving open ranks and files and leaving a certain victory for OpenTafl. Deeper analysis is coming, although, not being a great player myself, I can’t offer too much insight. (Especially given that neither AI played especially well.)

I do expect that, when Jono finishes fixing J.A.R.L, it’ll be stronger than OpenTafl is today. He intends on making its source code available in the coming year, as a starting point for further AI development. (If feasible, I hope to steal his distance-to-the-corner evaluation.)

There will be a 2017 OpenTafl Computer Tafl Open, with the same rules and schedule. I’ll be creating a page for it soon.

Next: progress on OpenTafl itself. It’s difficult to overstate how much has happened in the past year. Last January, OpenTafl was a very simple command-line program with none of the persistent-screen features it has today; it had no support for external AIs, no multiplayer, no notation or saved games, and a comparatively rudimentary built-in AI.

The first major change of the year was switching to Lanterna, and that enabled many of the following ones. Lanterna, the terminal graphics framework OpenTafl uses to render to the screen, allows for tons of fancy features the original not-really-a-solution did not. Menus, for one. For another, a UI which makes sense for the complicated application OpenTafl was destined to become. Although it’s the easiest thing to overlook in this list of features, it’s the most foundational. Very few of the remaining items could have happened without it.

Next up: external AI support. In the early days, I only planned for OpenTafl to be a fun little toy. At the end of that plan, it might have been something I could use to play my weekly (… well, kind of) tafl game without having to deal with a web interface. (For what it’s worth, Tuireann’s playtaflonline.com renders that goal obsolete, unless you really like OpenTafl.)

Later on, as I got into work on OpenTafl’s built-in AI, I realized what an amazing object of mathematical interest it is, and that it has not, to date, seen anything like the kind of study it richly deserves. As such, I decided I wanted OpenTafl to be a host for that sort of study. Much of what we know about chess, go, and other historical abstract strategy games comes from the enormous corpus of games played. That corpus does not yet exist for tafl games, the amazing efforts of people like Aage Nielsen and Tuireann notwithstanding. The quickest way to develop a good corpus is to play lots of games between good AIs. Good AIs are hard to come by if every AI author also needs to build a UI and a host.

So, OpenTafl fills the void: by implementing OpenTafl’s straightforward engine protocol, AI authors suddenly gain access to a broad spectrum of opponents. To start with, they can play their AI against all other AIs implementing the protocol, any interested human with a copy of OpenTafl, and possibly even the tafl mavens at playtaflonline.com. Not only that, but the AI selfplay mode allows AI authors to verify progress, a critical part of the development process.

Multiplayer was an obvious extension, although it hasn’t seen a great deal of use. (There are, admittedly, better systems out there.) It proved to be relatively straightforward, and although there are some features I’d like to work out eventually (such as tournaments, a more permanent database, and a system for client-side latency tracking to allow for client-side correction of the received server clock stats), I’m happy with it as it stands.

OpenTafl is also the first tafl tool to define a full specification for tafl notation, and the first to fully implement its specification. The Java files which parse OpenTafl notation to OpenTafl objects, and which turn OpenTafl objects into OpenTafl notation, are in the public domain, free for anyone to modify for their own AI projects, another major benefit.

In defining OpenTafl notation, I wanted to do two things: first, to craft a notation which is easily human-readable, in the tradition of chess notation; and second, to remain interoperable with previous tafl notation efforts, such as Damian Walker’s. The latter goal was trivial; OpenTafl notation is a superset of other tafl notations. The former goal was a little more difficult, and the rules notation is notably rather hard to sight-read unless you’re very familiar with it, but on balance, I think the notations people care about most—moves and games—are quite clear.

Having defined a notation and written code to parse and generate it, I was a hop, skip, and jump away from saved games. Shortly after, I moved on to replays and commentaries. Once again a first: OpenTafl is the first tool which can be used to view and edit annotations on game replays. Puzzles were another obvious addition. In 2017, I hope to release puzzles on a more or less regular basis.

Last and the opposite of least, the AI. Until the tournament revealed that J.A.R.L is on par with or better than OpenTafl, OpenTafl was the strongest tafl-playing program in existence. I’ve written lengthy posts on the AI in the past, and hope to come up with another one soon, talking about changes in v0.4.5.0b, which greatly improved OpenTafl’s play on large boards.

Finally, plans. 2017 will likely be a maintenance year for OpenTafl, since other personal projects demand my time. I may tackle some of the multiplayer features, and I’ll probably dabble in AI improvements, but 2017 will not resemble 2016 in pace of work. I hope to run a 2017 tafl tournament, especially since the engine protocol is now stable, along with OpenTafl itself. I may also explore creating a PPA for OpenTafl.

Anyway, there you have it: 2016 in review. Look for the AI post in the coming weeks.

Random Carrier Battles: what’s in the prototype, then?

Yesterday, we spoke briefly of what’s getting left out of Random Carrier Battles’ first playable prototype. Today, we’ll cover the happier side of that story: what’s in!

UI stuff
I have some informational interface tasks to take care of, to allow players to view task force members and elements of air groups. I figure to stick this on the left side of the main UI.

Some aircraft design improvements
I believe I’ll need to make some tweaks to aircraft and escort design, to specify quality of armament: the early use of the TBF Avenger was hampered by the poor quality of the Mark 13 air-launched torpedo, and I can’t capture that in the system as is. Similarly, British battlecruisers, German pocket battleships, and Yamato aren’t well-captured by the system as is. (Battlecruisers, in this framing, would be heavy cruisers with good guns; Scharnhorst would be a battleship with poor guns, and Yamato would be a battleship with good guns.) Although surface combat is out of scope for the initial prototype, I want to have enough data to do a passable job at it when I come to it.

I may also have to make radios a feature of airplane design, so that types with historically good radios can communicate better than types with historically poor radios.
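As a rough sketch of the kind of data this implies, here's one way it might look in Java. The names and quality tiers (Quality, GunBattery, AircraftType) are hypothetical, not the actual design:

```java
// Hypothetical data model: armament and radio quality as design-time
// attributes. Names and tiers are illustrative, not the game's actual design.
public class DesignSketch {
    enum Quality { POOR, AVERAGE, GOOD }

    // A ship's main battery: caliber plus a quality tier, so that a
    // battlecruiser can be modeled as a heavy cruiser with good guns.
    record GunBattery(double caliberInches, Quality quality) {}

    // An aircraft type: the Mark 13 problem becomes poor torpedo quality,
    // and radio quality can gate how well a type communicates.
    record AircraftType(String name, Quality torpedoQuality, Quality radioQuality) {}

    public static void main(String[] args) {
        // A battleship with poor guns, per the Scharnhorst framing above.
        GunBattery scharnhorst = new GunBattery(11.1, Quality.POOR);
        AircraftType tbf = new AircraftType("TBF Avenger", Quality.POOR, Quality.GOOD);
        System.out.println(scharnhorst + " / " + tbf);
    }
}
```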

Aircraft handling: repair, fueling, arming, launching, recovery
Aircraft handling is a big focus of Random Carrier Battles: more than previous games in the carriers-at-war genre, I want to get down into the weeds. I want to track aircraft status to a fine-grained level of detail, down to how far along arming and fueling have progressed, or how warmed-up the engine is. On deck, I don’t think I plan to track exactly where planes are spotted, but I may do some tracking of takeoff run available—this would penalize light aircraft carriers with large air wings by preventing them from launching everything in one go, which is, in my view, a feature.

In terms of discrete development tasks, I’ll have to figure out how to turn a designed air group into an air group instance in the game world, build systems to hold air operations status and control transitions between air operations states, and build UI to control it all.

This feature will also lay the groundwork for land-based airfields, as well as seaplane tenders and seaplane-carrying cruisers.
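For the state-transition piece mentioned above, here's a minimal sketch assuming a simple enum-based machine. The states and legal transitions are my guesses for illustration, not a final design:

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical air-operations state machine: each aircraft instance holds
// a state, and only certain transitions are legal. Fine-grained progress
// (how far along fueling or arming is) would hang off the current state.
public class AirOpsSketch {
    enum AirOpsState { STRUCK_BELOW, REPAIRING, FUELING, ARMING,
                       WARMING_UP, LAUNCHING, AIRBORNE, RECOVERING }

    private static final Map<AirOpsState, Set<AirOpsState>> LEGAL =
            new EnumMap<>(AirOpsState.class);
    static {
        LEGAL.put(AirOpsState.STRUCK_BELOW, EnumSet.of(AirOpsState.REPAIRING, AirOpsState.FUELING));
        LEGAL.put(AirOpsState.REPAIRING, EnumSet.of(AirOpsState.STRUCK_BELOW));
        LEGAL.put(AirOpsState.FUELING, EnumSet.of(AirOpsState.ARMING, AirOpsState.STRUCK_BELOW));
        LEGAL.put(AirOpsState.ARMING, EnumSet.of(AirOpsState.WARMING_UP, AirOpsState.STRUCK_BELOW));
        LEGAL.put(AirOpsState.WARMING_UP, EnumSet.of(AirOpsState.LAUNCHING));
        LEGAL.put(AirOpsState.LAUNCHING, EnumSet.of(AirOpsState.AIRBORNE));
        LEGAL.put(AirOpsState.AIRBORNE, EnumSet.of(AirOpsState.RECOVERING));
        LEGAL.put(AirOpsState.RECOVERING, EnumSet.of(AirOpsState.STRUCK_BELOW, AirOpsState.FUELING));
    }

    static boolean canTransition(AirOpsState from, AirOpsState to) {
        return LEGAL.getOrDefault(from, EnumSet.noneOf(AirOpsState.class)).contains(to);
    }

    public static void main(String[] args) {
        System.out.println(canTransition(AirOpsState.FUELING, AirOpsState.ARMING));  // true
        System.out.println(canTransition(AirOpsState.ARMING, AirOpsState.AIRBORNE)); // false
    }
}
```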

Air combat!
Making this one heading is perhaps a bit ambitious on my part, but there you are. Air combat has a bevy of subordinate features, including representing armaments (to give damage) and ship and aircraft systems (to take damage), a planner for missions, and unit combat behavior AI.

Systems and armaments are the easiest of the bunch; they merely involve defining a set of systems for each class of asset, along with a set of armaments generated from the asset’s statistics and arming status.

The mission planner is a complicated feature, and one which I hope will be industry-leading: a central clearinghouse where admirals can view all missions currently planned or in progress, create new missions, cancel unlaunched missions, and eventually, handle every air operation in the task force. For now, it may fall to players to prepare the aircraft assigned to missions on their own initiative, depending on how the aircraft handling features shake out.
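In code, the clearinghouse part of the planner might come down to something as simple as this sketch. Class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mission planner: a central registry of missions the player
// can inspect, add to, and cancel before launch.
public class MissionPlannerSketch {
    enum MissionStatus { PLANNED, PREPARING, LAUNCHED, COMPLETE, CANCELLED }

    static class Mission {
        final String objective;  // e.g. "strike enemy carrier group"
        MissionStatus status = MissionStatus.PLANNED;
        Mission(String objective) { this.objective = objective; }
    }

    private final List<Mission> missions = new ArrayList<>();

    List<Mission> allMissions() { return List.copyOf(missions); }

    Mission plan(String objective) {
        Mission m = new Mission(objective);
        missions.add(m);
        return m;
    }

    // Only unlaunched missions can be cancelled, per the design above.
    boolean cancel(Mission m) {
        if (m.status == MissionStatus.PLANNED || m.status == MissionStatus.PREPARING) {
            m.status = MissionStatus.CANCELLED;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        MissionPlannerSketch planner = new MissionPlannerSketch();
        Mission strike = planner.plan("strike enemy carrier group");
        System.out.println(planner.cancel(strike));  // true: not yet launched
    }
}
```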

Finally, combat behavior AI: this is by far the biggest feature under this heading, and the hardest to handle. It includes automatic marshaling of air groups (players won’t have direct control over aircraft in flight), CAP behavior, scout plane behavior, strike planes’ flights to their targets, and attack behavior for dive bombers and torpedo bombers. Ships will also have to maneuver under direct attack (that is, to avoid incoming torpedoes, and to throw off dive bombers’ aim).

Initial spotting and scouting
Spotting and scouting in their fullness will require a lot of work, so I’m going to build a simpler system to start with. Simply put, you can see everything on your side, and anything within horizon range of your ships and planes.

Submarines will come later.
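Here's a minimal sketch of that simplified visibility rule, using the standard geometric approximation that an observer h meters up sees a horizon about 3.57 √h kilometers away:

```java
// A minimal sketch of the simplified spotting rule: a contact is visible
// if it lies within geometric horizon range of any friendly ship or plane.
// Uses the standard approximation d = 3.57 * sqrt(h) km for the distance
// to the horizon from height h in meters.
public class SpottingSketch {
    static double horizonKm(double observerHeightM, double targetHeightM) {
        // Observer's horizon distance plus the target's: the farthest
        // range at which the target's top is geometrically visible.
        return 3.57 * (Math.sqrt(observerHeightM) + Math.sqrt(targetHeightM));
    }

    static boolean canSee(double observerHeightM, double targetHeightM, double rangeKm) {
        return rangeKm <= horizonKm(observerHeightM, targetHeightM);
    }

    public static void main(String[] args) {
        // A masthead lookout at 30 m spotting a carrier's island at 25 m:
        System.out.printf("%.1f km%n", horizonKm(30, 25));    // ~37 km
        // A scout plane at 2000 m spotting the same carrier:
        System.out.printf("%.1f km%n", horizonKm(2000, 25));  // ~178 km
    }
}
```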

That’s that! I hope you find these plans as exciting as I do. I hope to get the demo to a state where I can take some usable screenshots and videos and submit to Steam Greenlight, at which point I’ll be hitting you up for upvotes.

Random Carrier Battles: the road to a playable prototype

Good afternoon, and happy Thanksgiving! While sitting here watching the turkey and the giblet broth, I had some time to work out a little roadmap for taking Random Carrier Battles from its current state, barely above a proof of concept that the Godot engine is suitable for this purpose, to a playable prototype (if one that doesn't capture my full vision).

So, to get the ugly out of the way first, let’s talk about what I’m leaving (for now) on the cutting room floor.

Wind and weather
Though they are crucial parts of carrier aviation, wind and weather are incredibly complicated, and I want to do them right the first time, rather than hacking something together now. With modern processors and multi-threading, I can push weather simulation into the background and only update every few in-game minutes, which leaves me lots of time to try interesting simulation techniques. 'Interesting', of course, is a synonym for 'hard', and so I won't be exploring these yet.

Land-based air
It may turn out that the mechanics of land-based air—launching and recovery—are a freebie based on doing carrier-based air. If they aren't, though, I'll tackle them later, along with design for land-based types like multi-engine bombers and flying boats.

Full visibility and spotting system
My plan for Random Carrier Battles is to attempt to capture just how blind carrier admirals were a lot of the time. Enemy positions will only be known by spotting reports, and allied air positions will only be known with full precision when they can be seen from friendly task forces. All of that will require a detailed system for spotting and visibility, and a system for displaying and archiving spotting reports. It’s less straightforward than it sounds, since the AI (when that arrives) will need access to that information for its fleet. Speaking of…

Artificial intelligence
I may provide some sort of rudimentary AI, but I may also leave it more or less entirely to scripting, or give the computer perfect knowledge. Don’t expect anything amazing, at any rate.

So, what does that leave to do? Nothing less than the core of the game. Come back tomorrow or Saturday for details!

Random Carrier Battles: kinematics and scale

I spent some time the other day playing the old-school DOS version of the current state of the art in carrier air warfare simulations, SSG's 1992 classic, appropriately titled Carriers at War. As far as DOS-era wargames go, it's pretty good—it doesn't bother you with too many details, and it (largely) lets you focus on the grander strategy. I really blew the Battle of Midway as the Americans, though.

So, let’s talk about a way in which I hope to improve on the old classic: movement. Carriers at War plays out on a 20-mile hex grid; Random Carrier Battles currently tracks positions down to 10 meters; rather than a five-minute time step, I use a six-second timestep (organized into ten steps per one-minute turn) for movement and combat. This lets me do all sorts of fun things which 1992’s processing power did not allow, which I’ll get to shortly. It also causes me a great deal of trouble, which I’ll gripe about first.

The short version is, the kinematics are hard.

The slightly longer version is, there's a lot of math involved in working out just how game entities ought to move. Warships aren't much of a problem, because it turns out that warship maneuvering is pretty straightforward1. Aircraft, however, get a little tough. Not only do I have to consider everything I do with warships, I have to account for performance differences at altitude, as well as rates of climb and descent beyond which aircraft must either decelerate or accelerate. I don't have the design fully worked out for that yet, I'm afraid, so I can't say much more at present. Rest assured it's complicated.
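To make the problem concrete, here's a per-step sketch under placeholder assumptions: clamp the vertical rate to what the airframe can sustain, and trade airspeed for climb (and vice versa for descent). None of the numbers or names are from the real design:

```java
// A sketch of the aircraft kinematics problem: per six-second step, clamp
// the commanded climb/descent rate, and couple vertical rate to airspeed.
// All fields and coefficients are placeholders, not the real design.
public class AircraftKinematicsSketch {
    static final double STEP_SECONDS = 6.0;

    double airspeedMps;       // true airspeed, m/s
    double altitudeM;         // altitude, meters
    double maxClimbMps = 8;   // steeper sustained climbs bleed airspeed away
    double maxDiveMps = 25;   // steeper dives pile airspeed on

    void step(double commandedClimbMps) {
        // Clamp the vertical rate to what the airframe can sustain.
        double climb = Math.max(-maxDiveMps, Math.min(maxClimbMps, commandedClimbMps));
        altitudeM += climb * STEP_SECONDS;
        // Crude energy trade: climbing decelerates, diving accelerates.
        airspeedMps -= 0.1 * climb * STEP_SECONDS;
    }

    public static void main(String[] args) {
        AircraftKinematicsSketch plane = new AircraftKinematicsSketch();
        plane.airspeedMps = 70;
        plane.altitudeM = 1000;
        plane.step(10);  // commanded 10 m/s climb, clamped to 8
        System.out.println(plane.altitudeM + " m, " + plane.airspeedMps + " m/s");
    }
}
```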

So, what does that enhanced positional and temporal resolution buy me above Carriers at War?

Better simulation of strike range
This is the biggest win, in my opinion. With such a high temporal and positional resolution, I can simulate fuel consumption to a much greater level of accuracy. As such, I don't need to limit myself to Carriers at War's fixed strike ranges2. The TBD, for instance, gets a with-torpedo range of 90 miles. I've seen other figures give a combat radius of 150 miles, and still others give a range (not radius) of 435 miles with a torpedo. By tracking fuel, I can, to some degree, ignore the trickier combat radius figures2, and simply grab a plausible cruise range figure. If I mix in some reasonable modifiers for speed, altitude, weight, climbing and descending, and maneuvering, I suddenly have a system which doesn't need to work with combat radius at all. Players can launch strikes on targets beyond nominal range if they want to; they just need to know that they'll either lose planes to fuel exhaustion, or have to follow the strike with their carriers.
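Per-step fuel tracking might then look something like this sketch, with a baseline cruise burn rate scaled by multipliers for speed, altitude, weight, and maneuvering. The names and coefficients are illustrative only:

```java
// A sketch of fuel tracking per six-second step, assuming a baseline cruise
// burn rate scaled by situational multipliers. Illustrative, not the design.
public class FuelSketch {
    static final double STEP_HOURS = 6.0 / 3600.0;

    double fuelKg;
    double cruiseBurnKgPerHour;

    // Each modifier is 1.0 at cruise; high throttle, unfavorable altitude,
    // a heavy torpedo load, or hard maneuvering push it above 1.0.
    void step(double speedMod, double altitudeMod, double weightMod, double maneuverMod) {
        double burn = cruiseBurnKgPerHour * speedMod * altitudeMod * weightMod * maneuverMod;
        fuelKg = Math.max(0, fuelKg - burn * STEP_HOURS);
    }

    boolean fuelExhausted() { return fuelKg <= 0; }

    public static void main(String[] args) {
        FuelSketch tbd = new FuelSketch();
        tbd.fuelKg = 500;
        tbd.cruiseBurnKgPerHour = 80;
        tbd.step(1.3, 1.0, 1.1, 1.0);  // fast, low, torpedo aboard
        System.out.printf("%.2f kg remaining%n", tbd.fuelKg);
    }
}
```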

Realistic combat behavior
The level of detail in kinematics, and the short time step, lets me make emergent some behaviors which might otherwise be the result of dice rolls. For instance, are Devastators running in on your carriers? Turn away from them, and the slothful American torpedo bombers will have to chase you, running their fuel down and exposing them to the depredations of your CAP and your escorts’ AA. Dive bombers rolling in on you? Throw the helm hard over to throw off their aim.

Many of these behaviors can be made to happen automatically: ships under dive bomb attack will make evasive turns on their own, for one. I haven't yet decided which behaviors will end up being automatic and which will be tactics set up by the player, but my aim is to automate the low-hanging fruit for the player.

A notable exception to the above model is air combat: my current expectation is that the six-second combat step will prove too large for air combat (and relatedly, that emergent air combat behaviors will prove very complicated to code), and that the best way to handle it will be to put planes into a furball object inside which combat is handled in an abstract manner.
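Here's a minimal sketch of that furball idea, with a stand-in attrition rule; nothing here reflects a real air combat model:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// A sketch of the 'furball' idea: planes that enter air combat leave normal
// kinematics and are resolved abstractly each combat step. The attrition
// rule below is a stand-in, not a real model.
public class FurballSketch {
    static class Plane {
        final String type;
        boolean alive = true;
        Plane(String type) { this.type = type; }
    }

    private final List<Plane> fighters = new ArrayList<>();
    private final List<Plane> targets = new ArrayList<>();

    void join(Plane p, boolean asFighter) {
        (asFighter ? fighters : targets).add(p);
    }

    // Called once per six-second combat step: abstract attrition replaces
    // plane-by-plane maneuvering inside the furball.
    void resolveStep(Random rng) {
        for (Plane f : fighters) {
            if (f.alive && !targets.isEmpty() && rng.nextDouble() < 0.05) {
                targets.get(rng.nextInt(targets.size())).alive = false;
            }
        }
    }

    public static void main(String[] args) {
        FurballSketch furball = new FurballSketch();
        furball.join(new Plane("F4F Wildcat"), true);
        furball.join(new Plane("B5N Kate"), false);
        furball.resolveStep(new Random(42));
    }
}
```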

Exploration of untried formation options
Giving the player relatively detailed control over formations, and keeping track of positions in similar detail, allows players to try some unusual tactical ideas. For example, the Japanese had no shipboard radar until fairly late in the war. What if, in some hypothetical battle, they detached some escorts from the main task force to form a search line a few miles toward the threat? Perhaps they could better direct their CAP to meet incoming threats.

That’s only one example. Undoubtedly there are others which haven’t occurred to me yet.

Those, at least, are a selection of the benefits of an approach focused on direct simulation, as opposed to a more traditional hex-and-counter design. We'll see how they turn out.

  1. At least to the fidelity I plan to simulate. There are lots of fascinating behaviors when you introduce multiple screws into the mix, but given that Random Carrier Battles is still, at its essence, a game of task forces, I don’t intend to allow players to give orders that detailed.
  2. The reason they’re so fiddly is that nobody ever talks about their assumptions: what load, exactly, constitutes a combat load? Is range deducted for reserve fuel and the time spent forming up? Are allowances made for maneuvering over the target? These are three of many questions left unacknowledged by most authors of military references.