The Crossbox Studio: multiple mic podcast recording for $60 per person

If you’re a Crossbox Podcast listener, you may have noticed that we sound pretty good. Now, granted, our1 diction is poor, and we’re still figuring out the whole hosting thing. Our voices, however, come through loud and clear, with minimal noise. While we’re recording, we monitor our audio in real time. Some people will tell you quality podcast recording with features like that takes a big investment.

They’re wrong.

The Crossbox Studio is proof. We connect two USB microphones to one computer, then mix them together in post production for maximum quality and control.

In this article, I’ll show you how you can build our recording setup, starting with microphones and accessories, and moving on to software. Let’s dive in.

Hardware

We’ll start with microphones. For high-quality recording, each host has to have a separate microphone. This is a huge help both during recording and editing; being able to edit each speaker individually provides a great deal more flexibility to whoever gets stuck with the task of editing2.

For The Crossbox Podcast, we use one Blue Snowball—too pricey to hit our goal—and one CAD Audio U37. As studio-style condenser microphones go, the U37 is extremely cheap. It comes in at a hair over $39, and the sound quality and sensitivity are superb. I recommend it wholeheartedly.

Next, we need to mount the microphones in such a way as to minimize the transmission of vibrations to the microphone. This means that the microphone won’t capture the sounds typing on a laptop keyboard or touching the table. First off, we’ll need a microphone boom. This one clamps to the table. You don’t need anything fancier3. To hold the microphone, you’ll want a shock mount. Shock mounts suspend the microphone in a web of elastic cord, which isolates it from vibration.

If your environment is poorly acoustically controlled (that is, if you get other sounds leaking in, or if you have a noisy furnace, say), you ought to look into dynamic microphones. (The Crossbox may switch in the future.) These Behringer units are well-reviewed. If you get XLR microphones like these, you’ll also need XLR-to-USB converters.

Lastly, you’ll need a pop filter. Clamping onto the spring arm, the pop filter prevents your plosives and sibilants4 from coming through overly loud.

Let’s put it all together. Clamp the boom arm to the table. Attach the shock mount to the threaded end. Expand the shock mount by squeezing the arms together, and put the microphone in the middle. Clamp the pop filter onto the boom arm, and move it so that it’s between you and the microphone.

Congratulations! You’ve completed the hardware setup. Now, let’s talk recording.

Software

Moving on, we’re going to follow a procedure I laid out in an earlier article. Using two USB microphones at once brings some added complexity to the table. If you want to read about why this is so, hit the link above for a deeper discussion. Here, we’re going to keep it simple and just talk about the solution.

First off, you’re going to need a decently quick laptop5. Memory isn’t important. What we want is raw processing power. The amount of processing power you have on tap determines how many individual microphones you can record from.

Next, you’re going to want a specialized operating system6. Go get the appropriately-named AV Linux. This article is written targeting AV Linux 2016.8.30. Later versions change the default audio setup, which may cause problems. Create a bootable USB stick containing the image—here’s a good how-to. Boot it and install it. If you don’t plan on using AV Linux for everyday tasks (I don’t), install it in a small partition. (As little as 20 gigabytes will do, with room to spare.) Later on, when recording, you can choose a directory for temporary files, which can be in your everyday partition7.

Let’s move on. Now we’re to the point where we can talk about recording tools. The Crossbox Podcast uses two separate tools in our process. First, we route our microphone inputs through Ardour. Ardour, a digital audio workstation program, is powerful enough to do the entire process on its own. That said, we only use it for plugins, and as a convenient way to adjust our microphone levels relative to one another. We then route the audio from Ardour to Audacity, which we use to record, make final adjustments, and add sound effects.

Setting up audio routing: JACK

Time for a quick refresher on audio in AV Linux. It starts with ALSA, the Linux hardware audio driver. AV Linux, along with many other audio-focused Linux distributions, uses JACK as its sound server. JACK focuses on low latency above all else, and AV Linux ships with a real-time kernel8 to help it along. The upshot is that two ALSA devices, like our USB microphones, can be connected to our computer, using JACK plugins to resample their input using the same clock to guarantee that they don’t go out of sync.

We’ll touch on how to set up and manage JACK later. For now, let’s briefly discuss the overall audio routing setup, in terms of the path the audio takes from the microphone to your hard drive.

First, we’re going to use some JACK utilities to set up JACK devices for each of our microphones. We’ll run audio from those JACK devices through Ardour for mixing, plugins, and volume control. Next, we’ll make a dummy JACK device which takes audio from Ardour and sends it through the ALSA loopback device on the input side. Finally, we’ll use Audacity to record audio from the ALSA loopback device output.

Setting up audio routing: microphone in

We’ll need a few scripts. (Or at least, we’ll want them to make our setup much more convenient.) Before that, we’ll need some information. First off, run the arecord -l command. You should see output sort of like this:

**** List of CAPTURE Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: ALC295 Analog [ALC295 Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

This tells me that my laptop currently has one recording device plugged in: card 0, device 0, the built-in microphone. With your USB microphones plugged in, you should see more lines starting with card and a number. For the example above, the address is hw:0,0; the first number is the card number, and the second is the device number.

For each microphone, create a file on your desktop and call it microphone<#>.sh, filling in some number for <#>9. In this file, paste the following script.

#!/bin/bash
alsa_in -j name -d hw:1 -c 1 -p 512 &
echo $! > ~/.name.pid

The first line tells Linux to execute the script with the bash shell.

The second line starts a JACK client based on an ALSA device. -j name gives the JACK device a human-readable name. (Use something memorable.) -d hw:1 tells JACK to create the JACK device based on the ALSA device hw:1. Fill in the appropriate device number. -c 1 tells JACK this is a mono device. Use -c 2 for stereo, if you have a stereo mic10. -p 512 controls buffer size for the microphone. 512 is a safe option. Don’t mess with it unless you know what you’re doing. The ampersand tells Linux to run the above program in the background.

The third line records the process ID for the microphone, so we can kill it later if need be. Change name.pid to use the name you used for -j name.

Setting up audio routing: final mix

Onward to the mix. If you look at the output to the aplay -l or arecord -l commands, you should see the ALSA Loopback devices.

card 0: Loopback [Loopback], device 0: Loopback PCM [Loopback PCM]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7
card 0: Loopback [Loopback], device 1: Loopback PCM [Loopback PCM]
  Subdevices: 8/8
  Subdevice #0: subdevice #0
  Subdevice #1: subdevice #1
  Subdevice #2: subdevice #2
  Subdevice #3: subdevice #3
  Subdevice #4: subdevice #4
  Subdevice #5: subdevice #5
  Subdevice #6: subdevice #6
  Subdevice #7: subdevice #7

Audio played out to a subdevice of playback device hw:Loopback,1 will be available as audio input on the corresponding subdevice of recording device hw:Loopback,0. That is, playing to hw:Loopback,1,0 will result in recordable input on hw:Loopback,0,0. We take advantage of this to record our final mix to Audacity. Make a script called loopback.sh.

#!/bin/bash
alsa_out -j loop -c 3 -d hw:Loopback,1,0 &
echo $! > ~/.loop.pid

The -c 3 option in the second line determines how many channels the loopback device will have. You need one loopback channel for each microphone channel you wish to record separately. Lastly, we’ll want a script to stop all of our audio devices. Make a new script called stopdevices.sh.

kill `cat ~/.name.pid`
kill `cat ~/.name.pid`
kill `cat ~/.loop.pid`

Replace .name.pid with the filenames from your microphone scripts. Running this script will stop the JACK ALSA clients, removing your devices.

Managing JACK with QJackCtl

By default, AVLinux starts QJackCtl at startup. It’s a little applet which will show up with the title ‘JACK Audio Connection Kit’. What you want to do is hit the Setup button to open the settings dialog, then change Frames/Period and Periods/Buffer to 256 and 2, respectively. That yields an audio latency of 10.7 milliseconds, which is close enough to real-time for podcasting work.

That’s all you need to do with QJackCtl. You should also, however, pay attention to the numbers listed, at system start, as 0 (0). Those numbers will increase if you experience buffer overruns, sometimes called xruns. These occur when JACK is unable to process audio quickly enough to keep up in real time. Try using 256/3 or even 512/2, increasing the values until you get no xruns. (A very small number may be acceptable, but note that xruns will generally be audible in audio output as skips or crackles.)

Ensure QJackCtl is running before starting Ardour. Also, connect your microphones and run your microphone scripts.

Mixing with Ardour

Ardour is a free, open-source digital audio workstation application. It is ridiculously full-featured, and could easily do everything needed for a podcast and more. Since we have an established workflow with Audacity as our final editing tool, we use Ardour as a mixing board. In the Crossbox studio, Ardour takes input from two (or more) microphones whose input arrives through JACK, evens out recording levels, and runs output to a single JACK device corresponding to the ALSA loopback device. We then record the ALSA loopback device, which has a separate channel for each microphone we’re recording11.

How do we set Ardour to do this? It turns out that it’s complicated. Start Ardour and make a new session. (Since we’re using Ardour as a mixing board rather than a recording tool, we’ll reuse this session every time we want to record something.) For each microphone, make a new track. (That’s Shift+Ctrl+N, or Tracks->Add a new track or bus.)

Once you’ve done that, select the ‘Mixer’ button on the top bar. You should see a column for each of your tracks. You can use these to adjust volumes individually; you can also apply plugins or filters to each one.

Open up the Audio Connections window (under the Window menu, or by hitting Alt-P). We’ll want to do three things here.

Connect microphones to tracks

On the left side of the Audio Connections window, select Other as the source. (All devices which use the alsa_in and alsa_out JACK devices show up in the Other tab.) On the bottom of the Audio Connections window, select Ardour Tracks as the destination.

Connect each microphone to its track by clicking on the cell where its row and column intersect. You’ll see a green dot show up. Now the microphones are connected to Ardour tracks, and we don’t need to worry about microphone hardware anymore.

Connect microphone tracks to loopback device

Select Ardour Tracks as the source and Other as the destination. Connect each microphone track to one channel of the loopback device. (If recording in stereo, each microphone track channel needs its own loopback channel. If recording in mono, connect the left and right channels from one microphone to one loopback channel.)

Audio from the microphone tracks will now be routed to the ALSA loopback device, where we can record it with Audacity.

Connect microphone tracks to Ardour monitor bus

Select Ardour Tracks as the source and Ardour Busses as the destination. Connect each microphone to the Master bus. (Whether recording in stereo or mono, connect the left channel of each track to the Master left channel, and the right channel of each track to the Master right channel.)

By default, Ardour connects the Master bus to the system audio output. When you connect your microphone tracks to the Master bus, you should be able to hear yourself in headphones attached to your headphone jack. If you’re connecting more than two sets of headphones, you may need to get yourself an amplifier. This one seems nice enough. If you don’t have 1/4-inch headphones, you can use these converters.

Recording with Audacity

One more piece to the puzzle. Open Audacity. Select ALSA as the sound framework. Select the Loopback: PCM(hw:0,0) device. When recording, audio from one microphone should show up in each Audacity channel.

Adjusting hardware volumes

In AVLinux, you can use the applications Volti or Audio Mixer to provide a GUI to adjust hardware volumes. Volti is a tray volume control; right-click on it to get a view of the mixer. In either tool, to adjust the input volume of a microphone, select it (either in the dropdown or the tab bar) and adjust its mic level. To adjust the monitor output volume, adjust the output volume for your built-in soundcard. To adjust the recording output volume, adjust the volumes for the Loopback device.

Podcast recording shopping list

And that’s that. You now have all the information you need to replicate our studio setup. Please feel free to leave questions in the comments; I’m not much good at this sort of thing, but I may be able to point you to someone who can help you out. Below, I’ve included a shopping list for your perusal.

Buy one

Per person (non-microphone)

Per person (condensers)

Per person (XLR dynamic mics)

XLR connections are the industry standard for microphones. If you’re planning to expand to a true mixing board, you’re probably best off getting XLR mics so you don’t have to buy new ones when you make the switch. On the other hand, you’ll need an XLR-to-USB interface for each microphone to connect it to your computer, which pushes the price up somewhat.

Per person (USB dynamic mics)

If, like the Crossbox, you’re unlikely ever to proceed past two hosts with USB microphones, you should look into USB dynamic microphones. Like the USB condenser microphones above, they plug directly into a computer, doing the digitization internally. They are, however, less future-proof.

Cost breakdown

  • USB dynamic microphone: $30
  • Shock mount: $10
  • Mic boom: $9
  • Pop filter: $8
  • Total: $57

  1. Okay, my. 
  2. That’s me. 
  3. We, however, clamp our mic booms to spare chairs at our broadcast table. This means we can bump the table without jostling the mount, which makes for much cleaner recordings given our typical amount of movement. 
  4. P, B, T, S, Z, etc. 
  5. I realize this pushes the price well above $70 per person, but I figure it’s reasonable to assume you probably have a laptop of acceptable specifications. 
  6. Yes, it’s possible to do low-latency monitoring and USB microphone resampling/synchronization with Windows and ASIO, or your Linux distribution of choice with a low-latency kernel, but (at least in the latter case) why on earth would you want to? 
  7. If this paragraph made no sense to you, try this how-to guide. In the partitioning step, you may have to choose your current partition and select ‘resize’, shrinking it to make a little bit of room for AV Linux. 
  8. For the uninitiated, it means that JACK is always able to take CPU time whenever it needs it with no waiting. 
  9. Or, if you like, call it something else. Makes no difference to me. 
  10. The recommended CAD U37 is a mono mic, but has stereo output. We run it with mono input. 
  11. The astute reader will note that this may impose a limit on the number of simultaneous channels you can record. That reader, being more astute than me, could probably tell you how many channels that is. I figure it’s at least eight, since ALSA supports 7.1 output. If you need more than eight, you should probably look into recording directly in Ardour. 

One thought on “The Crossbox Studio: multiple mic podcast recording for $60 per person

  1. Pingback: How-To: Two USB Mics, One Computer, JACK, and Audacity | The Soapbox

Leave a Reply