Spatial Audio

What is Spatial Audio?

Spatial audio is audio that gives a sense of space beyond conventional stereo by placing sounds in a 3D space around the listener. Its main focus is directionality: it lets a listener tell which direction a specific sound is coming from. This can be used to cue a player as to the direction of an enemy in a video game, or to create a more immersive music listening experience with instruments placed around you.

There are three main speaker configurations that most people have access to, all commonly found in consumer electronics.

Mono

Mono is a single channel of audio:

  • Phones

  • Portable Bluetooth Speakers

Stereo

Stereo has two channels, one for each ear:

  • Home Stereos

  • Headphones

5.1 & 7.1

5.1 & 7.1 have rear and surround speakers to increase immersion:

  • Home Theatre

  • Cinema

What are ambisonics?

Ambisonic audio is a surround format similar to 5.1 and 7.1 in its function, but very different in execution. In common surround sound setups, the sound is split into different channels and each channel is sent directly to one speaker. In an Ambisonic system, the sound is instead given a position in 3D space. The system then decodes this position and assigns a specific level to each speaker so that the sound is reproduced at that point in the 3D space.
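As an illustration of how a position can be turned into channel levels, here is a minimal Python sketch of first-order encoding. It is not the code of any particular plugin. The function name, the angle convention (azimuth measured anticlockwise from straight ahead, elevation upwards from the horizontal) and the choice to keep W at unity gain are assumptions made for this example.

```python
import math

def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into first-order B-Format (W, X, Y, Z).

    Assumed convention: azimuth is anticlockwise from straight ahead
    (so +90 degrees is to the left), elevation is upwards from the
    horizontal, and W carries the signal at unity gain.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omnidirectional component
    x = sample * math.cos(az) * math.cos(el)  # front/back component
    y = sample * math.sin(az) * math.cos(el)  # left/right component
    z = sample * math.sin(el)                 # up/down component
    return w, x, y, z

# A source straight ahead only excites W and X;
# a source directly to the left only excites W and Y.
front = encode_first_order(1.0, 0, 0)
left = encode_first_order(1.0, 90, 0)
```

The four values returned are the four channels of audio referred to throughout this worksheet; moving the source only changes the balance between them.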

This allows an Ambisonic system to become 'speaker-agnostic', meaning the system isn’t designed to work with any specific number of speakers and thus can grow with one's budget. It is recommended that at a minimum 4 speakers are used for horizontal localisation and 2 speakers are used for height.
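To make the 'speaker-agnostic' idea concrete, the sketch below computes a gain for each speaker in a horizontal ring of any size. It assumes a very simple decoder in which each speaker is fed by a virtual cardioid microphone pointing towards it; real decoders are more refined, and the function name and speaker layouts here are hypothetical.

```python
import math

def speaker_gains(source_azimuth_deg, speaker_azimuths_deg):
    """Return one gain per speaker for a horizontal source.

    Illustrative decoder only: each speaker signal is a virtual cardioid
    0.5 * (W + X*cos(theta) + Y*sin(theta)) aimed at that speaker.
    """
    src = math.radians(source_azimuth_deg)
    # First-order components of a unit-amplitude horizontal source.
    w, x, y = 1.0, math.cos(src), math.sin(src)
    gains = []
    for spk_deg in speaker_azimuths_deg:
        spk = math.radians(spk_deg)
        gains.append(0.5 * (w + x * math.cos(spk) + y * math.sin(spk)))
    return gains

# The same encoded source decoded to two different (hypothetical) rigs.
square = speaker_gains(45, [45, 135, 225, 315])
hexagon = speaker_gains(45, [0, 60, 120, 180, 240, 300])
```

The encoded signal is identical in both cases; only the list of speaker angles changes, which is why the same material can be played back on a larger rig later.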

Many more speakers can be used for Ambisonic playback, such as the rig of 50 loudspeakers in the image (left), but at least 4 channels of audio are required to record audio in Ambisonic format. The two formats used in this worksheet are Furse-Malham (WXYZ) and ACN (WYZX). These names only describe the order in which the channels are arranged, which is important when working with tools from different developers.

The University of York’s AudioLab listening setup, showing 50 speakers arranged in a dome configuration to allow fine-grained movement of sound around the listener.


https://www.researchgate.net/figure/50-channel-spherical-loudspeaker-array-at-the-AudioLab-University-of-York_fig3_334103540

Neumann KU 100

What is Binaural Audio?

Just as an Ambisonic system relies on coordinates to deliver a signal across all three dimensions, binaural audio does the same for portable listening devices such as headphones or earphones. It uses a set of filters obtained from Head Related Transfer Functions (HRTFs). These take into account the Interaural Time Difference (ITD), where a signal reaches one ear before the other, and the Interaural Level Difference (ILD), where a sound is quieter in the ear it has to travel further to reach. The resulting filters mimic the spatial cues we hear in everyday listening.
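To give a feel for the size of the time difference between the ears, the sketch below uses Woodworth's spherical-head approximation, ITD = (r / c) * (theta + sin(theta)). This formula is swapped in purely for illustration and is not how measured HRTFs are produced. The head radius of 8.75 cm and the speed of sound of 343 m/s are assumed typical values.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate the interaural time difference in seconds.

    Assumes a rigid spherical head and a distant source in the horizontal
    plane, with azimuth between 0 (straight ahead) and 90 degrees (fully
    to one side).
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# Time differences for a source ahead, at 45 degrees, and fully to the side.
for angle in (0, 45, 90):
    print(f"{angle:>2} degrees: {woodworth_itd(angle) * 1000:.2f} ms")
```

Under these assumptions the difference is zero for a source straight ahead and well under a millisecond for a source fully to one side, which is why the filters need to be precise.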

Listen to Binaural Examples:

  • In this first audio clip you can hear the sound of a seed shaker mixed with the relevant HRTFs to make it sound as if it’s to your right and slightly above you.

  • In the second audio clip, the seed shaker is mixed with different HRTFs so that it sounds like it’s to your left and slightly below your shoulder.

  • Here, we have changed the HRTFs during the sound so that the seed shaker seems to move from top right to bottom left around the back of your head. This illustrates how binaural audio can be used to place sounds all around the listener.

Creating an immersive listening experience requires the use of both an Ambisonic encoder and a binaural decoder to produce a binaural spatial sound mix that can be listened to using headphones.

The DAW we will be using is Reaper, as it is inexpensive and easy to learn. It also allows up to 64 channels of audio per track, which is useful for Ambisonics.

Objectives

  • Explain the structure of a Spatial Audio workflow

  • Evaluate different spatial audio formats and their applications

  • Apply a B-Format encoder to a non-spatial signal

  • Utilise spatial audio techniques to enhance musical material

Requirements

Reaper: https://www.reaper.fm/

Ambix Plugins + Preset Pack:

MCFX Plugins+ Preset Pack:

Sample Pack:

Step 1: Setting up channels

  • Open Reaper

  • Create three new tracks (Track -> Insert New Track) and name them

    1. Ambisonic Decoder

    2. B-Format Reverb

    3. B-Format Track

  • Open the 'routing options' by clicking on the 'route' button to the right of the channel fader in the mixer window. All tracks should have 4 ‘track channels’. Deselect the 'Master send' tick box for the B-Format Track and B-Format Reverb.

We will now route the 4 channels of the ‘B-Format Reverb’ and the ‘B-Format Track’ to the Ambisonic Decoder.

  • Go to the routing of the ‘B-Format Reverb’, click 'Add new send' and select the 'Ambisonic Decoder'.

  • Now choose 'Audio 1/2' and make sure it is sending 'Multichannel source -> 4 Channels -> 1-4'.

  • Do the same for the ‘B-Format Track’, but also add a second send to the ‘B-Format Reverb’ track with the same ‘Multichannel source’ setting.

Mental Checkpoint 1: Write down why you think we are sending 4 separate channels to the Ambisonic Decoder channel.

Step 2: Adding Effects

We will now add the Binaural Decoder to the Ambisonic Decoder track.

  • Select the Ambisonic Decoder track in your mixer and press ‘Shift+F’ to bring up the ‘Add FX’ window.

  • Search for 'ambix_binaural_o1' and choose that.

This plugin takes an Ambisonic signal encoded in B-Format and outputs a binaural signal (2 channels corresponding to Left and Right), allowing the use of headphones.

  • When asked for a preset choose 'open -> KEMAR -> h1_v1_Octahedron_MaxRe'.

  • Now go to the 'B-Format Track' and load an 'ambix_encoder_o1'. This allows you to move that track’s audio around in 3D space.

  • Load a sample from the sample pack onto the 'B-Format Track' (drag and drop from explorer) and move the yellow ball around while listening on headphones. Write down on the activity sheet what you hear when you move the yellow ball.

  • Use the ‘azimuth move’ and ‘elevation move’ parameters to automate the movement of the ball.

Mental Checkpoint 2: Write down what you think the ‘o1’ at the end of ‘ambix_encoder_o1’ refers to in the context of Ambisonic formats.

We are now going to add a B-Format Reverb.

  • Go to the B-Format Reverb track and add an ‘mcfx_convolver4’ by selecting the track and using ‘View -> FX browser’ or ‘Shift+F’.

A small alteration is required for the reverb channels to be ordered correctly. The B-Format provided is organised in Furse-Malham order, whereas the binaural decoder works in ACN order. These are two different ways of ordering the first-order Ambisonic channels.

  • Add the 'ambix_converter_o1' after the convolver in the FX chain by adding the fx and dragging the plugin below the ‘mcfx_convolver4’.

  • Set both inputs to 'Furse-Malham' before continuing.

  • Load a preset on the ‘mcfx_convolver4’ and move the ball in the ‘ambix_encoder_o1’ on the B-Format Track. Write down what you can hear.
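The channel re-ordering carried out by the converter in this step can be sketched in a few lines of Python. This is only an illustration of the idea, not the plugin's code, and the function name is hypothetical. Note that, beyond the ordering described in this worksheet, Furse-Malham also stores the W channel about 3 dB lower than the SN3D scaling normally paired with ACN, so the sketch corrects that as well.

```python
import math

def fuma_to_acn_sn3d(frame):
    """Convert one first-order frame from Furse-Malham to ACN/SN3D.

    frame:   (W, X, Y, Z) in Furse-Malham order.
    returns: (W, Y, Z, X) in ACN order.
    """
    w, x, y, z = frame
    # Furse-Malham scales W by 1/sqrt(2); undo that so W is at full level.
    # At first order, X, Y and Z need no gain change, only re-ordering.
    return (w * math.sqrt(2.0), y, z, x)

# Example frame with easily recognisable values in each channel.
converted = fuma_to_acn_sn3d((1.0 / math.sqrt(2.0), 0.1, 0.2, 0.3))
```

If the channels were left in Furse-Malham order, the decoder would treat the front/back channel as left/right and so on, and the reverb would appear to come from the wrong directions.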

Step 3: Outputting a Binaural Mix

The image shows what should be the final layout for one track of audio.

  • Choose 'File -> Render', pick a file name and set the directory to the desktop, then click 'Render 1 file'.

Because the Binaural Decoder is in the signal chain, the file you have just rendered is a Binaural Mix.

  • Disable the Binaural Decoder by shift+clicking it. Choose ‘File -> Render’ again, set your channel count to 4 in the 'Options -> Channels' setting in the 'Render to file' window, and render a second file. With the decoder disabled, this file contains the 4-channel B-Format signal rather than a binaural mix.

  • Listen to the file from the sample pack that you used. Now listen to your Binaural version. Write down how the non-spatial loop sounds in comparison to the spatialised loop.

  • Extension: You can now duplicate the 'B-Format Track' to add more instruments to your music. Try changing the parameters of each encoder to place each source in a different position.

Mental Checkpoint 3: Write 3 sources that would work well with spatial audio and why.