All about immersive audio and its impact on music production

For independent artists right now, there’s no hotter creative zone than immersive audio. Forget everything you know about just panning left and right; formats like Dolby Atmos and 360 Reality Audio are giving us a full 3D canvas—above, behind, and all around.

Immersive audio is the new frontier in music production that completely changes how your audience experiences your tracks, offering a level of depth and engagement that can seriously set your sound apart on major platforms.

Keep reading to learn how immersive audio developed, the main formats used in producing immersive audio, and how independent artists can choose the right method and record their own music in immersive for worldwide distribution.

What is immersive audio?

Immersive audio refers to a sound technology that creates a more realistic and three-dimensional auditory experience for listeners. It goes beyond traditional stereo or surround sound setups by using advanced audio processing and spatialization techniques to make sounds feel like they’re coming from specific directions and distances.

This creates a sense of immersion, where the listener feels surrounded by sound, much as they would in the real world. The current streaming economy has led to increased interest in immersive audio. Because sounds are placed at distinct positions around the listener, this production approach engages our spatial hearing, the ability to judge where a sound is coming from, far more than stereo does. The result is a listening experience that stereo alone cannot deliver.

The origins of immersive audio

Immersive audio is not the invention of a single individual, but rather the result of collective advancements in audio technology and research over time. The concept of creating a more immersive auditory experience dates back decades and involves contributions from various researchers, engineers, and companies.

Here’s an overview:

Early multi-channel experiments

1881: In Paris, Clément Ader demonstrates the first two-channel system, delivering live opera over telephone lines to paired receivers and pioneering stereophonic listening.
Early 1930s: British engineer Alan Blumlein patents a comprehensive stereophonic system (which he called “binaural sound”), including techniques for stereo records and film, laying the groundwork for spatial audio.
Late 1930s / 1940s: Walt Disney develops Fantasound for the film Fantasia, an early (and complex) form of multi-channel surround sound that used up to four distinct audio channels and multiple speakers in select theaters.

Surround sound and advanced formats

1957: The Vortex Concerts at the Morrison Planetarium in San Francisco, organized by Henry Jacobs and Jordan Belson, are widely considered pioneering instances of multi-speaker surround sound usage.
Mid-1970s: Ambisonics is developed in the UK. This is a full-sphere surround sound format designed to capture and reproduce a three-dimensional sound field (horizontal and vertical) using a specialized microphone array and four-channel signal (B-format).
Late 1970s / 1980s – 1990s: From the late 1970s, Dolby Stereo Surround brought affordable multi‑channel sound (L, C, R, Surround) to mainstream cinemas, and in the early 1990s Dolby Digital 5.1 extended this with discrete six‑channel audio, marking major steps toward enveloping the listener.
1993: The MP3 standard is released. Together with later digital formats like AAC (1997), it advances digital audio compression, which later enables the streaming of complex immersive audio.

Modern 3D and object-based audio

2012: Dolby Atmos is introduced in commercial cinemas, marking a major leap to object-based audio. Instead of simply assigning sounds to fixed channels (like 5.1), it treats sound as movable “objects” that can be positioned precisely in a 3D space, including the use of overhead speakers (height channels).
2015: DTS:X, a competing object-based immersive audio technology, is introduced, also offering the ability to position sounds in 3D space and adapt to different speaker setups.
Mid-2010s – Present: Binaural Audio (using Head-Related Transfer Functions or HRTFs) sees a resurgence for creating convincing 3D audio over standard headphones, driven by the growth of Virtual Reality (VR), Augmented Reality (AR), and 360-degree video platforms like YouTube and Facebook, which also adopt Ambisonics as a default immersive format.

The impact of immersive audio on music production: 4 major changes

When you produce your music knowing that your recording is destined for an immersive audio format, it opens the door to much more creativity. If you know from the outset that you are creating an immersive experience, you will not be limited by having to match your content to a stereo format further down the line. You can give your instruments space to breathe, and create an audio experience that leaves traditional recording formats far behind.

Immersive audio also allows musicians to tell stories through soundscapes. They can position elements like characters in a story, creating a dynamic and engaging auditory narrative. By giving artists the power to sculpt sonic environments in three dimensions, it fosters deeper emotional connections with listeners and changes how music is composed, produced, and experienced.

Immersive audio fundamentally changes the creative and technical approach to music production by leveraging the three-dimensional sound field.

1. Immersive audio unlocks creative potential and freedom

Production is no longer limited by the constraints of the two-channel stereo format. You can plan and produce the mix specifically for a 3D environment from the start, and the ability to place individual instruments in a full 360-degree sphere means they do not compete for space on a single horizontal plane, leading to a clearer, less cluttered mix that allows each instrument room to “breathe”.

2. Immersive audio enhances storytelling and narrative

Musicians can create intricate three-dimensional sound environments (“soundscapes”) to support a narrative, in which sound elements can be positioned as “characters in a story,” creating a dynamic, engaging, and geographically defined auditory experience.

3. Immersive audio redefines the listener experience

By enveloping the listener, the intricate, immersive sonic environment fosters a more profound emotional connection with the music. This sense of immersion and realism elevates listener engagement beyond passive listening.

4. Immersive audio redefines music production

Immersive audio is poised to revolutionize music distribution by becoming a standard format, granting artists the power to deliver their precise 3D mix to listeners. Today, even independent artists can produce their recordings in spatial audio.

With this 3D canvas in mind as the final format of their music, musicians and composers face a redefinition of the very essence of how their music is initially composed and arranged, not just how it is mixed.

More about Ambisonics

Ambisonics is a fundamentally different and more flexible approach to creating and reproducing a three-dimensional sound field compared to traditional channel-based systems.

In short, Ambisonics is designed to capture sound from a full 360-degree sphere around the listener. This includes not only horizontal directions but also vertical directions, above and below.

It employs a mathematical framework called spherical harmonics to represent the spatial characteristics of sound. By using spherical harmonics, audio engineers can encode and decode sound information to and from a format that describes the direction each sound arrives from, relative to the listener’s position at the center of the soundfield.

Ambisonics recording captures a full 3D soundfield by using a single microphone array (tetrahedral or spherical in shape) containing multiple capsules. The array records sound from all directions as “A-format,” which is then converted to “B-format” for mixing.
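To make the A-format to B-format step more concrete, here is a minimal Python sketch of the textbook conversion for a four-capsule tetrahedral microphone. The capsule names (front-left-up, front-right-down, back-left-down, back-right-up), the function name, and the 0.5 scaling are assumptions chosen for illustration. Real converters also apply capsule-specific filtering and calibration, which this sketch leaves out.

```python
def a_to_b_format(flu, frd, bld, bru):
    """Convert one sample from four tetrahedral capsules to first-order B-format.

    Assumed capsule layout (hypothetical naming):
    flu = front-left-up, frd = front-right-down,
    bld = back-left-down, bru = back-right-up.
    """
    w = 0.5 * (flu + frd + bld + bru)  # omnidirectional component
    x = 0.5 * (flu + frd - bld - bru)  # front minus back
    y = 0.5 * (flu - frd + bld - bru)  # left minus right
    z = 0.5 * (flu - frd - bld + bru)  # up minus down
    return w, x, y, z


# Signal present only on the two front capsules: W and X are non-zero.
print(a_to_b_format(1.0, 1.0, 0.0, 0.0))
```

When every capsule carries the same signal, only the W channel is non-zero, which matches the idea that such a sound has no preferred direction.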

The 4 main advantages of Ambisonics

It’s compatible with various speaker setups without compromising the spatial accuracy of the audio.
Recordings can be captured in different formats, such as first-order (four channels), second-order (nine channels), and higher-order formats. The higher the order, the more precise the spatial representation of sound. This scalability allows for compatibility with different production and playback setups.
The core technology of Ambisonics is free of patents, which has encouraged its widespread adoption and development across various software and hardware platforms.
Ambisonics recordings provide audio engineers with the flexibility to manipulate sound placement during the post-production phase. This means that sound sources can be positioned and moved within the 360-degree soundfield after the initial recording, enhancing the creative possibilities. Even newer technology makes it possible for engineers to program the positioning and placement of sound sources automatically, according to specific triggers such as pitch or instrument.
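The channel counts and the post-production flexibility described above can be sketched in a few lines of Python. This is an illustrative sketch, not any particular plugin's implementation. It assumes azimuth is measured counterclockwise from straight ahead, elevation upward from the horizon, and a W channel that carries the unscaled signal (some conventions scale W differently). All function names are hypothetical.

```python
import math


def ambisonic_channel_count(order):
    """Number of channels for a full-sphere Ambisonics signal of a given order."""
    return (order + 1) ** 2


def encode_first_order(sample, azimuth_deg, elevation_deg):
    """Place a mono sample at a direction in first-order B-format (W, X, Y, Z)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                               # omnidirectional (assumed unscaled)
    x = sample * math.cos(az) * math.cos(el)  # front-back
    y = sample * math.sin(az) * math.cos(el)  # left-right
    z = sample * math.sin(el)                 # up-down
    return w, x, y, z


def rotate_yaw(b_format, angle_deg):
    """Rotate the whole soundfield around the vertical axis after recording."""
    w, x, y, z = b_format
    a = math.radians(angle_deg)
    return (w,
            x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)


for order in (1, 2, 3):
    print(order, ambisonic_channel_count(order))

front = encode_first_order(1.0, azimuth_deg=0.0, elevation_deg=0.0)
moved_left = rotate_yaw(front, 90.0)  # the source now sits to the listener's left
```

The rotation touches only the X and Y channels, which is why repositioning an entire Ambisonics recording in post-production is cheap compared with re-recording it.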

More about Binaural

Binaural recording replicates the way sound naturally interacts with the human anatomy before reaching the eardrums, which is the key to our brain’s ability to localize sounds. This way, crucial spatial cues necessary for perceiving direction, distance, and height are accurately preserved.

In other words, binaural recording is designed to capture sound exactly as a listener’s own ears would perceive it. This is typically achieved by placing two omnidirectional microphones within the ear canals of a dummy head (or a simple baffle) to simulate the acoustic shadow and reflections caused by the head, shoulders, and outer ears (HRTF). When played back over headphones, these recordings deliver a powerful and hyper-realistic sense of space, causing sounds to appear to originate from distinct points outside of the listener’s head.

Binaural recording replicates human hearing, using two microphones spaced apart (often inside a dummy head) to mimic the distance and acoustic filtering of a human head and ears (Head-Related Transfer Function or HRTF).
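One of the spatial cues a dummy head captures is the tiny difference in arrival time between the two ears. As a rough illustration, the Python sketch below uses Woodworth's classic spherical-head approximation to estimate that delay. It is a simplification, not a full HRTF: the head radius, speed of sound, and function name are assumed values for illustration, and the formula is only applied to sources between straight ahead (0 degrees) and directly to one side (90 degrees).

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air (assumption)
HEAD_RADIUS_M = 0.0875      # commonly quoted average head radius (assumption)


def woodworth_itd(azimuth_deg, head_radius_m=HEAD_RADIUS_M,
                  speed_m_s=SPEED_OF_SOUND_M_S):
    """Estimate the interaural time difference, in seconds, for a distant source.

    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("this simple model covers azimuths from 0 to 90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_m_s) * (theta + math.sin(theta))


for angle in (0, 30, 60, 90):
    print(angle, round(woodworth_itd(angle) * 1e6), "microseconds")
```

With these assumed values, a source directly to one side reaches the far ear about 0.66 milliseconds after the near ear. Delays this small are part of what lets the brain place a sound outside the head on headphones.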

The 4 main advantages of Binaural recording

Binaural audio offers the most realistic and compelling sense of three-dimensional space when played back over headphones, better than both standard stereo and Ambisonics.
Binaural recordings are inherently mixed for and ready to play on any standard pair of stereo headphones without the need for special decoders, calibration, or specific software.
Binaural recording inherently captures the subtle acoustic effects of the recording environment as they would reach a human listener, making the final product more “natural” and believable than if you mix individual sources using a digital panner in a non-binaural format.
The barrier to entry for quality binaural recording is relatively low – pretty much anyone can do it. Dedicated binaural microphones (which fit in the ears) can be simple and affordable, plugging directly into many portable recorders or even smartphones, making it accessible for independent artists and content creators on a budget.

More about Dolby Atmos

Dolby Atmos introduces a hybrid, object-based system that fundamentally changes how audio is mixed and played back, moving beyond the limitations of both Ambisonics and traditional channel-based formats.

Instead of creating a complete sound field or relying on fixed channels, Atmos allows sound mixers to treat individual sounds (like a voice, a siren, or an instrument) as discrete “audio objects” that can be precisely placed and moved anywhere within a three-dimensional space. This approach guarantees that the creator’s artistic intent is preserved and accurately reproduced across a wide range of devices, from cinemas to home theaters and headphones.

In other words, Atmos is designed to be a highly scalable and adaptive delivery format.

To deliver its full three-dimensional experience over standard stereo headphones, Dolby Atmos utilizes a binaural algorithm to convert the object-based mix data into a two-channel signal.

The 4 main advantages of recording in Dolby Atmos

Atmos is an object-based format, which means the mix is not tied to a specific speaker layout. This guarantees the creator’s artistic intent is maintained, regardless of the listening environment.
The object-based nature allows mixers to position a sound with pinpoint accuracy anywhere in the three-dimensional space (horizontal and vertical) and to automate its movement along a precise path. Unlike Ambisonics, which captures the entire soundfield from a single point, Atmos allows for the discrete, focused placement of elements like a single vocal track or a specific sound effect, offering unparalleled creative control over the soundstage.
Atmos fundamentally integrates the vertical dimension, which enables sounds to be placed or moved above the listener, dramatically increasing the realism of effects like rain, helicopters, or soaring music.
The Atmos format is a hybrid system that combines the flexibility of audio objects with the stability of traditional channel-based audio. Mixes typically use a “bed” for background ambience and less critical sounds, alongside up to 118 dynamic audio objects for key elements. This allows engineers to use familiar channel-based mixing techniques where appropriate while leveraging the advanced spatial capabilities of objects for the most impactful elements.
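To picture the bed-plus-objects idea, here is a minimal Python sketch of how a mix might be modeled as data. It is not Dolby's file format or API. The class names, the coordinate convention (x from left to right, y from back to front, z from floor to ceiling), and the straight-line interpolation between keyframes are all assumptions made for illustration. Only the 118-object figure comes from the text above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_OBJECTS = 118  # object limit quoted in the text above


@dataclass
class Keyframe:
    time_s: float
    x: float  # -1 = left, +1 = right (assumed convention)
    y: float  # -1 = back, +1 = front (assumed convention)
    z: float  # 0 = ear level, 1 = ceiling (assumed convention)


@dataclass
class AudioObject:
    name: str
    keyframes: List[Keyframe]  # must be sorted by strictly increasing time_s

    def position_at(self, time_s: float) -> Tuple[float, float, float]:
        """Linearly interpolate the object's position between keyframes."""
        frames = self.keyframes
        if time_s <= frames[0].time_s:
            return (frames[0].x, frames[0].y, frames[0].z)
        for a, b in zip(frames, frames[1:]):
            if time_s <= b.time_s:
                t = (time_s - a.time_s) / (b.time_s - a.time_s)
                return (a.x + t * (b.x - a.x),
                        a.y + t * (b.y - a.y),
                        a.z + t * (b.z - a.z))
        return (frames[-1].x, frames[-1].y, frames[-1].z)


@dataclass
class ImmersiveMix:
    bed_tracks: List[str] = field(default_factory=list)    # channel-based ambience
    objects: List[AudioObject] = field(default_factory=list)

    def add_object(self, obj: AudioObject) -> None:
        if len(self.objects) >= MAX_OBJECTS:
            raise ValueError("object limit reached")
        self.objects.append(obj)


# A helicopter flying overhead from left to right over four seconds.
helicopter = AudioObject("helicopter", [Keyframe(0.0, -1.0, 0.0, 1.0),
                                        Keyframe(4.0, 1.0, 0.0, 1.0)])
mix = ImmersiveMix(bed_tracks=["room ambience"])
mix.add_object(helicopter)
print(helicopter.position_at(2.0))
```

Because each object stores positions rather than speaker feeds, the same data can in principle be rendered to a cinema, a soundbar, or headphones, which is the scalability the format is known for.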

More about Sony 360 Reality Audio

360 Reality Audio uses Sony’s object-based 360 Spatial Sound technology. Like its competitors, it moves past the confines of stereo channels by treating each core sound element—vocals, individual instruments (piano, bass, guitar), and even live audience ambiance—as a distinct “audio object.”

These objects are meticulously mapped and placed within a 360-degree spherical sound field that completely surrounds the listener, including positions above and below, to recreate the experience of being physically present in a recording studio or at a live concert venue. The format is designed for high scalability and can be experienced on a variety of devices, requiring only a compatible streaming service and a simple pair of headphones for the binaural rendering, though it can also be optimized for certified speakers and soundbars.

You can find Sony 360 Reality Audio content on several major music streaming platforms.

360 Reality Audio uses an object-based spatial renderer to perform binaural synthesis, applying a Head-Related Transfer Function (HRTF) to the sound objects to recreate a 360-degree spherical sound field over standard stereo headphones.

The 4 main advantages of recording using Sony 360 Reality Audio

360RA places individual sounds (vocals, instruments, effects) in a full 360-degree spherical space around the listener. This includes positioning sounds above and below you, creating a highly realistic sense of being inside the music, whether at a live concert or in the studio.
It uses object-based mixing (similar to Dolby Atmos) where each element is treated as an independent “audio object” with specific location data. This ensures the artist’s intent—the precise placement and movement of every sound—is preserved and accurately rendered on your playback device.
Music recorded using 360RA can be enjoyed with nearly any pair of standard headphones via binaural rendering. You don’t necessarily need proprietary Sony hardware to hear the 3D effect, making it highly accessible for music streamers.
When using compatible Sony headphones and the companion app, the technology allows for a Personal Optimization feature. This involves analyzing the unique shape of your ears via a smartphone camera to create a personalized Head-Related Transfer Function (HRTF), which fine-tunes the 3D sound field for your specific anatomy, making the spatial effect even more convincing and realistic.

Immersive audio case study: JOY de ROSE

In 2024, MusicTeam® distributed a 360 Reality Audio album by JOY de ROSE. Touted as a transformative and groundbreaking immersive sonic landscape, this release lives up to all the hype, according to Kevin Woods, the immersive mixing engineer and Technical Coordinator/Pro-Audio at Sony, who worked on the project.

With no restrictions related to matching the sound to a stereo format, the team could go further than ever before to create a unique and mind-blowing audio landscape. They created the track knowing that it would be released in immersive, allowing them to push the boundaries from a technical and creative perspective.

Gone are the days of working in stereo (for 2 speakers) and then adjusting for immersive (16 speakers). Once the immersive mix was completed and mastered, they decided to keep the fold-downs rather than the initial stereo mixes.

How to create your own immersive sound recordings

Implementing immersive audio requires equipment and software, and often involves a learning curve for producers accustomed to traditional methods. As an artist, you can use Apple’s Logic Pro (version 10.7 or later), which integrates Dolby Atmos. To work in the 360 Reality Audio format, you can use Sony’s plugin software, 360 WalkMix Creator™. As long as you have headphones and a laptop, you can create immersive music.

Also, a company called Sound Particles has created an innovative plugin called Skydust. Use it directly in your DAW to create even greater depth to immersive sound. Take your mix to the next level with this cutting-edge tool that represents the next generation of emerging automated and dynamic immersive sound plugins.

Step-by-step guide to creating an immersive sound recording

Step One: Choose the right approach

As explained above, the most common methods for DIY immersive audio capture and deliver spatial information differently.

Binaural recording is best for headphone listening. It creates an incredibly realistic 3D effect that works best when played back on standard stereo headphones. It is more easily accessible and affordable than Ambisonics.

Ambisonics is best for VR/360° video and multi-speaker playback. It provides a flexible 3D file that can be decoded to suit any speaker setup (like Dolby Atmos or 7.1) or rendered binaurally for headphones. It’s a bit more expensive to produce in Ambisonics, but it is still an option for independent artists.

Dolby Atmos is best for commercial music distribution on major streaming platforms. It is an industry-standard, object-based format that focuses on scalable delivery across a wide range of playback systems. It represents the highest level of professional immersion and broad market compatibility.

360 Reality Audio is best enjoyed on headphones and specific smart speakers, as it drastically enhances the perceived 3D space. It is supported by TIDAL, Deezer, and Amazon Music Unlimited.

Choose the method that suits you and your project best, and remember to think ahead to the future: what type of recording do you need for your career goals?
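If it helps to see the choice laid out as a lookup, the short Python sketch below maps a project goal to the approach recommended above. The goal keys and the function name are made up for illustration; the recommendations simply restate this section.

```python
# Goal keys are hypothetical labels; the values restate the guidance above.
RECOMMENDED_APPROACH = {
    "headphone_listening": "Binaural recording",
    "vr_or_360_video": "Ambisonics",
    "multi_speaker_playback": "Ambisonics",
    "streaming_distribution": "Dolby Atmos",
    "headphones_and_smart_speakers": "360 Reality Audio",
}


def recommend_approach(goal):
    """Return the suggested immersive approach for a project goal."""
    try:
        return RECOMMENDED_APPROACH[goal]
    except KeyError:
        options = ", ".join(sorted(RECOMMENDED_APPROACH))
        raise ValueError("unknown goal; choose one of: " + options)


print(recommend_approach("streaming_distribution"))
```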

Step Two: Get your gear

You don’t need a professional studio to start. You just need some basic gear that is appropriate for the type of recording you will be making.

Binaural gear list:

In-ear binaural mics (worn like earbuds)
DIY dummy-head setup (two small, high-quality omnidirectional electret condenser capsules mounted on either side of a simple baffle or dummy head, like a mannequin head).
A portable digital audio recorder (like a Zoom or Tascam) with two inputs and low-noise preamps.

Ambisonics gear list:

A dedicated Ambisonic microphone (often called an A-format mic). These typically have four capsules arranged in a tetrahedron shape.
A multi-channel recorder with at least four matched inputs to capture the signal from all four capsules simultaneously.

Dolby Atmos gear list:

Multiple speakers to reproduce sound from different points in the space
A professional audio interface or monitoring controller to feed all the speakers
Access to Dolby Atmos software to process the object and channel metadata from your DAW into the speaker feed for monitoring. This renderer is integrated within the SkyTracks DAW available through MusicTeam.

360 Reality Audio gear list:

360 WalkMix Creator™ Plugin. This is the essential piece of software developed by Sony/Audio Futures. It is a suite of plugins you insert into your DAW that allows you to treat your sound sources as “objects” and freely place them in a 360-degree spherical sound field (above, below, and around the listener).
A professional DAW that supports the necessary plugin formats and system requirements for the WalkMix Creator plugin (e.g., Avid Pro Tools Studio/Ultimate, etc.).
macOS (10.15 or greater) or Windows 10/11 64-bit.

Still want to learn more? Check out our article on How to prepare your music for 360 Reality Audio and Dolby Atmos.

Distribute your immersive audio tracks with MusicTeam®

At MusicTeam®, we support the distribution of immersive music in 360 Reality Audio and Dolby Atmos, making the process accessible to both independent artists and labels. Let us help you acquire the right ISRC for each mix type, so you can accurately track your metadata and earn what you deserve for each track.
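As a small illustration of keeping one ISRC per mix type, the Python sketch below checks that each code has the standard 12-character shape (two letters, three letters or digits, then seven digits) and that no two mixes share a code. The function names and the example codes are made-up placeholders, not real registrations, and this is not how MusicTeam® validates metadata internally.

```python
import re

# Two-letter prefix, three-character registrant code, two-digit year,
# five-digit designation code.
ISRC_PATTERN = re.compile(r"[A-Z]{2}[A-Z0-9]{3}[0-9]{2}[0-9]{5}")


def normalize_isrc(code):
    """Strip hyphens, uppercase, and check the 12-character ISRC shape."""
    compact = code.replace("-", "").upper()
    if not ISRC_PATTERN.fullmatch(compact):
        raise ValueError("not a valid ISRC shape: " + code)
    return compact


def check_distinct_isrcs(mixes):
    """Given {mix type: ISRC}, return normalized codes or fail on duplicates."""
    normalized = {mix: normalize_isrc(code) for mix, code in mixes.items()}
    if len(set(normalized.values())) != len(normalized):
        raise ValueError("each mix type needs its own ISRC")
    return normalized


# Placeholder codes for illustration only.
track = {
    "stereo": "ZZ-ABC-24-00001",
    "dolby_atmos": "ZZ-ABC-24-00002",
    "360_reality_audio": "ZZ-ABC-24-00003",
}
print(check_distinct_isrcs(track))
```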

Our all-in-one online platform lets you create, distribute, register and manage the metadata for all your music. When you sign up, we help you deliver properly formatted spatial audio files (along with the stereo mix) to all major streaming services that support those formats.

From music registrations to music catalog management, we have you covered. Sign up today!

Published On: September 18th, 2023
Categories: Music creation

Chloe Dagenais Founder

Chloe is the Founder and President of MusicTeam®, a self-serve platform revolutionizing music catalog management, registrations, project delivery and distribution for music makers. With a background in information systems, she specializes in building technology-driven solutions that help artists, musicians, engineers and rights holders manage their music more efficiently. Her work in digital music rights began during her master's thesis, where she developed the proof-of-concept for a streamlined music catalog management system, a project that laid the foundation for MusicTeam®. Recognizing the industry's need for better metadata accuracy and artist-driven solutions, she officially founded MusicTeam® in 2020 to empower music creators through self-serve tools. Beyond her role as Founder and President of MusicTeam®, Chloé actively contributes to industry discussions on metadata integrity, rights management, and the future of music technology.
