Today, at Apple’s WWDC 2022 keynote, the company announced that iOS 16 will allow users of modern iPhones to scan the shape of their ear to create more accurate spatial audio. The feature is likely implemented as an HRTF; creating custom HRTFs for consumers was once impractical due to the need for advanced equipment, but advances in computer vision make the technology much more accessible.
When it comes to digital spatial audio, there is a limit to how accurate the sense of ‘position’ or ‘3D’ can be unless the unique shape of the user’s head and ears is taken into account.
Because everyone has a uniquely shaped head, and especially ears, elements of incoming sound from the real world bounce off your head and into your ears in different and very subtle ways. For example, if there is a sound behind you, the precise geometry of the folds in your ear will reflect the sound from that angle in a unique way. And when you hear sound coming to your ear in that specific way, you are attuned to understand that the source of the sound is behind you.
To create a highly accurate sense of digital spatial audio, you need a model that takes these factors into account, so that the audio carries the same cues that the unique shape of your head and ears would create.
Audiologists have mathematically described this phenomenon in a model known as a head-related transfer function (also known as an HRTF). Using an HRTF, digital audio can be modified to replicate the spatial audio signals unique to a person’s ear.
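Conceptually, applying an HRTF to digital audio amounts to convolving the source signal with a pair of head-related impulse responses (HRIRs), one for each ear. The sketch below is a minimal illustration of that idea, using toy placeholder impulse responses rather than real measured data; a personalized system like Apple’s would derive the responses from the scan of the user’s ear.

```python
import numpy as np

def apply_hrtf(mono, hrir_left, hrir_right):
    """Render a mono signal binaurally by convolving it with the
    head-related impulse responses (HRIRs) for each ear."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return left, right

# Toy HRIRs for a source off to the listener's left (illustrative values,
# not real measurements): the left ear hears the sound sooner and louder,
# the right ear hears it later and quieter.
hrir_left = np.array([1.0, 0.3])
hrir_right = np.concatenate([np.zeros(30), [0.5, 0.2]])

mono = np.random.default_rng(0).standard_normal(1000)
left, right = apply_hrtf(mono, hrir_left, hrir_right)

# The resulting interaural time and level differences are exactly the
# cues the brain decodes to place the sound to the left.
```

A real renderer would use measured HRIRs sampled at many directions and interpolate between them as the source (or the listener’s head) moves, but the core operation is this per-ear convolution.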
So while the math has been well studied and the technology to apply an HRTF in real-time is readily available today, there’s still one big problem: everyone needs their own custom HRTF. That means each person’s ears must be accurately measured, which is not easy without specialized equipment.
But now Apple says it will use advanced sensors in its latest iPhones so anyone can scan their head and ears and create a custom spatial audio profile from that data.
Apple isn’t the first company to offer custom HRTFs based on a computer vision model of the ear, but with it built into iOS, the technology is sure to be much more widespread than ever.
At its WWDC 2022 keynote, Apple announced the feature as part of its upcoming iOS 16 update due later this year. It works on iPhones with the TrueDepth camera system, including the iPhone X and later.
But an accurate model of the ear alone is not enough. Apple will need to have developed an automated process to simulate the way real sound would interact with the ear’s unique geometry. The company has not specifically said this will be based on an HRTF implementation, but it seems very likely because it is a known quantity in the spatial audio field.
Ultimately, this should result in more accurate digital spatial audio on iPhones (and very likely future Apple XR headsets). That means a sound 10 feet from your left ear will sound more like it would at that distance, making it easier to distinguish from a sound 2 feet from your left ear, for example.
This will fit well with the existing spatial audio capabilities of Apple products, especially when used with AirPods that track the movement of your head to keep the soundstage anchored in place. Apple’s iOS and macOS both support spatial audio out of the box, which can take standard audio and make it sound like it’s coming from speakers in your room (rather than from inside your head), and accurately play back sound that was specifically mixed for spatial audio, such as Dolby Atmos tracks on Apple Music.
And there is another potential benefit to this feature. If Apple allows users to download their own custom HRTF profile, they may be able to take it and use it on other devices (a VR headset, for example).