Spatial audio promises immersive virtual experiences that engage more of the senses
Virtual reality’s goal is to fully immerse a person in a digital landscape, triggering the same kinds of physical and psychological reactions they would experience in the real world. In virtual reality (VR) parlance, this is called “presence”—a mental state in which people recall VR experiences as if they had actually occurred. Computer graphics have improved dramatically in recent years, and advances in haptic, or touch, VR technology are beginning to allow users to feel sensations such as temperature, pressure and vibrations. For VR to really take hold of a person, however, a dynamic soundscape is essential.
One of the most significant developments in VR sound is “spatial audio,” which is designed to mimic the pitch, volume, reverberation level and other audio cues the brain would expect during a real-world experience. “With our hearing we can sense what happens in those directions around us where we cannot see, [such as] car tires screeching behind us, and react—jump out of the way—without the need for visual input,” says Kai Havukainen, senior product manager for audio at Nokia Technologies, which is developing spatial audio technology for some of the company’s devices.
Spatial audio allows VR programmers to create content whose sounds can come from any direction, says Tom Smurdon, audio design manager for Oculus VR, which is owned by Facebook. “There’s wind in the trees above your head, there’s the sound of water coming from a river at your feet and now there’s sound from somebody sitting right next to you, whispering in your ear,” he says. “Everybody is still learning how powerful spatial audio can be.”
“Spatial audio, in its most basic form, emulates how we perceive sound in the real world,” says Brian Yessian, chief creative officer and partner of sound design company Yessian Music. “If we are in a room and someone in front of me is speaking and I turn my head to the left, then my right ear is going to pick up more of the voice and my left ear will [hear] less,” he explains. “Or if something falls behind me, I know to look back because our audio senses pick up on the fact that it was actually behind me.” Spatial audio achieves that effect in VR using software algorithms that manipulate a program’s sound wave frequencies, creating audio levels that become louder or softer depending on the user’s distance from a virtual object. The sound also shifts from one headphone speaker to the other as the person moves their head from side to side or as the virtual objects move on their own.
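The two effects described above, loudness that falls off with distance and sound that shifts between the ears as the head turns, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by Nokia, Oculus or Microsoft: the function name `stereo_gains`, the flat 2-D world, the inverse-distance falloff and the constant-power pan law are all choices made for this example. Production systems rely on far richer filtering of each ear's signal.

```python
import math

def stereo_gains(listener_xy, head_yaw_rad, source_xy, ref_distance=1.0):
    """Return (left_gain, right_gain) for one sound source over headphones.

    Assumed conventions (hypothetical, for illustration only):
    - flat 2-D world; at yaw 0 the listener faces the +y direction
    - positive yaw turns the head to the right (clockwise seen from above)
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]

    # Inverse-distance falloff: full volume within ref_distance, then 1/d.
    distance = max(math.hypot(dx, dy), ref_distance)
    attenuation = ref_distance / distance

    # Bearing of the source in the world, clockwise from +y, then made
    # relative to where the nose is pointing.
    world_bearing = math.atan2(dx, dy)
    azimuth = world_bearing - head_yaw_rad

    # Constant-power pan: -1 is fully left, +1 is fully right.
    # sin() cannot tell front from back; real systems use ear-shape
    # filters to resolve that ambiguity.
    pan = math.sin(azimuth)
    theta = (pan + 1.0) * math.pi / 4.0
    return attenuation * math.cos(theta), attenuation * math.sin(theta)


# A speaker two meters straight ahead: both ears hear the same level.
facing = stereo_gains((0.0, 0.0), 0.0, (0.0, 2.0))
# Turn the head 90 degrees to the left: the right ear now faces the speaker.
turned_left = stereo_gains((0.0, 0.0), -math.pi / 2, (0.0, 2.0))
print([round(g, 3) for g in facing])       # [0.354, 0.354]
print([round(g, 3) for g in turned_left])  # [0.0, 0.5]
```

The second call reproduces Yessian's example: with the head turned left, nearly all of the voice lands in the right ear.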
Makers of VR headsets have begun to embrace immersive soundscapes as a way to improve the quality of the virtual experience. The new Nokia 8 smartphone, for example, includes OZO Audio software that uses the device’s multiple microphones to record video that has surround-sound qualities when heard through headphones. Spatial sound is also a major feature of Microsoft’s new Windows Mixed Reality head-mounted displays, which go on sale later this month. Mixed reality is essentially a hybrid of VR and augmented reality that relies on cameras and other sensors to integrate digital objects into the real world or to embed real objects in a virtual world.
Microsoft already makes a mixed-reality headset with spatial audio called the HoloLens, but that device is targeted at software developers and corporate users. The HoloLens price also starts at $3,000, several times the cost of the consumer-targeted Mixed Reality headset. Microsoft has also made spatial audio part of Windows 10, which means any device running that version of the operating system will be able to play dynamic sound when headphones are used. The new Windows Mixed Reality and HoloLens headsets offer very different user experiences. The HoloLens has a transparent display that projects vivid holograms on top of real-world environments and is a fully self-contained computer that allows the user to move around without being restricted by cables. The new Mixed Reality headset plugs into a PC and has an opaque display similar to that of the Oculus Rift and HTC Vive.
Microsoft developed its spatial audio algorithms after studying different ear shapes and the way a person’s brain, via the inner and outer ears, locates the source of a sound in three dimensions. The goal was to better understand how those factors affect a person’s sound perception. “We’ve also done a lot of work in modeling how audio bounces and reverberates off the environment,” says Noel Cross, Microsoft principal software engineering lead, who has worked on both the HoloLens and the new Mixed Reality devices. “Different size rooms give you different levels of comfort as a human, and if things don’t match your expectations in terms of [what] they should sound like, you instinctively feel quite uncomfortable.”
Cross and his team took those lessons into account when designing the interface for Microsoft’s new consumer headset. Instead of seeing a start menu or screen when they don the device, users begin their experience in a virtual house located on a cliff by the sea, where they are free to wander its three floors. Users interact with that setting much the same way they would look at pictures or TV screens in the rooms of a real house, and for that to make sense, Cross says, the developers had to get the sound element just right. Oculus likewise has its users begin in the same location (sort of like a VR home page) each time they use the device.
The Sweet Spot
Earlier efforts at “3-D audio”—a precursor to spatial audio—were very limited in that they were designed for PCs in the 1990s, Cross says. The speakers were typically located on either side of the computer monitor, and the sound was immersive if the users sat in a “sweet-spot zone in the middle” and did not move, he says. “Even with headphones in earlier experiences, as you moved your head, the sounds wouldn’t stay locked in the context of the world but would stay fixed relative to your head,” Cross adds.
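The difference Cross describes, between sound that stays fixed relative to the head and sound that stays locked to the world, comes down to whether head tracking is subtracted from the source's direction. The sketch below is an assumption-laden illustration rather than any vendor's implementation; the helper name `perceived_azimuth` and its angle conventions are hypothetical.

```python
import math

def perceived_azimuth(source_bearing_rad, head_yaw_rad, world_locked=True):
    """Angle of a sound source relative to the listener's nose, in radians.

    Assumed conventions (hypothetical): bearings and yaw are measured
    clockwise from a fixed "north"; a positive result means the source
    is to the listener's right.
    """
    if not world_locked:
        # Head-locked (older headphone audio): head tracking is ignored,
        # so the sound turns along with the head.
        return source_bearing_rad

    # World-locked (spatial audio): subtract the head's rotation so the
    # source stays put in the room while the head moves.
    relative = source_bearing_rad - head_yaw_rad
    # Wrap into [-pi, pi) so that 270 degrees reads as -90 degrees.
    return (relative + math.pi) % (2 * math.pi) - math.pi


# A source straight ahead while the listener turns the head to the right.
for yaw_deg in (0, 45, 90):
    yaw = math.radians(yaw_deg)
    locked = math.degrees(perceived_azimuth(0.0, yaw, world_locked=True))
    fixed = math.degrees(perceived_azimuth(0.0, yaw, world_locked=False))
    print(yaw_deg, round(locked), round(fixed))
```

As the head turns right, the world-locked angle swings toward the listener's left, as it would in a real room, while the head-locked angle never changes.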
Given that humans are hardwired to pay attention to sound and instinctively use it to map their surroundings, find points of interest and assess potential danger, it is hard to overstate the usefulness of spatial audio for VR. As virtual environments move to the mainstream in education, training and health care—including the treatment of phobias and trauma via virtual reality exposure therapy—users will want to fully engage their senses.
This article was originally published by SCIENTIFIC AMERICAN.