The Science Behind Voice Privacy
The words we speak are precious. They convey our thoughts and opinions, but also our vulnerabilities, insecurities and secrets. How can we ensure those inner feelings are protected when we need them to be? The answer lies in voice privacy: solutions that secure voice data over networks and ensure conversations are not intercepted or recorded without permission.
Why is Voice Privacy Important?
In an increasingly digital world, voice privacy is a growing concern for voice technology users, which includes everyone with a smartphone or voice control device like Amazon’s Alexa or Apple’s Siri.
“When you think about how much spoken word contains sensitive information, whether it be medical consultations, a legal conversation, business, financial, whatever it might be — if that stuff was written down and leaked out somewhere, it’d be a big deal,” said Chris Meyer, director of product security and conferencing platforms at Shure.
Since the COVID-19 pandemic, the way we work has changed drastically. Now, more than ever we take calls, make deals and communicate online. “Not only is my voice at risk, but so is that proprietary information that’s being passed online,” said Joshua Curlett, CEO of Sound Productions.
Even outside the office, voice privacy is a huge concern. Voice cloning has become a major fear because hackers can now use a cloned voice to access voice-protected information, including bank accounts. Artificial intelligence (AI) is opening up some interesting doors, and some dangerous ones.
With growing fears of sensitive information leaks, voice privacy solutions are increasingly important. Technologies like sound masking and audio encryption are critical to making sure voice data is protected.
Sound masking is a voice privacy solution in which the base level of sound, or “noise floor,” is raised to fill a space, blanketing conversations in a comfortable level of sound. It is used to protect conversations from being overheard or recorded by others in the same space. These systems emit what might be called white noise, making conversations unintelligible in places where soundproofing isn’t possible.
“It sounds kind of like airflow through the air conditioner,” said Ken Peck, office acoustics manager at AtlasIED. “So the background sound goes up, speech intelligibility goes down. I might be aware of a conversation, but I won’t be able to understand it.”
Sound masking can also eliminate distracting chatter, improving workplace productivity. Peck said it’s a much cheaper solution than soundproofing spaces and can be used in a variety of use cases — not just offices. For example, a doctor’s office might introduce sound masking technology to protect patient confidentiality and comfort.
Sound masking uses spectrum generators to emit sound signals targeting the consonants in someone’s voice, like the t’s and s’s, to make speech unintelligible from a few meters away. One way to think about this: it’s much harder to hear what someone is saying from a room away when you’re washing the dishes, because the noise of the running water masks their speech. The noise emitted by the generator can also be tweaked by a digital signal processor, which boosts or lowers the sound level to match the building.
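That level-tuning step can be sketched in a few lines of Python. This is an illustrative toy, not any vendor’s processing chain: the function names are invented for this example. It generates white-noise samples and applies a gain in decibels, the way a masking system’s DSP might raise or lower the noise floor for a given room.

```python
import math
import random

def white_noise(n_samples, seed=0):
    """Generate uniform white-noise samples in [-1, 1] -- the raw
    masking signal a spectrum generator might start from."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def apply_gain_db(samples, gain_db):
    """Scale the signal by a gain given in decibels, as a DSP would
    when tuning the masking level to the space."""
    factor = 10 ** (gain_db / 20.0)
    return [s * factor for s in samples]

def rms_db(samples):
    """Root-mean-square level in dB relative to full scale."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

noise = white_noise(10_000)
quieter = apply_gain_db(noise, -6.0)   # lower the noise floor by 6 dB
print(round(rms_db(noise) - rms_db(quieter), 1))  # prints 6.0
```

In a real system this adjustment happens per frequency band so the spectrum can be shaped, not just scaled; the broadband gain here is the simplest version of the idea.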
“We can get in there and make fine tune adjustments that make the spectra, the sound that we put into an office, sound like it might be out in the middle of the Grand Canyon or up in the mountains; it’s a very smooth, natural sound,” said Peck.
Paul Hand, senior product manager for sound masking at Biamp, said the ultimate goal of sound masking is for it to go unnoticed. He works to ensure these spaces are comfortable for the people working in them by balancing the sound masking technology with what the ear can pick up. “They don’t even realize, and that’s the most important thing: we want to make sure that sound masking is effective,” he said.
Audio encryption is another technology used for voice privacy. Instead of protecting audio signals while they’re floating around in space, encryption protects voice data after it’s been captured. After the audio signals leave your mouth and are sent to a device, it’s important there isn’t a leak along the chain of transmission.
“From the time it’s spoken to the microphone capture, it’s sent down the wire to whatever the end device is, that needs to be protected,” said Meyer.
Encryption converts audio to a digital format, then scrambles that data using a set of secret keys. This scrambling is generally done with the Advanced Encryption Standard (AES), a symmetric-key cipher.
Companies like Shure use encryption algorithms in their products that take 24 bits of audio at a time and process the data with an encryption key that must then be used to decrypt the data after transmission. However, it is hard to do this at a speed that can keep up with the audio input.
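To make the scramble-and-descramble idea concrete, here is a minimal Python sketch of a symmetric cipher. This is emphatically not Shure’s algorithm, and it substitutes a SHA-256-based counter-mode keystream for real AES; what it illustrates is the core property the article describes — the same shared key both encrypts and decrypts — shown here on a couple of 3-byte (24-bit) audio samples.

```python
import hashlib

def keystream(key: bytes, n_bytes: bytes) -> bytes:
    """Derive a pseudorandom keystream by hashing the shared secret
    key together with a block counter -- a toy stand-in for AES in
    counter mode (real systems use AES itself)."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

def scramble(audio: bytes, key: bytes) -> bytes:
    """XOR the audio with the keystream. Applying the same function
    with the same key reverses it (symmetric encryption)."""
    ks = keystream(key, len(audio))
    return bytes(a ^ k for a, k in zip(audio, ks))

key = b"shared-secret-key"                # must be kept secret by both ends
samples = bytes.fromhex("00ffee0102a3")   # two 24-bit (3-byte) audio samples
encrypted = scramble(samples, key)
decrypted = scramble(encrypted, key)
assert decrypted == samples
```

Because the keystream can be computed independently of the audio, this style of cipher adds very little per-sample work, which is one reason counter-mode designs suit the low-latency requirement Meyer describes.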
“We’ve spent a lot of time optimizing our audio encryption algorithms to be low latency,” said Meyer. He also mentioned the importance of proper key management. It’s not the encryption algorithm that causes security breaches, it’s someone stealing the keys that are protecting that sensitive information.
Voice Privacy Trends
Voice privacy itself has become more mainstream and a much greater concern in general. This has led to increased adoption of encryption and sound masking. Peck said people are more often centering sound masking when designing a space like an office where privacy of voice data is important.
As sound masking is becoming more mainstream, Hand said ease of deployment has become a trend. Companies are developing sound masking solutions that are easy to install and adjust to match a variety of use cases.
Meyer said he has noticed more transparency with voice recognition technology like Siri or Alexa. These companies are more open about how they collect voice data and where they are storing it.
AI is, of course, another trend. Machine learning algorithms are being used in voice privacy both to imitate someone’s voice, sparking security concerns, and to detect deepfakes that replicate a person’s likeness in digital media.
Voice privacy is also moving toward protecting voice signals in transit over the internet. “Encryption is a collective handshake between the audio and IT,” said Curlett. There has to be encryption on the IT side to make sure audio signals are protected over the internet, as in the case of a private video call.
It’s clear that voice privacy is becoming more important and will continue to be a focus in the AV industry. “This is exciting because as an industry this is where we can shine for folks,” he concluded.