AV and Speech Technology

Using Speech Technology in the AV industry

This past January, Las Vegas hosted the Consumer Electronics Show, where speech technology was the shining star and a definite win in the consumer market. With examples like Alexa, Siri, Cortana and Google Assistant, speech technology is sure to be a major player in the audiovisual industry soon.

What is speech technology?

According to Wikipedia, speech technology relates to the technologies designed to duplicate and respond to the human voice. They have many uses. These include aid to the voice-disabled, the hearing-disabled and the blind, along with communication with computers without a keyboard. They enhance game software and aid in marketing goods or services by telephone. The subject includes several subfields such as speech synthesis, speech recognition, speaker recognition, speaker verification, speech encoding and multimodal interaction.

Audiovisual Uses for Speech Technology

Along with other AV experts, I anticipate speech technology will also have many AV uses. For example, rather than typing instructions on a keyboard, we will soon simply speak our orders into a microphone. Before long, all our devices will evolve to offer voice-enabled services.

Imagine you are in a crowded, bustling room. You attempt to give voice commands to your smartphone, but to no avail. The microphone reception is just too garbled, and Siri calls your sister instead. However, future generations of users may no longer have this problem. New features will integrate audio and visual cues, interpreting your voice commands in conjunction with the movement of your mouth. Essentially, your new device will “read” your lips while also “listening” to your words. This should greatly improve recognition accuracy, especially in noisy environments.

The Winter 2018 edition of Speech Technology magazine had an interesting article (p. 15) reporting on the use of voice-powered applications in the Internet of Things (IoT) realm. Simply put, IoT is the interconnection via the Internet of computing devices embedded in everyday objects, enabling them to send and receive data. Quoting from a Strategy Analytics report titled “From Alexa to Industry: Opportunities for a Voice-Powered Internet of Things,” the article states, “Voice can be used to communicate and to control and with proper consideration of fitness for purpose, it can result in a more natural experience for the user. In some cases, voice presents a vital hands-free experience for users, an extra layer of security, and even lower build costs than expensive touchscreen panels.” From my perspective, this is a very interesting prospect for implementation and application.

Challenges for Implementation

Implementing speech technology is not without its challenges. As with any new technology, cost is a major factor for most consumers. The previously mentioned virtual assistants (Siri, Alexa, Google Assistant and Cortana) are offered on a wide range of speakers costing from $50 to $300. This price range has at least allowed these voice assistants to become mainstream for consumers. However, these devices are limited to personal and entertainment uses. While speech technology is much more accurate than it was a few years ago, accuracy is currently around 90 percent. This still leaves too much room for error, especially in business settings. Finally, the ultimate goal is for the technology to sound and feel like human interaction. Regrettably, the responding voices are still a bit robot-like. For businesses hoping to mirror human interaction, a robot-like customer service encounter is not preferable.

Scratching the Surface

The Consumer Electronics Show is the world’s gathering place for all those who thrive on the business of consumer technologies. The CES website states that it has served as the proving ground for innovators and breakthrough technologies for 50 years, the global stage where next-generation innovations are introduced to the marketplace. Given that speech technology seemed to be a part of every exhibitor’s display, I am certain we have only scratched the surface of this technology’s everyday use.

Signing off,

Tony, the AV Guy

Tony Sprando

About Tony Sprando

Tony Sprando has worked in the AV industry since 1994. For the past 20 years, he has designed commercial AVL solutions and managed a design process team and on-site installations. He is well known for his blogs and customer satisfaction reviews on Google and Reviewability. Connect with him on LinkedIn at https://www.linkedin.com/in/tonysprando and on Instagram at https://www.instagram.com/tonytheavguy/