Artificial Intelligence and Its Impacts in the Audiovisual Industry

By André Atique

“We think artificial intelligence (AI) will have the greatest impact when everyone has access to it.” This statement is taken from the Google page on AI, but what does it mean?

Artificial intelligence is the science of getting machines to act rationally, as people do. On its AI page, Google briefly raises questions about how thoughts are formed, about logical reasoning and about how an AI is built, all of which make the analysis very subjective. The human mind is the reference for developing an AI, yet human thinking is not especially logical. People may follow different lines of thought depending on upbringing, social influence, education, etc., and this does not mean that some “think” and others do not. Everyone thinks! So what should be evaluated is how the machine responds to the information it receives, and not the “philosophical lines” (cognitive science and computational neuroscience) it is assumed to follow to reach the answer. One of the most appropriate ways of, well, thinking about this is: “it does not matter how a machine thinks, but rather how it behaves.” That is, what matters is the final result and not how it was achieved.

Part of artificial intelligence is the machine’s ability to obtain information from various inputs, such as text, images and sound, and to process it rationally. For machines to have the correct data for decision making, it is fundamental to think not only about capturing information but also about how it is processed. Concepts such as NLP (natural language processing), machine learning and deep learning illustrate this processing.

NLP – Take the sentences, “John will travel by plane. He is afraid of heights.” We associate the fear of heights with John because a plane does not feel fear, but a machine wouldn’t necessarily recognize this. For that to happen, it’s necessary to use natural language processing (NLP), which aims to give the machine context. This is widely used in Google Translate, for example. A word can change its meaning depending on the context. As users correct translations, Google Translate will “learn” through NLP to translate words according to context.

Machine Learning – Machine learning works by collecting information from the user’s data inputs and using it to adapt and improve a system’s behavior. For example, when you rate movies on Netflix, or even just watch them, Netflix collects that information over time and, using machine learning, can suggest movies that probably will not disappoint you.
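To make the idea of learning from ratings concrete, here is a minimal sketch of one simple recommendation approach: scoring unseen movies by the ratings of users with similar taste. It is not how Netflix actually works. The users, titles, ratings and function names are all made up for illustration.

```python
from math import sqrt

# Hypothetical ratings (1-5 stars); users and titles are invented for this sketch.
RATINGS = {
    "ana":   {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "bruno": {"Movie A": 5, "Movie B": 5, "Movie D": 4},
    "carla": {"Movie A": 1, "Movie C": 5, "Movie E": 2},
}


def cosine_similarity(a, b):
    """Cosine similarity computed only over the movies both users rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Rank movies the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        for movie, stars in other_ratings.items():
            if movie in ratings[user]:
                continue  # skip movies the user already rated
            scores[movie] = scores.get(movie, 0.0) + sim * stars
            weights[movie] = weights.get(movie, 0.0) + sim
    ranked = {m: scores[m] / weights[m] for m in scores if weights[m] > 0}
    return sorted(ranked, key=ranked.get, reverse=True)


print(recommend("ana", RATINGS))
```

The more ratings the system collects, the more overlap it finds between users, which is the sense in which suggestions improve over time.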

Deep Learning – Deep learning is a type of machine learning that aims to make sense of abstract information such as still images, video, customer behavior, disease detection and the like. For example, imagine that you just took a picture of your dog and did not add any identifier to it, such as a caption or a tag. In the “search” area of your device you type “golden retriever” and up comes the picture you took! This is an example of deep learning. The most impressive thing is that all of us, as regular technology users, are helping companies like Google, Apple, Facebook, etc., to make their algorithms increasingly accurate without even realizing it. You have probably already filled out a registration form on a website that ended with a test to ensure that you are not a robot. In this test you need to click on the images that show a truck, for example. In addition to actually proving you’re not a robot, you’re also helping Google make its deep learning algorithm more accurate.

The AV + AI Market

The audiovisual industry is everywhere. Large or small displays, simple or complex audio systems, local or global computer network structures, automation and the Internet of Things (IoT) are examples of the industry’s presence in corporations and homes. There is a trend for some types of hardware to become a service. For example, Blu-ray players have been replaced by streaming video services; CD players have been replaced by streaming audio services; in many cases, living room TVs have been displaced by the increasing use of mobile devices; and videoconference codecs are being replaced by BYOD (Bring Your Own Device) and software or cloud-based platforms. There are several examples of how the audiovisual industry has migrated to network solutions; this is not just a trend in AV, but a consequence of the world technology market going the same way.

With information being centralized in web services, users and companies have opened a direct channel for the constant exchange of information, thus establishing a great avenue for AI. In late 2016, Mark Zuckerberg, founder of Facebook, posted a video in which he controls his home with an AI assistant named Jarvis. The assistant interacts with Mark by voice in both directions: either Jarvis or Mark can start the exchange. In the video, Mark uses Jarvis to operate the house’s curtains, lights and air conditioning, and to access his calendar and news feed; Jarvis even monitors and interacts with his daughter Max, teaching her Mandarin. This seems a bit far-fetched for today, but it’s representative of where the future is headed: a close interplay between systems and users that will certainly come to the audiovisual industry. Beyond unified and simplified control, part of residential and corporate automation is making the system capable of taking some actions on its own, according to local conditions and without requiring user intervention. For example:

  • “If the natural light is too bright, draw the curtains or lower the blinds,” or,
  • “If the natural light is bright enough, reduce the intensity of the artificial light,” or,
  • “If no one is present, turn off the lights.”
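The three rules above can be sketched as a small decision function. This is only an illustration: the sensor fields, the lux thresholds and the action names are assumptions, and the choice to let the "no one is present" rule take priority over the others is a design decision of this sketch, not something the rules themselves state.

```python
from dataclasses import dataclass


@dataclass
class RoomState:
    natural_light_lux: float  # hypothetical reading from a daylight sensor
    occupied: bool            # hypothetical reading from a presence sensor


# Assumed thresholds; real values depend on the room and the sensors used.
TOO_BRIGHT_LUX = 2000
BRIGHT_ENOUGH_LUX = 500


def decide_actions(state):
    """Return the actions the three rules call for in the given room state."""
    if not state.occupied:
        # "If no one is present, turn off the lights."
        return ["turn off lights"]
    actions = []
    if state.natural_light_lux > TOO_BRIGHT_LUX:
        # "If the natural light is too bright, draw the curtains or lower the blinds."
        actions.append("close curtains")
    elif state.natural_light_lux >= BRIGHT_ENOUGH_LUX:
        # "If the natural light is bright enough, reduce the artificial light."
        actions.append("dim artificial light")
    return actions


print(decide_actions(RoomState(natural_light_lux=3000, occupied=True)))
```

Rules like these react only to sensor readings; they know nothing about the person in the room.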

There are many more examples. What AI proposes is to use not only tangible information, such as digital or analog signal inputs from sensors, but also a level of user interaction. Imagine having all of this integrated into your technological assets: the details of your musical preferences, with the system letting you know about shows and events; light scenes that adjust automatically according to your history of use; suggestions of relevant movie releases and streaming platforms based on your favorite movies; recommendations of the best leisure activities for the day, according to not just your preferences, but also the weather, your schedule and traffic; an animated, humorous avatar for your house that you can interact with by text message or voice, locally or remotely; and so on. With AI, the ways of control and understanding between users and their technological devices are endless, although still mostly audiovisual based.

Speaking of system deployment, today every multimedia and automation design needs to undergo a customer needs analysis to determine system prerequisites. Within AVIXA standards, the programming phase, in which the needs analysis is performed and data is collected, is a fundamental part of developing a good system before the later stages of the project. This helps keep the customer satisfied from the beginning of the system design. This whole process of analysis is fundamental and will probably never cease to exist; however, the implementation of the system according to the client’s needs can undergo considerable changes. Currently, after the system is commissioned, it’s normal to observe how a particular user actually uses it and to adapt the programming to suit that user’s taste. With machine learning and AI, this adaptation can be made by the automation system itself, continuously. Satisfaction guaranteed!
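As a rough illustration of a system adapting itself continuously, the sketch below nudges a stored lighting preference toward each manual adjustment the user makes. It uses an exponential moving average as a very simple stand-in for machine learning. The class name, the default level and the learning rate are all hypothetical.

```python
class LightPreferenceLearner:
    """Learns a preferred dimmer level (0-100) for each hour of the day."""

    def __init__(self, default_level=70.0, learning_rate=0.5):
        self.default_level = default_level  # used until the user adjusts something
        self.learning_rate = learning_rate  # how quickly new adjustments take over
        self.preferred = {}                 # hour of day -> learned level

    def record_adjustment(self, hour, level):
        """Move the stored preference part of the way toward the user's manual choice."""
        current = self.preferred.get(hour, self.default_level)
        self.preferred[hour] = current + self.learning_rate * (level - current)

    def suggest(self, hour):
        """Level the system would set on its own at this hour."""
        return self.preferred.get(hour, self.default_level)


learner = LightPreferenceLearner()
learner.record_adjustment(hour=20, level=30)  # user dims the lights at 8 p.m.
learner.record_adjustment(hour=20, level=30)  # and does it again the next day
print(learner.suggest(20))
```

Each adjustment moves the suggestion closer to what the user actually chooses, so the reprogramming an integrator does after commissioning happens gradually in the background instead.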

Now, the avatar from Mark Zuckerberg’s house sounded rather surreal, didn’t it? Interacting with a machine at this level of intimacy sounds rather crazy. The film industry has actually already explored the theme with the movie “Her.” A lonely man named Theodore starts using an AI operating system, which learns about him and becomes the perfect ‘person’ for him. Of course, in the movie, Theodore falls in love with the AI, but the question at stake is: Can a computer reduce a person’s loneliness? Alan Turing (1912-1954), a British mathematician who was influential in the development of computer science, proposed a behavioral test for machines in which a person exchanges questions and answers with both people and machines. This test became known as the Turing Test. The goal is not for the machine to give particular answers, but to leave the person (judge) unsure whether he/she is interacting with a computer or a real person. In 2014, a chatbot named Eugene Goostman was reported to have achieved this with a third of the judges at one event, although the claim has been disputed. There is a company called Gatebox that specializes in creating an animated character that works like a person in the house. The character wishes you good morning, talks about the time or weather forecast, interacts by voice and text, and keeps you company, all with a lot of personality. It even seems like you’re actually living with someone in the house.

Back at the beginning, we talked about the statement from Google’s AI page: “We think artificial intelligence (AI) will have the greatest impact when everyone has access to it.” AI aims to combine logic with personality. The audiovisual industry is already following this path — but what about you?

André Atique is passionate about technology and business unit development. He graduated with a degree in IT with business management from FATEC and did postgraduate work in project management at Mackenzie University. He began his career in the residential and corporate automation market in 2008, training resellers in technical skills and sales through academic training. He now works in business development, participating in the main projects of the audiovisual industry in Brazil, and serves as commercial manager at AW Digital, the technology division of Athié Wohnrath.