Why AI Isn’t Taking Over AV

It’s always interesting to hear people’s take on the future of technology. There seems to be a fanaticism built around Moore’s Law, the Turing Test, and the ability of computers to become “conscious.” It usually ends in a theory of technology not unlike the rise of Skynet’s machines in the Terminator movies.


If you’re not familiar with all the talk around these issues, ask Google about them and see what you find.

For brevity’s sake, I’ll include a couple of definitions below.

Moore’s Law is the observation that the number of transistors in an integrated circuit doubles roughly every two years, which is commonly read as a doubling of computer processing power.
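As a rough arithmetic sketch of that doubling, here is the idealized curve in a few lines of Python. The function name and the starting count of one billion transistors are illustrative assumptions, not figures from any real chip.

```python
def projected_transistors(start_count, years, doubling_period_years=2.0):
    """Project a transistor count under an idealized Moore's Law doubling."""
    return start_count * 2 ** (years / doubling_period_years)

# Hypothetical starting point: 1 billion transistors today.
for years in (0, 2, 4, 10):
    print(years, "years:", projected_transistors(1_000_000_000, years))
```

Note that the formula only counts transistors. Nothing in it says anything about what gets built with them.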

The Turing Test is a test for intelligence in a computer, requiring that a human being should be unable to distinguish the machine from another human being by using the replies to questions put to both.

Now, I’ve talked a little about Moore’s Law before. I’m not disputing the accuracy of the transistor assertion; I’m questioning what people conclude from it. Merely doubling transistors makes things faster, but it doesn’t directly birth innovation.

As I stated before,

I can put one engineer with little creativity in a room and he will produce nothing new or innovative. I can then add three more engineers with little creativity into that same room, effectively quadrupling the “computing power” and still get nothing new. Just because the computing power quadruples, innovation isn’t spawned automatically. There needs to be a creative spark, typically spurred by asking a question in a new and unique way.

As for the Turing Test, a chatbot named Eugene supposedly passed it in 2014 by tricking people into believing human responses were computer-generated while the computer responses were uniquely human. On the surface this may seem like an AI victory, that a machine was perceived as more human than actual humans, but if you dig a little you’ll notice that the Turing Test has many skeptics, partly because the interactions are timed.

Why does a time limit affect the end result? Well, the main argument against it revolves around something called the mannequin effect. One may very well bump up against a mannequin in a store, believe it is a human, and quickly apologize based on the brief interaction and a quick glance. However, the longer that interaction continues, the more apparent the nature of the mannequin becomes. The Turing Test times out that interaction, which favors the machine.

So as good as Alexa and Siri have become, and despite their first names and constant companionship, a conversation of any length will quickly reveal their bits and bytes. Even IBM’s Watson, arguably the most powerful and advanced AI engine around today, suffers the same fate.

There is something unique to being “human.” Something beyond a simple accumulation of the data that we consume with our eyes and ears. I’m not arguing the metaphysical here, just the uniqueness.

I was in a chat room with some SMPTE engineers once, discussing digital video and active vs. passive 3D (yes, I need help), and one of those engineers made an eloquent statement that I wish I had captured in a screenshot. To paraphrase, they said:

“Somewhere in the back of the human mind, where neurons are firing to process all of those projected or backlit pixels 60 times a second, the brain perceives a difference between that digital stimulation and the actual reflection of light back to our cones and rods in a physical, naturally lit environment.”

That comment immediately rang true to me and has always been in the back of my mind when people say that VR and the real world will someday be indistinguishable from one another. No matter how deep the rabbit hole gets, I think that the brain will, on some level, always know that it’s in the Matrix, just like it discerns the difference between dreams and consciousness.

I also remember an experience that I had as an integrator when the firm I worked for was doing a job for an air and space museum centered around Robonaut 1 and the robotic DARPA arm, named Robbie. Part of our contract involved recording interviews with the scientists working on these cutting-edge technologies. I sat in the office with the man who was editing the content, so I heard countless hours of it. The common theme was that it was impossible to teach robots “how to think.” The scientists could create all sorts of logic-driven programs and data analysis that utilized machine learning, but they were nowhere close to critical thinking, let alone consciousness.

There was an example of telling the robot to get a pencil. You can code the possible locations: desk, drawer, cabinet; you can scan multiple images of the pencil into the computer and give the robot a camera as an eye. But that simple task, go get a pencil, may still be very difficult for the robot to achieve. A person on the other hand can see a pile of papers and know that something may be hiding beneath or notice a laptop bag at the end of the desk and look inside the zippered pocket to reveal the prize.
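To make the pencil problem concrete, here is a minimal sketch of the “code the possible locations” approach. Everything in it (the room contents, the location list, and the `find_item` helper) is a made-up illustration, not anything the scientists in those interviews actually used.

```python
# Hypothetical room: each container and what it actually holds.
ROOM = {
    "desk": ["stapler"],
    "drawer": ["paper clips"],
    "cabinet": ["binder"],
    "laptop bag": ["pencil"],  # a place nobody thought to code
}

# The only locations the programmer anticipated.
CODED_LOCATIONS = ["desk", "drawer", "cabinet"]

def find_item(item, room, locations):
    """Return the first listed location that holds the item, or None."""
    for location in locations:
        if item in room.get(location, []):
            return location
    return None

# The robot checks every place it was told about and comes back empty-handed.
print(find_item("pencil", ROOM, CODED_LOCATIONS))  # None
```

The search itself is trivial. The hard part is knowing that the laptop bag belongs on the list, which is exactly the leap a person makes without being told.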

For set tasks, technology with some machine learning and AI may very well be good enough, but for critical thinking it’s just not close yet. Machines and systems using AI need highly intelligent humans to write the if-then logic they so desperately depend upon, as well as to monitor them in case situations arise that just aren’t in the database yet.

Alexa may be a great way to control your home, but in reality you’re doing nothing greater technologically than pressing the button you used to push on your control panel. “Alexa, turn the heat up to 75” is an easy thing to program. Data relating outside temperature to inside temperature may also help the machine learn when you turn on the AC vs. the heat, and what temperatures you usually choose, so it can come up with a program for the Nest. On the flip side, getting Alexa to realize that when a user exercises in the morning, she should turn the AC down, and that when there’s a new baby she should turn the heat up, is a different thing altogether. An observing human understands those correlations immediately.
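Here is a rough sketch of just how little is involved in the “easy” case. This is not Alexa’s or Nest’s real API; the `handle_command` function and the thermostat dictionary are hypothetical stand-ins for whatever the control system actually exposes.

```python
import re

def handle_command(utterance, thermostat):
    """Apply a 'turn the heat/AC up/down to N' command; return True if understood."""
    match = re.search(r"turn the (heat|ac) (?:up|down) to (\d+)", utterance.lower())
    if match is None:
        return False  # anything outside the pattern is simply ignored
    thermostat["mode"] = match.group(1)
    thermostat["setpoint_f"] = int(match.group(2))
    return True

# Hypothetical thermostat state.
thermostat = {"mode": "off", "setpoint_f": 70}
handle_command("Alexa, turn the heat up to 75", thermostat)
print(thermostat)  # {'mode': 'heat', 'setpoint_f': 75}
```

Functionally that is the same as pressing a button. A sentence like “we just brought the baby home” matches nothing in the pattern, so the system does nothing with it.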

AI is great. Data is amazing. Machine learning is an incredible feat.

But there is still a human factor to interacting with the world that eludes them all.

I was on a podcast about the Samsung and Harman acquisition where an industry stalwart again promoted his position that people don’t want a rack full of hardware; they prefer simplified and unified devices.

I don’t necessarily agree or disagree. I believe people actually don’t care what the hardware is as long as it solves the problem. It could be a rack of gear or an app. It doesn’t matter either way. The customer isn’t buying either; they are buying the answer to their problem, regardless of the methodology used to provide it.

My issue was more with his assessment that, because of this trend, integrators are overconfident in their value to the end customer.

Sure, the hardware may no longer need a lot of “integration” as four boxes have now been integrated into one, but the user experience itself as a direct result is not automatically good. The software may be advanced; the system may mic the room, play test tones and calibrate itself; the control software may learn how different users utilize the space.

That’s all well and good, but a computer doesn’t know what sounds “tinny” or that the CEO has a slight hearing loss that needs to be accounted for in certain frequencies. Systems with ambient noise mics that automatically adjust volume levels to achieve better signal-to-noise ratios only work to a degree. A good speech transmission index (STI) score still comes down largely to achieving even coverage and controlling reverberation time (RT60).

The point is that the science can be sound (no pun intended) but the user experience trumps all. The numbers may all add up in the GPU of a deep learning electronic brain, but the experience may still lack something.

Science says that a highly directional “hypersonic” speaker, which plays two ultrasonic tones whose difference falls within the range of human hearing, should work. Reality says that it does work in generating the sound, but that the listener has an adverse physiological reaction to the ultrasonic waves, a feeling of uneasiness if left listening to them for too long. The computer and the human hear the same sound, but the human feels differently than when listening to a traditional speaker at that same frequency.

My point is that although our systems may get less complex, utilize more software and less hardware, and become easier to control and program, as well as “learn” how to be more efficient and relevant over time, they will never understand the human experience. There will always be value in the presence of a real person, trained in the technology, who knows how to address these issues and how to make sure that the AV environments we create are optimized to the actual users themselves with their unique and subjective experiences of the technology.

In order to fulfill the promise of exceptional experiences, we must be able to actually experience the effects of the systems we create. A machine cannot “experience” anything, and I’m not sure one ever will. The value of the integrator in the equation is not in connecting boxes or writing code; it is in their humanness, in their ability to experience the system once installed, and in being able to empathize with the end user to create something worth more than the out-of-the-box solution.

I’d love to hear your thoughts in the comments section below.