Voice-recognition software in the enterprise. It’s the economy, stupid

Image: oneshutterbabe & Mike Russell - Flickr
Richard N. Block

This is more than simple call-routing. It is about putting a smile on the CFO’s face and $$ on the bottom-line

In the USA Network series Mr. Robot, a lonesome FBI agent named DiPierro (Grace Gummer) lies in her apartment, the shush of city traffic through the window the only sound. She speaks into the dark room: “Alexa, when is the end of the world?” The response comes: a serene explanation of the most likely eventuality, that is, our sun expanding and enveloping the planet in a few billion years. It’s delivered by a pleasant female voice with a telltale robotic hitch.

Of course, many of us now know Alexa well. It’s a voice-recognition-based personal assistant, the brains behind the Amazon Echo family of smart devices. It can control some other devices, play music, answer questions, and help owners organize.

Google has recently thrown its hat in the ring with the release of Google Home, a similar device.

Echo has been in the space for less than two years (less than one year outside the United States), but it has already wormed its way into many homes, countless daily lives, and our pop culture. (Admittedly, Mr. Robot is pretty tech-savvy and forward-looking, as TV shows go.)

iPhone users have had Siri for a while, so they’re more used to voice-recognition as a user interface (and the many quirks it still has). New users may be surprised by the range of things such systems can understand, interpret, and respond to.

Old hands may mutter “no, Siri, that is NOT what I meant” in their sleep. But business users may wonder: how much benefit could this technology bring to my organization, now or in the foreseeable future?

A hot-button topic just got hotter

If mega-investor Mary Meeker’s prediction is to be believed, the likely answer is “a lot.” Last year, she predicted that, by 2020, more than half of all searches will be made using speech or image search as the input method. She pointed to the success of consumer devices like the Echo as evidence that “people will do more talking to their computers and less typing on them,” as the Guardian writes.

Moreover, speech will become more useful as “Siri fails” grow rarer. As of 2015, Google reported an 8% word error rate in speech recognition, Microsoft reported 6.9% late last year, and Apple likewise reports that errors continue to fall quickly, improving accuracy and user satisfaction.
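For readers wondering what a “word error rate” actually measures: it is conventionally the number of word substitutions, deletions, and insertions needed to turn the recognizer's output into the correct transcript, divided by the number of words in the correct transcript. The Python sketch below illustrates that standard definition. The function name and sample phrases are made up for illustration, and this is not how any of the vendors above necessarily compute their published figures.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative sketch only: real evaluations usually normalize
    punctuation, numbers, and casing more carefully than this.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One wrong word ("tank") and one dropped word ("now") out of four.
print(word_error_rate("call my bank now", "call my tank"))
```

By this measure, an 8% word error rate means roughly eight word-level mistakes for every hundred words spoken.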

Let’s take a moment to examine how use cases are already evolving.

Voice authentication

Because all voices are subtly different — much like our fingerprints or retinas, which are routinely scanned to verify ID — and computers are getting increasingly accurate at identifying spoken words, it’s likely that adoption of voice-based authentication will grow.

It has the advantage of not requiring dedicated hardware like a fingerprint scanner. Nuance offers a solution now, and a case study on the company’s site explains how Banco Santander Mexico has deployed voice authentication for its automated phone system, getting rid of PINs, passwords, and security questions (and presumably saving user and representative time).

HSBC is also using similar technology, which it claims is able to authenticate users even when they have a cold. In an enterprise setting, because voice authentication is biometric and can theoretically be used with any device that has a microphone, it may free up IT help desks from inundation with forgotten-password service tickets.

Call center solutions

We’re all familiar with “call steering,” the process by which a computer uses voice input to route our calls to the appropriate end-point within the company’s phone system. This saves labor and makes the whole telephone customer-service process more streamlined from the company’s point of view.

Many companies, including InContact (and Nuance, again), offer a voice-based front end to their voice-recognition engines that “routes callers to the correct destination without the need for complicated menu mazes.”

Custom applications

Developers offer made-to-order interactive voice response (IVR) applications today, and with the increasing sophistication of the backend software powering these apps, this space seems likely to experience growth in ideas, innovation, and competition.

In other news, Google has opened up its voice-recognition API to developers (billed on a pay-as-you-go basis by 15-second increments), which seems likely to increase the number of applications that make native use of voice input.
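To get a feel for what billing in 15-second increments means for a budget, here is a rough Python sketch. It assumes each request is rounded up to the next full increment, which is a common convention but is not confirmed by anything above. The function names and the price used in the example are hypothetical placeholders, not Google's actual rates.

```python
import math

INCREMENT_SECONDS = 15  # increment size mentioned in the article


def billable_increments(audio_seconds: float) -> int:
    """Number of 15-second increments charged for one audio clip.

    Assumption: partial increments are rounded up to a full one.
    """
    if audio_seconds < 0:
        raise ValueError("audio length cannot be negative")
    return math.ceil(audio_seconds / INCREMENT_SECONDS)


def estimate_cost(audio_seconds: float, price_per_increment: float) -> float:
    """Estimated charge for one clip at a caller-supplied price."""
    return billable_increments(audio_seconds) * price_per_increment


# A 61-second call spills into a fifth increment.
# The 0.5 price is a made-up figure used only to show the arithmetic.
print(billable_increments(61))
print(estimate_cost(61, price_per_increment=0.5))
```

Under this assumption, many short utterances can cost more than their total duration suggests, since each one is rounded up separately.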

So: what can voice do for your business?

A lot, and more all the time as technology improves, computing capacity grows, and precision increases. When a client asks a question, you should listen, and these maturing applications help ensure you’ll be able to do just that, whenever you’re needed.

And of course, customers love being listened to; it makes them feel valued and want to spend more, and that will make us all a little happier.
