Talk to Me: Why Real Live Voice May Be the Coolest App of All

    by David Walsh, Founder, Kandy

    With all the focus on AI and bots, and with all the research on Millennials and their obsession with texting, some may lose sight of the power of live human conversations and how voice may always prove to be the shortest distance between a problem and a solution.

    The growth of voice-controlled assistants like Siri and Alexa is one leading indicator that the human voice is the most convenient way for humans to interact with their machines.

    Last year, Amazon Echo evolved from a science experiment to an in-home phenomenon, with over seven million devices in households. Google Home launched in 2016 following on the stunning success of Echo, creating a legitimately huge market for voice-first devices.

    But this is just the tip of the iceberg. In the near future, we’ll witness hundreds of millions of consumers interacting with machines using conversational voice in very intuitive ways. This goes well beyond “IVR” menus and into hundreds of other realms, past the obvious and into more sophisticated Internet of Things human-machine relationships.

    Today, using only your voice, you can unlock your car, play music, dim the lights, order Thai food delivery, plan your route to work, and catch up on the news. 

    We’re early in the cycle, and visionaries now are talking about virtual friends for the aging population, able to hang out with Mom during the day and share stories while also attending to her basic needs and organizing help from humans as necessary. Personalized home assistants with real voices and personalities will be hired “as a service” and conversational personal devices will remind you to eat less, move more, and “just breathe.” 

    A voice-first device is an always-on, intelligent piece of hardware where the primary interface is voice. According to VoiceLabs, whose Voice Experience Analytics is the most widely used analytics service for Amazon Alexa and Google Home developers, the hardware encapsulates the consumer experience, but those form factors are nothing without intelligent software.

    “The artificial intelligence assistants guide consumers on what is possible and how to interact, provide a core set of capabilities, then hand-off experiences to third-party applications to extend the experience. The third-party applications take care of consumers and enable them to interact with the brands and services they know and love. Finally, the ecosystem services bolster the applications, and make the ecosystem flywheel turn to provide additional value to all.”

    • In 2015, there were 1.7 million voice-first devices shipped. In 2016, there were 6.5 million devices shipped.

    • In 2017, VoiceLabs predicts there will be 24.5 million devices shipped, leading to a total device footprint of 33 million voice-first devices in circulation.

    The market is competitive. Companies like IBM have led for years with the Watson cognitive computing engine, which is increasingly being connected to voice services for enterprises. One example is enabling contact centers to serve consumers more efficiently: letting go of traditional IVR, which only frustrated consumers, and delivering a “humanized voice” solution that can help, up to a point.

    With all the automation, the self-service menus, the “webification” of contact center and support applications, why is the human voice – human to human – still valuable?

    Sometimes – you just need to speak to another live human. And even if you don’t need to, you want to. Which is why visionary developers are not dismissing real-time voice conversations between human beings, and instead elegantly orchestrating a combination of voice-first and “human-first” applications. 

    Marchex recently published a provocative report on this topic, the 2015 Click-To-Call Commerce Mobile Performance Report. The report focuses on innovations in e-commerce, another massive growth market, and predicts a surge of innovation in real live human interactions designed to help brands sell more products.

    In this report, they write: 

    Why are calls more valuable than clicks? A call connects a merchant with a prospect in the flesh – or nearly in the flesh. There is no separation of miles and pages of inscrutable content and Internet ether. 

    The customer is right there, on the line, asking questions. Old-fashioned salesmanship has an opportunity to blossom. You can take the caller by the hand, and lead them down the sales journey path to that magical destination called The Close.

    What’s more, calls made from mobile phones are exciting because the customer is typically not tethered to a PC or laptop at home or work. They’re very likely on the move. They could be near your store. Or in your store! Mobile calls are goldmines, representing prospects who are thinking about your business, possibly within physical range of making a purchase and, most importantly, are available for old-fashioned person-to-person communication, persuasion and deal-closing.

    The art of persuasion, in real-time, through live human interaction. A personal connection, an emotional intelligence that has not yet been replicated in “bots” and may never be – because even as technology is evolving, so is the human mind and consciousness. (We who created the bots may always end up being slightly more clever than they are.)

    While marketers all understand and deploy “click-to-call” from mobile apps, there has been little data available to quantify how these voice calls convert into revenue.

    Data for their study comes from 24 million aggregated and anonymous phone calls analyzed over the last 18 months by Marchex Call DNA, part of the Marchex Call Analytics platform. (Marchex Call DNA scores and classifies phone calls automatically to provide marketers data on consumer intent, sales and audiences.)

    Citing a BIA/Kelsey Advisory forecast (2015), the report notes that Americans spent more than $1 trillion in click-to-call commerce. In 2016, there were approximately 93 billion consumer-to-business phone calls from smartphones, a figure expected to grow to 162 billion by 2019.

    Those click-to-call sessions work in a variety of ways, but what brands are finding is that when the applications understand, based on analytics, when to route an inbound voice call to an expert, consumers are thrilled.

    A simple example is Watson’s handling of a voice call. Watson is able to respond immediately with accurate information (“your next available flight leaves at 3 PM”), and able to respond intuitively by routing that click-to-call to a live expert (“I’m connecting you to an agent who can provide more options”) if the bot discussion doesn’t go well. 

    This is not a new form of IVR. It is a new form of dispatching considerate, intelligent service at less cost to the business, and with a higher level of customer satisfaction. 

    By the time that live agent speaks with the other human, they have all the information in front of them, along with “coaching tips” based on the tone of voice and sensed level of frustration of that harried traveler who just wants to figure out how to get home in time for his kid’s lacrosse match.
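    For developers who want to picture the hand-off described above, here is a minimal sketch in Python. It is not Watson's API or anyone's production logic: the names (Turn, Handoff, route_call), the 0-to-1 confidence and frustration scores, and both thresholds are assumptions made up for illustration. The idea is simply that the bot keeps the call while it is answering well, and passes the transcript plus coaching tips to a live agent when it is not.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical thresholds, chosen only for illustration.
FRUSTRATION_LIMIT = 0.7  # assumed cutoff on a 0-1 sensed-frustration score
CONFIDENCE_FLOOR = 0.6   # assumed minimum confidence for a usable bot answer


@dataclass
class Turn:
    """One exchange between the caller and the bot (hypothetical shape)."""
    caller_said: str
    bot_answer: Optional[str]  # None when the bot has no answer
    confidence: float          # bot's confidence in its answer, 0-1
    frustration: float         # sensed caller frustration, 0-1


@dataclass
class Handoff:
    """What the live agent sees when the call is escalated."""
    transcript: List[str]
    coaching_tips: List[str] = field(default_factory=list)


def route_call(turns: List[Turn]) -> Optional[Handoff]:
    """Return None while the bot should keep the call, else a Handoff."""
    if not turns:
        return None
    last = turns[-1]
    bot_failed = last.bot_answer is None or last.confidence < CONFIDENCE_FLOOR
    caller_upset = last.frustration >= FRUSTRATION_LIMIT
    if not (bot_failed or caller_upset):
        return None
    tips = []
    if caller_upset:
        tips.append("Caller sounds frustrated: acknowledge that first.")
    if bot_failed:
        tips.append("Bot could not answer the last question: start there.")
    return Handoff(transcript=[t.caller_said for t in turns], coaching_tips=tips)


if __name__ == "__main__":
    call = [Turn("When is the next flight?",
                 "Your next available flight leaves at 3 PM", 0.9, 0.2)]
    print(route_call(call))  # bot keeps the call
    call.append(Turn("That is too late, what else is there?", None, 0.0, 0.8))
    print(route_call(call))  # escalated with transcript and tips
```

    The design choice worth noticing is that the agent receives the whole conversation so far, not just a transfer, so the caller never has to repeat themselves.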

    With some click-to-call voice applications, small businesses can skip the IVR lines altogether and simply connect to a prospect for a friendly conversation, with that entrepreneur’s “art of persuasion” kicking in and closing business in minutes with a new customer, without that customer ever having had to dial a phone number.

    We cannot recommend this report strongly enough, as it provides real-life data showing that the real live human voice, as a part of our existence and of e-commerce, may in fact turn out to be the “killer app” even as billions of dollars are being poured into “the bots.” We’re not opposed to bots – in fact, we have developed some incredible software enabling bots to advance contact centers and more.

    But we’re also not opposed to the ongoing potential of the human voice and the humans who connect with each other to move through life conveniently – and connectedly.