A version of this essay was originally published at Tech.pinions, a website dedicated to informed opinions, insight and perspective on the tech industry.
The idea of talking conversationally to computers has been a long time in the works. Science fiction is so often a self-fulfilling prophecy, as it provides a vision for humans to chase after with technological innovation. Those of us who have watched voice-based computer interactions evolve have seen them go through many manifestations as they grew up. We now find ourselves in a world where using voice to interface with a computer is commonplace for the masses. While I’m not quite confident that we have reached an inflection point, I am confident we are at least on the cusp of one, both for voice-based user interfaces and for the vision of HAL 9000 (the AI assistant of Arthur C. Clarke’s "Space Odyssey" series) and Jarvis (the voice-based AI assistant of "Iron Man").
In anticipation of this and the many other "voice-first" products and experiences we believe will come to market in 2016-2017, we set out to do a quantitative study of Amazon’s Echo, Apple’s Siri and Google’s OK Google. We conducted two separate studies in early May, since our intuition told us that voice would be a major theme at Google I/O and at Apple’s upcoming WWDC.
We focused the Amazon Echo study on our early-adopter panel, since we knew we would not get a statistically significant number of Echo owners in our mainstream representative U.S. consumer panel. We collaborated on the Amazon Echo study with my friend Aaron Suplizio from Experian. Experian is also studying how the Echo is being used, specifically in the context of conversational commerce. (Experian didn’t pay us to do the study, but did cover the costs for the raffle, in which two respondents won a $100 gift card.)
The second study focused on our mainstream consumer panel, to learn how they use Siri and OK Google (or any Google voice-based search technique) and how they perceive each. I’ll start by sharing what we learned about the Amazon Echo.
Amazon Echo and the voice-first user interface
Across our panel of 1,300 early adopters, we found that 13.86 percent owned an Amazon Echo. It came as no surprise to us that the overwhelming majority of Echo owners also owned an iPhone (83.72 percent), as iPhone owners at large tend to show more early-adopter tendencies than Android owners. What was most enlightening, compared with the Siri and Google voice study, was how different usage of the Echo was from usage of Siri and OK Google, both in where it is used and in the most common tasks.
We first wanted to understand where the Echo is used in most consumers’ homes (we had a hunch it was either the kitchen or the living room). The kitchen has the edge over the living room, with 51 percent of consumers saying they have their Echo in the kitchen.
Knowing the primary usage room is important, both because of the type of things the Echo does and because of Amazon’s likely goals in delivering services to consumers through it. What we ask of our voice assistants is likely to vary with the room or physical location we are in. For example, asking the Echo to turn the TV on is less relevant as a primary task unless the Echo is in the living room. We can certainly make the case that voice assistants will someday be available at all times in all rooms.
We followed this question by asking respondents to choose the two things they do most often with their Echo. The three most common use cases were playing a song (34 percent), controlling smart lights (30 percent) and setting a timer (24 percent).
A few quick thoughts on Echo usage
Playing a song as the top use case is not surprising, given that the product is positioned as a smart speaker. Bluetooth speakers have actually sold well at retail. The idea of having portable sound around the house is compelling for consumers. It also makes sense as the entry point for a smart voice assistant, given the need for a speaker, microphone and accompanying components for microphone arrays and noise-canceling tech for better speech recognition.
Controlling the lights is, in my opinion, a solid indicator that voice-controlled smart-home technologies will someday become commonplace. As our homes get smarter, it makes sense that the way we will interact with our smart objects is through voice. It may be the catalyst that brings the true smart automated home to the masses.
Most Echo owners were satisfied with the overall product, but satisfaction ranked highest when we asked specifically about the voice-recognition capabilities of the device. Owners felt that it delivered on recognizing what they were saying, and on performing the task they asked of it. This has a lot to do with the Echo’s microphone tech and noise-canceling capabilities, as well as its connection to persistently good broadband. That connection is often where Siri and OK Google break down, when people try to use them while driving or in areas with poor-quality mobile broadband service.
Only 13 percent of Echo owners said their usage had declined since they acquired it. The top reason given by those using it less was "the novelty of using my voice is wearing off."
Understanding how Siri and OK Google are used
Perhaps the most important observation we came away with from our study was that Siri is the most used voice-based user interface. In our mainstream panel of 518 consumers (44 percent iPhone owners, 4 percent Android owners, 2 percent Windows Phone or BlackBerry, 13 percent don’t own a smartphone), 65 percent indicated that they had used either Siri, Google’s "OK Google or voice search," or Microsoft’s Cortana. Of the three, Siri had the fewest holdouts: Only 21 percent had never used it, compared with 34.8 percent who had never used Google’s voice solution and 72 percent who had never used Microsoft’s Cortana. More consumers across the spectrum of operating systems (iOS, Android and Windows) have used Siri than any other voice UI. I credit the success of Apple’s iPad for this result, since many Android phone owners, non-smartphone owners and Windows PC owners have iPads, as well.
Looking at how they used each voice UI, we see for the most part that people use Siri and OK Google/Voice search in the same ways on their smartphones. Contrasting these common usages with those of the Echo, we see distinct differences between a voice user interface on a personal computing and communications device like a smartphone, PC or tablet, and one on a device that is stationary in the home and positioned as a smart hub.
Search is the most common task done on smartphones or tablets using Siri or OK Google/Google Voice. Google announced at Google I/O that 20 percent of all Google search queries are now done by voice. Looking at the data, we can conclude that more voice search queries are done with Siri than with Google’s voice-based search. When I look at these most common tasks, they strike me as fairly basic, which is an important observation to understand, given where the market is today. These tasks may be the most common simply because the products are still somewhat limited in their capabilities, but it could also be because they are the ones that work the best and most consistently.
Overall satisfaction with the voice recognition of Siri and OK Google/Google Voice search was relatively similar, and only slightly different from the grades iPhone owners gave Siri and Android owners gave OK Google/Google Voice search. Both were also below 80 percent, which is not bad for where these technologies are today. The Echo’s voice-recognition capabilities yielded higher satisfaction rates than both Siri and OK Google/Google Voice search, but I attribute that to the technological advantages of being stationary, having better noise cancellation and having a persistent high-bandwidth connection to the internet, all variables that affect the experience of voice-enabled user interfaces.
Finally, where voice-based user interfaces are used is another important factor to understand. For those who use Siri or OK Google/Google voice search most regularly, the primary location is the car, with 51 percent of consumers saying that this is their primary location to use voice-enabled actions. The home was second, with 39 percent. From a cultural perspective, it should come as no surprise that both of these locations offer an element of privacy, which is why only 6 percent of respondents said they commonly use Siri or OK Google/Google Voice in public.
I walked away from this study with confidence that the voice-user interface has gone mainstream. What’s more, mainstream consumers seem to recognize its value and convenience. Consider these statements from consumers:
- It does not always work, but when it does it is very useful: 55 percent strongly agree
- I would use my device’s voice capabilities more if I could speak to it more naturally: 43 percent strongly agree
- If it worked more often, I would use my device’s voice assistant more: 48 percent strongly agree
- I want my device’s voice interface to integrate better with more devices and apps that I use regularly: 66 percent strongly agree
- I am not comfortable speaking to my technology: 41 percent strongly disagree
It is encouraging, from a sentiment perspective, that voice looks to be a natural extension of our keyboard/mouse/touch-based input and output methods. Consumers seem to recognize its value, and want it to work in more ways. I’ve long said that the true test of a great feature very early in its life cycle is when it combines both delight and frustration. Once you use it, you’re hooked, but you want it to be great all the time, because you can see the potential. This is why we snuck this question into the sentiment segment to see if consumers agreed: 47 percent strongly agree and 38 percent somewhat agree that when their voice assistant works, it's great, and when it doesn’t, they get irritated.
The battle for the voice-based assistant is on. This is another area where the company with the biggest ecosystem built around its voice UI/voice OS has the best shot of being "hired" by the masses.
Ben Bajarin is a principal analyst at Creative Strategies Inc., an industry analysis, market intelligence and research firm located in Silicon Valley. His primary focus is consumer technology and market trend research. He is a husband, father, gadget enthusiast, trend spotter, early adopter and hobby farmer. Reach him @BenBajarin.
This article originally appeared on Recode.net.