Google’s Android dominates the smartphone market overall, but Apple has attracted a disproportionate share of high-end users — and consequently an outsize share of smartphone profits.
At a Tuesday event, Google unveiled a two-pronged strategy to change that. Part one was the Pixel, the first smartphone that will be designed and manufactured by Google. Google is betting that building its own phone will allow it to offer the same kind of seamless user experience Apple provides its own users.
But the second prong of Google’s strategy is more original and received more attention on Tuesday. The company wants to make voice-based artificial intelligence a much bigger part of how people interact with their smartphones. Google envisions a future where you’ll make restaurant reservations, look up photos, and play music by talking to your phone instead of tapping and swiping on its screen.
This isn’t a totally new idea, of course: all the major smartphone platforms have had voice-based personal assistants — Apple’s Siri, Microsoft’s Cortana, Google’s own Google Now — for several years. But Google says it’s about to make this technology a lot better — so much better that people will use it a lot more.
If anyone can pull this off, it’s Google. Making AI really good requires a lot of data to “train” sophisticated machine learning algorithms. Wrangling large amounts of data has always been Google’s specialty. But even if the company can build a voice-based AI that can really understand a wide variety of requests, I’m still skeptical it will change the smartphone game as much as Google hopes.
Apple’s knack for design is a big advantage in the smartphone market
Apple has a design-focused culture and is known for its polished, elegant, user-friendly interfaces. By contrast, Google has a culture that’s focused on delivering fast and reliable online services.
Google’s business model for Android puts it at a further disadvantage in the user interface department. Apple designs both the hardware and software for the iPhone, allowing it to guarantee users a seamless experience. Google, by contrast, licenses Android as open source software to dozens of smartphone manufacturers, many of which customize it themselves, leading to a cacophony of different and often mediocre user interfaces.
The iPhone’s greater polish is a big reason it’s disproportionately popular at the high end of the market, and why Apple is able to charge a healthy premium for it. Android has the biggest market share in the smartphone business overall, but Google earns much less profit from smartphones than Apple does.
Many iPhone users nevertheless enjoy Google services like search, maps, and Gmail. But the fact that the iPhone’s operating system sits between Google and many of its users gives Apple a lot of leverage. In 2014, Google paid Apple $1 billion to maintain its status as the default search engine on the iPhone.
Google is trying to build a truly useful voice-based assistant
The executives onstage never said so explicitly, but several of Google’s announcements on Tuesday were clearly aimed at knocking Apple from its high-end smartphone throne. Most obviously, the Pixel is Google’s most direct challenge yet to the iPhone.
Google’s earlier Nexus phones were designed and manufactured by third parties; by contrast, Google plans to bring most of this work in house for the Pixel. The hope is that by owning the whole “stack” — software, hardware, and online services — Google will be able to match the seamless user experience Apple has long offered its users.
But Google doesn’t just want to ape the iPhone; it wants a way to differentiate its products from the iPhone. Google believes its real secret weapon is a new user interface based on voice recognition and artificial intelligence.
In a sense, this is just a souped-up version of Google’s existing voice recognition feature, Google Now. Apple and Microsoft have their own competing versions, Siri and Cortana. And these products don’t seem to have had a big impact on the market.
But Google believes that’s just because the technology is not good enough yet. Google has been working to improve its voice feature in three directions:
- Google is planning to improve the core voice recognition capability itself, making speech recognition more accurate and expanding the complexity of requests it can understand.
- Google has been working on other AI-based technologies — like advanced image recognition — to expand the range of requests Google’s personal assistant can handle.
- Google is expanding its personal assistant to more devices, particularly Google Home, a new wifi-connected speaker that competes with the Amazon Echo.
An example can help to illustrate what Google has in mind here. Right now, if you want to look at photos from a vacation you took last summer, you’d open your photo app and scroll back until you find the date you want. Google envisions a totally different approach. You’d say, “Okay, Google, show me pictures from my vacation last July.” Android would understand the request and call up the photos.
Google’s Photos app demonstrates that Google is already well on its way to developing the image recognition technology to make this work. You can already ask Google Photos to show you photos that contain snow, or a dog, or a particular friend. Google hopes to bring these capabilities — and more — to its personal assistant, so you can ask complex queries like, “Show me pictures from 2014 that have Aunt Lisa and dogs in them.”
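To make the “Aunt Lisa and dogs” request concrete: a system like this has to translate free-form speech into a structured query over photo metadata. Here is a purely illustrative toy sketch in Python. The names and keyword lists are hypothetical stand-ins, not Google’s actual pipeline, which relies on machine-learned speech and image recognition rather than keyword matching:

```python
# Purely illustrative: map a spoken request to structured photo-search filters.
import re
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PhotoQuery:
    year: Optional[int] = None                        # e.g. 2014
    people: List[str] = field(default_factory=list)   # recognized faces
    objects: List[str] = field(default_factory=list)  # recognized objects

# Stand-ins for what would really be ML-driven face and object recognizers.
KNOWN_PEOPLE = {"aunt lisa"}
KNOWN_OBJECTS = {"dogs", "snow"}

def parse(query: str) -> PhotoQuery:
    """Extract a year, known people, and known objects from a query string."""
    q = query.lower()
    result = PhotoQuery()
    match = re.search(r"\b(19|20)\d{2}\b", q)  # find a four-digit year
    if match:
        result.year = int(match.group())
    result.people = [p for p in KNOWN_PEOPLE if p in q]
    result.objects = [o for o in KNOWN_OBJECTS if o in q]
    return result

print(parse("Show me pictures from 2014 that have Aunt Lisa and dogs in them"))
# → PhotoQuery(year=2014, people=['aunt lisa'], objects=['dogs'])
```

The hard part, of course, is not this last translation step but everything before it: recognizing the speech in the first place, and having already tagged every photo with the people and objects it contains.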
It’s easy to be skeptical of this kind of thing, since the existing smartphone “personal assistant” technologies aren’t very good. They get confused often enough that it’s usually easier to just do things the old-fashioned way. But artificial intelligence technology is advancing rapidly, and Google insists that it will soon be good enough that voice-based personal assistants will “just work.”
If voice-based search becomes capable enough, it could reach a tipping point where it’s easier to just ask the voice assistant for the information you need than to open up the appropriate app and find it the old-fashioned way.
This would be particularly good news for Google because it would play to the company’s strengths. It would essentially be putting search back at the center of the user experience.
Making this kind of smart voice assistant work will require oceans of information. Computer scientists have found that tasks like image and voice recognition work best when they have huge numbers of examples they can use to “train” the algorithms. A smart AI system also needs to know lots of facts about the world so that it can respond to complex queries. Collecting and organizing information has always been one of Google’s strengths — after all, Google’s mission statement is to organize the world’s information.
At the same time, it would devalue Apple’s big strength — the ability to make elegant, user-friendly devices. If the main way people interact with smartphones is by asking them questions, the particular device becomes less important — just as the advent of the web made the difference between PCs and Macs much less important.
Google is also hoping to continue the shift toward more and more user data stored online instead of on users’ local devices. Pixel buyers get unlimited, free storage for their photos and videos. That’s a good selling point for the Pixel, but more importantly, it will mean that Google can offer users access to their content from any device. Google envisions a future where users ask their Google Home smart speaker to put photos from their vacation on the TV. That kind of future would play to Google’s strength — managing massive quantities of data online.
Why AI might not be the secret weapon Google is hoping for
It’s easy to understand why Google would like to make voice-based search a central part of how people interact with their phones and other devices. But even if Google manages to build a sophisticated voice assistant that can respond to a wide variety of requests, I’m skeptical that will give Google a significant leg up in the smartphone wars.
The canned demos in Google’s presentation were impressive, of course — canned demos usually are. But the question to ask is how much of the time we spend interacting with our smartphones would be improved by a smart voice assistant.
One problem is that talking to your phone isn’t always convenient. There are many social settings — at the office, in line at the grocery store, on the bus — where people around you are likely to be annoyed if you’re constantly barking commands at your phone. In those situations, you’re going to want to discreetly type or scroll your way to the information you’re looking for, so people are still going to care how user-friendly the old-fashioned touch-based interface is.
But more importantly, a lot of the time people spend on their phones — perhaps most of it — just wouldn’t be improved by a personal voice assistant. People spend a ton of time scrolling through posts on Facebook, Twitter, or Instagram, reading text messages, swiping through Tinder profiles, and so forth.
Even with a task like finding photos, quickly scrolling through thumbnails will often be easier than trying to describe the image you’re looking for. If you’re hunting for an old photo, you often don’t remember exactly when it was taken or what was in it. It’s helpful to scroll back to roughly the right time, glance at a few photos at random, and use them to jog your memory about the context of the photo. This kind of browsing will probably always be faster on an old-fashioned touch-based interface than with voice-based queries.