You probably use voice technology every day without realizing it. Each time you dictate a text or tell Netflix to search for the latest rom-com, that’s voice tech in action. We have these AI-enabled voice communications with machines all the time, and soon voice-first is how most of us will interact with all our devices because it’s so much faster and easier. Tapping, typing, and swiping will be a thing of the past. “Over 70% of consumers say they prefer conducting voice searches to manually entering their queries because we speak three times faster than we can type on a keyboard and five times faster than tapping on a mobile device,” says Tobias Dengel, president of WillowTree, a TELUS International Company, and author of the brand-new book The Sound of the Future: The Coming Age of Voice Technology (October, Hachette Books).

Right now, you can give voice commands to assistants such as Amazon Alexa, Apple’s Siri, and Google Assistant on smart speakers and other home devices. Sephora partnered with Google Nest Hub in 2018 to allow users to search for and order beauty products and access Sephora’s YouTube tutorials, and Coty released a voice-activated Clairol Color Expert assistant feature on Google the same year. Shiseido launched a voice assistant via Amazon, and of course, you can currently tell Alexa to buy just about anything on Amazon Beauty. Voice-based shopping is changing how we buy products and interact with beauty brands, and these home platforms are only the first stage of what the technology can do.

So, how does voice tech work? “It’s a combination of three key processes,” says Dengel. “The first is ‘natural language processing,’ which transcribes what you say to the device, and then ‘natural language understanding,’ which changes spoken words into a command that the system can understand. Finally, the response mechanism replies to this command.” This call-and-response system is evolving from voice-only (Siri giving you a verbal answer) to a visual, actionable response that appears on your screen in real time. While voice-only works well for a GPS when you’re driving, it can be frustrating when you want something like a personalized list of recommendations. “We don’t want our device to tell us this information. We want a screen to show us the choices so we can then tell it what we want,” says Dengel. “This is the core multimodal experience that is going to explode. Conversational AI is the voice search piece, and generative AI analyzes that query and comes back with a visual selection of options based on ratings, reviews, and your personal preferences. These kinds of algorithms are going to proliferate throughout retail.”
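For readers who want to see the three steps Dengel describes laid out concretely, here is a minimal Python sketch. It is an illustration only, not how any brand's assistant is actually built: the catalog, the function names (`transcribe`, `understand`, `respond`, `handle_utterance`), and the keyword matching are all hypothetical, and the transcription step is a stand-in that takes already-typed text instead of real audio.

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    """A structured request the system can act on (hypothetical shape)."""
    intent: str
    slots: dict = field(default_factory=dict)


# Hypothetical toy catalog; the names and ratings are made up for illustration.
CATALOG = [
    {"name": "Velvet Matte Lipstick", "category": "lipstick", "rating": 4.7},
    {"name": "Sheer Gloss Lipstick", "category": "lipstick", "rating": 4.2},
    {"name": "Hydrating Foundation", "category": "foundation", "rating": 4.5},
]


def transcribe(audio: str) -> str:
    """Step 1 stand-in: a real system turns audio into text.
    Here the 'audio' is already text, so we only normalize it."""
    return audio.strip().lower()


def understand(text: str) -> Command:
    """Step 2: turn the transcribed words into a command.
    Simple keyword matching is an assumption; real systems use trained models."""
    categories = sorted({item["category"] for item in CATALOG})
    for category in categories:
        if category in text:
            return Command("search_products", {"category": category})
    return Command("unknown")


def respond(command: Command) -> list[str]:
    """Step 3: reply to the command with a list of options to show on screen,
    ordered from highest to lowest rating."""
    if command.intent != "search_products":
        return []
    matches = [i for i in CATALOG if i["category"] == command.slots["category"]]
    matches.sort(key=lambda i: i["rating"], reverse=True)
    return [f'{i["name"]} ({i["rating"]})' for i in matches]


def handle_utterance(audio: str) -> list[str]:
    """Run the three steps in order."""
    return respond(understand(transcribe(audio)))


if __name__ == "__main__":
    for line in handle_utterance("Show me a red LIPSTICK"):
        print(line)
```

The last step returns a list meant for display, not a spoken sentence, which mirrors the multimodal idea in the paragraph above: you speak the request, and the screen shows the choices.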

Because we’re all shopping on the go, soon enough we’ll bypass smart speakers completely in favor of mobile apps. According to the Pew Research Center, about a third of U.S. consumers use their smartphone to buy something each week. By 2025, m-commerce sales in the U.S. are expected to reach $710 billion, up from $360 billion in 2021, according to Statista. “In three or four years, most of the shopping experiences we have are going to be voice-first,” says Dengel. “Mobile apps like Spotify and Waze are already using voice-first commands, and in the last month, DoorDash launched a whole voice layer for ordering food on your mobile device.”

Voice tools make shopping faster, easier, and more customized, but they can also be game-changing for accessibility. Estée Lauder just launched the Voice-Enabled Makeup Assistant mobile app, which helps visually impaired users apply makeup more easily and confidently by scanning the face and offering audio feedback. “This is a perfect example of a multimodal voice experience that can theoretically solve a problem and provide accessibility and inclusivity,” Dengel says.

It takes time for the industry to figure out how to implement all of this new AI and voice tech and for consumers to get used to talking into their devices. However, tweens and teens are doing it now. “For my kids, it’s second nature to talk to devices like the TV or a smart speaker, or phone. They would never scroll or type,” says Dengel. “That’s where all of this is going. Eventually voice will be how we all interface with brands and retailers. It will be so much easier, especially with beauty companies that have lots of different products, to tell the app what you want and then see a selection of accurate recommendations without tapping and swiping. It also creates opportunities for even more personalized shopping experiences.”