The Future with Voice Experiences

Mohita Jaiswal

Lisa: Make me a haircut appointment on Tuesday morning between 10 am and 12 pm.

Voice Assistant: No problem, I’ll make an appointment and update you soon.

The Assistant makes a call to the hair salon.

A: Hi, I am calling to book a woman’s haircut for a client. I’m looking for something on May 3rd.

Salon Receptionist: Sure, give me a second. What time are you looking around?

A: 12 pm.

Salon Receptionist: We don’t have a 12 pm available, the closest we have is 1:15.

A: Do you have anything between 10 am and 12 pm?

Salon Receptionist: Depending on the service she would like. What service would she like?

A: Just a haircut for now.

Salon Receptionist: OK, we have a 10 am.

A: 10 am is fine…

Salon Receptionist: Okay, what’s her first name?

A: Lisa.

Salon Receptionist: OK, perfect, I will see Lisa at 10 am on May 3rd.

A: OK, great, thanks.

Salon Receptionist: Great, have a great day. Bye.

This was a live demonstration of Google Duplex, an AI system for accomplishing real-world tasks over the phone. It caused a stir as Google CEO Sundar Pichai showed how a fully automated assistant could make calls on your behalf and understand the nuances of a conversation.


We are living in times where we are increasingly surrounded by voice assistants, voice ads, voice notifications, voice search and voice messaging, to name a few. But we know that the time for voice has come when we see toddlers and teenagers picking up our phones and saying, ‘OK Google, what is Lego, really?’

  • According to Gartner, the global market for smart speakers is expected to grow to $2 billion by 2020. 
  • 50% of all searches will be voice searches by 2020 (source: comScore).
  • Advances in machine learning mean that voice recognition is becoming more accurate, and more mainstream.
  • 13% of all households in the United States owned a smart speaker in 2017. That number is predicted to rise to 55% by 2022 (source: OC&C Strategy Consultants).

With voice technology becoming mainstream, the overall goal is to leverage the communication system that users learned first and know best: conversation. Compared with touch, which requires greater concentration, speech can be a more accessible and delightful way to interact with a small screen.

If you work on user experience (UX), the opening up of voice is going to be a departure from what you design for users on apps and web portals, and from how you design it. When designing for voice, personas can take on a different meaning, one that copywriters or audio producers might be more likely to recognise.

Where do positive and clear voice experiences start?

Imagine listening to a podcast. Since you can’t see the person speaking, you paint a picture of the character in your mind based on the voice and sounds you’re hearing and the words the character is saying. The amazing part is that a persona (a combination of voice, sounds and words) can change your user experience from an interaction to a relationship.

A substantial body of research shows that users cannot help but attribute personality traits and social information, such as stature, to a voice, even if they encounter it only as a brief recorded sample.

Hence, to create a strong, consistent and engaging user experience, voice personas can help guide users to appropriate, helpful interactions, increase engagement and conversion, and shape the overall way users feel about your brand or product.


Defining the language and tone of your voice experiences:

1. What is the emotional response you want to trigger with your voice user interface (VUI)?

If you’re designing a financial skill, you may want to go for a persona that generates trust or credibility, whereas if you’re designing something related to health and wellness, you might want to create a sense of empathy. Based on the motivation behind your product and the emotional response you wish to generate, you can determine the voice of your persona.

2. What is the character sketch of your persona?

Having an actual person in mind helps you build a consistent persona that users can relate to, regardless of whether you are working with voice actors or text-to-speech software. Typically, you could start by crafting the character’s values, personal details and backstory.

If you were providing a real-life education service, who would you ideally hire to talk to your customers? What’s their gender, age and background? How would they speak: formally or like friends? Would they dress casually or in business attire? Having a biographical sketch helps copywriters write a script that feels appropriate to your brand and your action.

3. What is the right voice for your persona?

Rate each voice against the key values decided above to guide your final voice selection. You can use text-to-speech (TTS) software, which lets you choose from many voices. Alternatively, you could record voice actors for a more human feel, which is especially useful if your brand already has a human ambassador. For example, Kids Court uses Amazon Polly software voices to make customers feel that the judge is impartial and objective. Using TTS also allows the makers to add fresh content on a weekly basis.
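
One way to make this rating step concrete is a simple weighted score. The sketch below is only an illustration: the persona values, their weights, the voice names and the ratings are all hypothetical, and a real selection would still rely on listening tests with your team and users.

```python
from dataclasses import dataclass

# Hypothetical persona values and their relative importance (weights sum to 1).
VALUE_WEIGHTS = {"trustworthy": 0.5, "warm": 0.3, "energetic": 0.2}


@dataclass
class VoiceCandidate:
    name: str     # label for a TTS voice or a voice actor
    ratings: dict  # reviewer rating per value, from 1 (poor) to 5 (strong)


def weighted_score(candidate, weights):
    """Return the weighted sum of ratings; a value with no rating counts as 0."""
    return sum(
        weight * candidate.ratings.get(value, 0)
        for value, weight in weights.items()
    )


def rank_voices(candidates, weights):
    """Sort candidates from best to worst fit for the persona values."""
    return sorted(
        candidates,
        key=lambda candidate: weighted_score(candidate, weights),
        reverse=True,
    )


# Illustrative ratings only, not real measurements.
candidates = [
    VoiceCandidate("Voice A", {"trustworthy": 5, "warm": 3, "energetic": 2}),
    VoiceCandidate("Voice B", {"trustworthy": 3, "warm": 4, "energetic": 4}),
]

for candidate in rank_voices(candidates, VALUE_WEIGHTS):
    print(candidate.name, round(weighted_score(candidate, VALUE_WEIGHTS), 2))
```

Because the weights encode which values matter most, changing them (for instance, favouring warmth for a wellness product) can change which voice comes out on top.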

4. What will your persona speak?

Would your persona say sentences like, “Hello madam, how are you doing this evening?” or “Hey buddy, what’s up?” It is important to think of a few key phrases that are likely to be a part of your VUI and how your persona would say them. Staying consistent with the character while writing a variety of responses for every use case helps create a more engaging and real experience for users.

5. What do the dialogues sound like and convey to your audience?

Some sentences that read wonderfully on paper don’t sound quite right when spoken out loud, and some words don’t work well in TTS. If you’re using a text-to-speech voice, play the output back and adjust it with Speech Synthesis Markup Language (SSML) until it sounds right. Short sentences, breaks, the right tone and pauses help create an experience that sounds more natural.

See how these examples sound like different restaurants, although the meaning is essentially the same?
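
As a hypothetical illustration of such a contrast, the sketch below builds SSML for two invented restaurant personas that confirm the same booking. The helper `to_ssml`, the sample wording, and the rate and pause values are assumptions made for this example. `<speak>`, `<s>`, `<break>` and `<prosody>` are standard SSML elements, though support for them varies between TTS engines.

```python
from xml.sax.saxutils import escape


def to_ssml(sentences, rate="medium", pause_ms=300):
    """Wrap short sentences in SSML, with a pause between consecutive sentences."""
    pause = f'<break time="{pause_ms}ms"/>'
    # escape() protects characters such as & and < inside the spoken text.
    body = pause.join(f"<s>{escape(sentence)}</s>" for sentence in sentences)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'


# Two hypothetical personas delivering essentially the same message.
fine_dining = to_ssml(
    ["Good evening.", "Your table for two is confirmed for eight o'clock."],
    rate="slow",
    pause_ms=500,
)
diner = to_ssml(
    ["Hey there!", "You're all set: table for two at eight."],
    rate="fast",
    pause_ms=200,
)

print(fine_dining)
print(diner)
```

The wording carries most of the persona, while the speaking rate and pause length are the kind of details you would tune by ear after playing the output back.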

Craft powerful Voice User Journeys

Don’t ask a question if you won’t be able to understand the answer. – Margaret Urban, Interaction Designer at Google

With a voice user interface, the idea is to keep the communication between user and voice assistant simple and straightforward. The experience should not overwhelm the user with too much information, but it should also not give so little that the conversation turns into an endless series of questions and answers. Keep the following in mind while creating voice user journeys:

1. Predict Scenarios

To achieve a seamless user interaction, create a list of user journeys and scenarios which detail the expectations of users. For instance: “As a user, I want to be able to place a quick order for food without having to pick up the phone or enter my card details.” With a list of scenarios in place you can create different dialogue flows for different use cases.

2. Enable Prompting

Users may sometimes say things that are not part of the expected dialogue flow, so when creating a flow we need to include prompts that guide the user back in the right direction, for example: ‘Do you want to complete the order before we move to the next question?’ Ending with a single question is usually an effective way of getting the ball rolling without overwhelming the user.
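
A minimal sketch of this kind of prompting, assuming a hypothetical two-slot food order (dish and size) and naive keyword matching in place of real speech understanding. The menu, the prompts and the function names are invented for illustration.

```python
import re

# Hypothetical slots for a food order, each with one guiding question.
PROMPTS = {
    "dish": "What would you like to order?",
    "size": "Would you like a small, medium or large?",
}
KNOWN_WORDS = {
    "dish": {"pizza", "pasta", "salad"},
    "size": {"small", "medium", "large"},
}


def extract_slots(utterance):
    """Pick out any known dish or size words from what the user said."""
    words = set(re.findall(r"[a-z]+", utterance.lower()))
    found = {}
    for slot, vocabulary in KNOWN_WORDS.items():
        matches = words & vocabulary
        if matches:
            found[slot] = sorted(matches)[0]
    return found


def respond(state, utterance):
    """Update the order state and reply with at most one question."""
    found = extract_slots(utterance)
    state.update(found)
    missing = [slot for slot in PROMPTS if slot not in state]
    if not missing:
        return f"Got it, one {state['size']} {state['dish']}. Shall I place the order?"
    question = PROMPTS[missing[0]]
    if not found:
        # Off-script input: steer the user back with the pending question.
        return "Sorry, I didn't catch that. " + question
    return question


order = {}
print(respond(order, "What's the weather like?"))
print(respond(order, "A pizza, please"))
print(respond(order, "Large"))
```

Each reply ends with exactly one question, so an off-script remark leads back to the pending step instead of derailing the order.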

3. Support through Nudging

Various voice devices provide support by offering a ready reckoner of example phrases and questions that users can pose to achieve a certain outcome. This helps users learn what they can ask, acting like a short tutorial on how a particular voice service works. Most voice-based interfaces also listen continuously for certain ‘activation phrases’ which, once detected, prompt the device to carry out the user’s instructions.

4. Test for usability

By setting a series of tasks for users to complete, you can start to discover areas that users find unclear, misleading or simply irrelevant. When conducting a user test for a voice user interface skill or app, recruiting participants with different levels of VUI experience makes the feedback more insightful, particularly where learnability is a concern. Ask: Will the product be easy to use for people who are new to voice technology? How quickly will new users adjust to the interface?

Research by the makers of Amazon Alexa found that if interacting with the tool was too robotic or repetitive, users became annoyed. On the other hand, if users felt like they were talking to another human, their experience was more positive. It was therefore essential to adjust Alexa’s commands until they sounded natural and flexible and helped prevent errors.


Selling through Voice

Consumers all over the world are adding voice to their shopping cycle, using voice agents and interfaces such as voice-enabled virtual assistants on their smartphones, voice search and smart speakers. With voice technology opening up new avenues for interaction between brands and customers, marketing professionals need to find innovative ways to respond to these newly opened channels.

What can a forward-looking marketer do to attract their audience through the power of voice?

1. What is your brand’s voice strategy?

New voice technologies open up a lot of opportunity for deeper, more conversational experiences with customers. A voice strategy starts with basic questions: which voice should answer customer queries, what your brand’s voice persona should be, whether to adopt a male or female voice, which voice channels to target, how formal the conversation should be, and so on. Brands need to find new ways to create an exceptional voice experience and establish themselves as market leaders by investing in technology that delivers real value.

2. How can you optimize voice search?

The rise in demand for voice search means that marketers need to place a higher priority on long-tail keywords and focus on natural, spoken language and a conversational approach. They now have to think about voice-activated content as search engines build algorithms around voice keywords. Today we already see text captions and ads overlaid on YouTube videos. In a similar manner, the future of marketing could see audio streams accompanied by text ads on a companion display, chosen by recognizing the subject of the audio being played.

3. What are newly opened channels for advertising?

Advertising on voice-assisted platforms like Alexa is slowly catching on in the advertising world. The difference between a voice ad and a radio ad is that listeners still can’t respond to a radio ad, whereas a smart speaker user can interact directly with the device. Voice message marketing could also turn out to be an interesting channel for marketing professionals to explore alongside social media, SMS, direct marketing and so on, and it could enable better user engagement and retention than text messages. Voice APIs on voice gadgets could proactively indicate that there is new content from core features, such as housing or shopping updates.

Industries that already use digital audio in their day-to-day operations could be the earliest to gain in this scenario. Telecommunications, music, radio, film, games, retail, food, education, transportation, agriculture, travel, banking, healthcare, and home, office and industrial automation could all be early adopters of this new paradigm of voice marketing, once they partner with the search engine industry.


A voice-first Future

It is quite possible that hybrid voice-based systems will emerge that handle search or other text input as well as voice content and voice ads, adapting to the user’s mode of attention and mood at a given point in time.

Imagine a future where the consumer experience involves less screen interaction, less use of search boxes and a much stronger voice experience. People with reading disabilities, people with impaired sight or mobility, and the elderly will be able to set up, monitor and carry out tasks with greater convenience. User convenience would be much higher as voice commands come in handy while driving or cooking, making tasks easier to complete and doing away with redundant tasks of personal and professional management.

As organizations embed voice deeper into their strategy, everyone will be assisted with a voice… assisted towards better decision making, doing and being.

Mohita Jaiswal

Research, Strategy and Content consultant. With a master's from IIT Delhi, Mohita has diverse experience across domains of technical research, big data, leadership development and arts in education. Having a keen interest in the science of human behavior, she looks at enabling holistic learning experiences, working at the intersection of technology, design, and human psychology.

