Inside VOICE: Designing the Westworld Alexa Skill and Voice Casting for Brands


Have you played the award-winning Alexa skill Westworld: The Maze?

In this skill, Alexa immerses you in the fantasy playground of Sweetwater, where your mission is to find the center of the maze. You can converse with different characters, hear the trains steaming through town, catch the notes of a piano playing nearby, and strategically choose your next move (to avoid an unexpectedly violent death).

The story-based skill has been described as somewhere between a "choose your own adventure" audiobook and a hidden episode of the show, with voice actors from Westworld adding an extra sprinkle of wonder. For her part, computer conversation designer Kat Zdan describes the wide-scale skill as "inspiring and intimidating".

Kat Zdan is one of the masterminds at Xandra, the conversation design studio that partnered with HBO to create the Westworld Alexa skill. In January 2019, Zdan attended the Alexa conference in Chattanooga, Tennessee, where she caught up with James Poulter from the Inside VOICE podcast.

You can listen to the full 19-minute episode here. 

If you don't have time to listen to it now, here's a brief overview to whet your appetite. 

Working on Westworld

James Poulter (with all his delightful "British-isms") jumped into the conversation with Kat Zdan by asking what it was like to create the Westworld skill.

Zdan explained that it was infinitely inspiring due to the brilliance of the source material but equally challenging because of the content-hungry fandom behind Westworld. Not to mention the intimidating task of sifting through extensive narratives, meta-themes, and the world outside the show.

"It's mind-boggling," said Zdan. "But it's inspiring to think 'I need to make something that is worthy of this property'."

On her experience with voice

Before stepping into the voice-first space, Zdan was an actor and voice actor with over a decade of experience and zero intention of moving into a different industry. 

Once she stepped into the voice-first industry, however, she realized it wasn't all that different from her original line of work. In theater, her area of expertise was developing and refining personas, characters, and narratives, all of which resurfaced when creating games and interactive experiences.

"I was shocked at how much of my background in the theatre was 1:1 comparable to what I do now."

But she did face one major learning curve: the economy of language, or turning six sentences into just one to give the user exactly the right amount of information.

"When you're doing a voice experience, you need to get the user all the information they need and no more. You don't want to overload them, overwhelm them, or bore them."

The importance of using original voices 

From a brand perspective, you likely don't want Alexa as the voice of your brand. Zdan insists that using an original voice brings significant value to the user's experience with a brand. Human voices give room for the use of accents, different cadences and, most importantly, personality.

Text-to-speech fonts may offer different word choices to denote character, but they can't put expression or "soul" behind those words.

"It becomes an extension of your brand," Zdan says. "It's a familiarity users can connect with and gives listeners the opportunity to get to know you."

The downside, however, is that once you've recorded a human voice for your content and decide to update that content, you'll have to get the human back in the booth.

How to cast the right voice

The first question on Zdan's list is who the users are, followed by what the brand's intention is and how it is trying to connect with them.

"You can go in different directions," she says. "Sometimes you want to cast a voice that sounds like your target demographic or a voice that you think your target demographic may enjoy."

Zdan, for example, has a British male as the voice of her Siri because she finds his voice soothing and enjoyable. (She then told James Poulter to set a timer for ten minutes.)

She has also encountered brands, however, that cast a voice from a library of samples simply on the basis of "they like that one best".

The key to designing interactive stories

The choose-your-own-adventure format for storytelling has been around for decades, but is just now getting the attention it deserves in the mainstream media. For many, the eye-opening moment happened with Netflix's Bandersnatch episode. But long-time gamers are all too familiar with this format and are also painfully aware of its pitfalls.

Zdan explains that a bad choose-your-own-adventure game gives the user a "pseudo choice" where neither option really matters and the user is led down a predetermined path. A well-written one, by contrast, gives the user the opportunity to "make moral decisions and truly shape the narrative," where each choice is "meaningful and each action has consequences".

She warns that simply adding branches to the narrative will only take your story so far, since the novelty of making in-game decisions wears off pretty quickly. Zdan recommends making your user feel truly in control for an interactive story worth playing.

Equality in the voice-first industry

From Poulter's point of view, the voice-first space is seemingly "more diverse and gender-balanced," at least in comparison with other areas of tech. He then asks Zdan for an insider's perspective on the matter. 

She responds candidly that since she's relatively new in tech, she's not entirely sure. What she can say for certain is that she has worked with incredibly talented women in this industry and there is no shortage of inspiration for other women hoping to step into the voice-first world.

The evolution of voice

The energetic conversation is drawn to a close with a final question on what Zdan is most enthusiastic about regarding the evolution of voice in the months leading up to the VOICE Summit.

"I'm excited to see more custom voice over, more refinement in the kind of experiences we're having, deeper thoughts and more human-centered design in the applications." 

She adds that she'd love for companies to go beyond the novelty of voice-driven conversations and focus more on the quality with which they're delivering them.  

Finally, Zdan places her bets on Xandra being the company to break out the "bizarro, out-of-the-box" voice experiences in the near future, concluding her answer with, "I'm excited for things to start to get weird."

Meet Kat Zdan 

You can catch more of the loquacious Kat Zdan at the VOICE Summit, hosted once again at NJIT in Newark, New Jersey this July. Save your seat using the discount code FIRST500 to hear Zdan and 500+ speakers talk about the inner workings of voice-first experiences and what's in store this year and beyond.

Remember, you can listen to the entire podcast here and follow VOICE on Twitter to stay in the loop for more.

Written by Jenny Medeiros

Jenny is an engineer turned tech writer with hands-on experience in VR, AR, video game development, and UX-focused web design. Nowadays, she partners with tech companies to help explain emerging technologies simply. When she's not writing, she's likely daydreaming and forgetting her tea.
