No, not really. Voice will not replace your website. At least not any time soon. But by now, it should be obvious that we’re entering a new era of digital experiences—an era where voice assistants, chatbots, and other virtual, conversational assistants work in concert with traditional web and mobile applications.
One thing that hasn’t changed in the digital world is the importance of content. More than 20 years ago, Bill Gates declared “Content is King”. This reality brought about the concept of digital content management and all its variants—along with an industry full of tools, technologies, and approaches to support content management strategies. If voice is the next user interface, and conversational experiences the next digital channel, then it follows that content management strategies need to evolve to reflect the unique requirements of conversational, voice-powered user experiences.
In this post, we’ll explore some of the nuances of content management in voice-based applications. Additionally, we’ll offer tips on how to prepare your content management strategy for the voice-first digital world.
Content management for Voice
The problem with existing digital content is not just that it’s not designed for voice-first applications. The problem is that organizations lack the tools to develop, manage, migrate, curate, or otherwise prepare their existing content for these new conversational channels.
For most organizations working in voice, the state-of-the-art for voice content development involves a series of awkward and time-consuming steps to convert existing content to voice-ready form. These steps include a lot of copying, pasting, and editing by software engineers, not subject matter experts or content specialists.
But if we look at previous digital experience disruptors, this “teething phase” is to be expected. In the early days of web and mobile, tools to streamline content for these new digital channels didn’t exist. The result was a lot of hacking and a lot of poorly designed experiences that did not deliver. In time, content management systems evolved that allowed non-technical people to manage content and, eventually, the complete user experience of website visitors and mobile app users. The same will inevitably happen with Voice.
Best practices for content management in Voice
Much has been written about designing user experiences for voice applications, but less attention has been placed on preparing existing content resources for the many unique aspects of voice and conversational applications. Consider this healthcare example:
A standard assessment for rheumatoid arthritis asks the patient to select from a list of 50 symptoms that they may have recently experienced. In a web or mobile app, the approach to this question is to simply provide a list of symptoms with check boxes. Attempting to map this sort of question, as is, to a voice assistant doesn’t work. A voice assistant must mimic a human interaction. So, just as a nurse would phrase it, the virtual assistant might instead say, “Please tell me any symptoms you may have experienced since your last assessment”. A well-designed voice application can then apply natural language processing techniques to capture and encode the patient’s responses. (For example, “I’m feeling a little dizzy” = “mild vertigo”).
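The capture-and-encode step described above can be sketched in a few lines. The phrase-to-code table here is purely illustrative—a production system would use a trained NLP model and a clinical vocabulary rather than string matching:

```python
# Sketch: mapping free-text symptom phrases to coded symptoms.
# The phrase-to-code table is a hypothetical stand-in for a real
# NLP pipeline backed by a clinical terminology.

SYMPTOM_PHRASES = {
    "dizzy": "mild vertigo",
    "light-headed": "mild vertigo",
    "joints hurt": "joint pain",
    "stiff in the morning": "morning stiffness",
}

def encode_symptoms(utterance: str) -> list:
    """Return coded symptoms whose trigger phrases appear in the utterance."""
    text = utterance.lower()
    return [code for phrase, code in SYMPTOM_PHRASES.items() if phrase in text]

print(encode_symptoms("I'm feeling a little dizzy and my joints hurt"))
# → ['mild vertigo', 'joint pain']
```

The point is that the content model stores *coded answers*, not just the prose of the question—so the same assessment can drive a checkbox list on the web and an open-ended spoken prompt in voice.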
That said, here are six specific tips for preparing existing content for voice-first applications.
1. Consider answers, not just content
It’s not practical to have a voice assistant simply read two pages of text off your website—even if they contain valuable insights. Anyone exploring Voice in their organization will end up confronting the fundamental challenge of re-purposing existing digital content to the peculiar requirements of conversational applications. Tools for creating and maintaining voice-ready content derived from existing sources are just now entering the market.
2. Optimize for Automatic Speech Recognition (ASR)
Any voice application, whether an assessment survey, like the earlier example, or a simple question-answering (FAQ) application, must be able to accurately recognize a user’s spoken words to properly interpret their input. For example, a patient answering the question “how many hours did you sleep last night?” may answer “eight”. The ASR engine must know that, in this context, the patient is saying the number 8, not the word “ate” or any other homophone.
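One way to sketch this kind of context-sensitive disambiguation is to normalize the transcript based on the prompt’s expected answer type. The homophone table below is an illustrative assumption, not a real ASR engine’s internals:

```python
# Sketch: resolving ASR homophones using the question's expected answer type.
# A real engine does this statistically; the table here is illustrative.

HOMOPHONES = {
    "ate": "eight",
    "to": "two",
    "too": "two",
    "for": "four",
    "won": "one",
}

def normalize_numeric_answer(transcript: str, expects_number: bool) -> str:
    """If the prompt expects a number, map homophones to their numeric word."""
    if not expects_number:
        return transcript
    return " ".join(HOMOPHONES.get(w, w) for w in transcript.lower().split())

# "How many hours did you sleep last night?" expects a number:
print(normalize_numeric_answer("ate", expects_number=True))  # → "eight"
```

The design lesson for content managers: each question in your content model should carry metadata (expected answer type, valid ranges) that the voice layer can use to disambiguate.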
3. Plan for speech synthesis
The other side of the coin from ASR is speech synthesis. Content management for Voice includes ensuring that words and phrases are pronounced properly by the voice synthesis engine. This is never a guarantee with any of the commercial voice engines on the market (e.g., Amazon, Google, IBM). The problem is more obvious when it comes to organization-specific concepts (e.g., medical facilities, acronyms, etc.). Controlling how concepts are pronounced and simplifying this control through intuitive tools that non-technical staff can manage is a priority for any voice application development project.
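Most commercial engines accept SSML markup for exactly this kind of pronunciation control. Below is a minimal sketch that wraps organization-specific acronyms in SSML `<sub>` tags so the synthesizer speaks their expanded form; the acronym table is a hypothetical example, and exact SSML support varies by engine:

```python
# Sketch: using SSML <sub> tags so the synthesizer expands
# organization-specific acronyms. The acronym table is illustrative.

ACRONYMS = {
    "RA": "rheumatoid arthritis",
    "BWH": "Brigham and Women's Hospital",
}

def to_ssml(text: str) -> str:
    """Wrap known acronyms in <sub> tags carrying their spoken form."""
    words = []
    for word in text.split():
        alias = ACRONYMS.get(word.strip(".,?"))
        if alias:
            word = '<sub alias="{}">{}</sub>'.format(alias, word)
        words.append(word)
    return "<speak>" + " ".join(words) + "</speak>"

print(to_ssml("Your RA assessment is due."))
```

A content management tool for voice would let a non-technical editor maintain a table like `ACRONYMS` directly, rather than asking engineers to hand-edit SSML.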
4. Consider the content for “intent handling”
Conversational technologies provide a way to map user “utterances” (what they actually say) to specific “intents” (what they generally mean). For example, the utterances “I’m thirsty” and “I’d like a drink of water” are both variations of the same intent. This is where AI and machine learning come in. Most virtual assistant technologies provide a way to train intent recognition with example utterances so that the assistant can learn how to handle variations. Look for tools that will allow your content managers and digital experience managers to control the rules and training inputs of these intent handlers.
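The shape of that mapping can be sketched with a toy classifier that scores word overlap against example utterances. Real platforms train statistical models on these examples; the intents and phrases below are illustrative:

```python
# Sketch: matching a user utterance to the closest trained intent by
# word overlap. The intents and example utterances are illustrative;
# production platforms train statistical classifiers on such examples.

INTENT_EXAMPLES = {
    "request_water": ["i'm thirsty", "i'd like a drink of water",
                      "can i have some water"],
    "report_pain": ["my joints hurt", "i'm in pain",
                    "it hurts when i move"],
}

def classify(utterance: str) -> str:
    """Return the intent whose examples share the most words with the utterance."""
    words = set(utterance.lower().split())
    def score(intent):
        return max(len(words & set(ex.split())) for ex in INTENT_EXAMPLES[intent])
    return max(INTENT_EXAMPLES, key=score)

print(classify("I would like a drink of water"))  # → "request_water"
```

Note that the training data—the example utterances—is itself content. That is why it belongs under content management, editable by content specialists rather than buried in engineering code.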
5. Use empathy modeling
Although it’s clear that conversational virtual assistants are artificial, we generally expect them to be human-like in how they interact with the end user. This goes beyond just making sure the virtual assistant is pronouncing words properly. It involves adjusting content to provide a more natural, less clinical, response. Cadence of speech, inflection, and tone are elements of speech that should be controllable. Also, the inclusion of normally throwaway words and phrases like “OK,” “I understand,” or even just “Mm hmm,” creates a more natural experience that can improve user engagement in voice-powered applications.
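Both ideas—conversational fillers and controllable cadence—can be combined in SSML output. The filler phrases and prosody values below are illustrative assumptions, and support for specific prosody attributes varies by synthesis engine:

```python
# Sketch: prepending a conversational acknowledgment and softening
# delivery with SSML prosody. Fillers and prosody values are illustrative.
import random

FILLERS = ["OK.", "I understand.", "Mm hmm."]

def empathetic_response(text: str) -> str:
    """Wrap a response with a filler, a short pause, and gentler prosody."""
    filler = random.choice(FILLERS)
    return ('<speak>{} <break time="300ms"/>'
            '<prosody rate="95%">{}</prosody></speak>').format(filler, text)

print(empathetic_response("Let's continue with the next question."))
```

In a well-designed content workflow, the filler list and prosody settings would be editable per application (or per persona) by the content team.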
6. Context awareness
Voice assistants often serve multiple purposes. It’s not unusual in a healthcare application, for example, for a voice assistant to provide a combination of health assessment and health education in a single application. Like a human, a voice assistant should be able to hold a conversation by providing the right response (or question) in the right context, handling context-shifts in the middle of a conversation, and disambiguating context when necessary (“Do you mean…?”).
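A minimal sketch of this behavior keeps a context alongside each topic and falls back to a disambiguating question when the context is unknown. The contexts and responses below are illustrative, not a real dialog manager:

```python
# Sketch: answering the same topic differently by conversational context,
# and disambiguating when the context is unclear. All data is illustrative.
from typing import Optional

RESPONSES = {
    ("assessment", "pain"): "On a scale of 1 to 10, how bad is the pain?",
    ("education", "pain"): "Joint pain is commonly caused by inflammation.",
}

def respond(topic: str, context: Optional[str]) -> str:
    """Pick a response for the topic, asking for clarification if needed."""
    if context is None:
        return "Do you mean your assessment, or general information?"
    return RESPONSES[(context, topic)]

print(respond("pain", None))          # disambiguation prompt
print(respond("pain", "assessment"))  # assessment-context question
```

The content implication: the same topic needs multiple, context-tagged variants in the content repository, plus the clarifying prompts that bridge between them.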
Enabling content for voice assistants has many challenges, but careful planning and the right tools will ensure that your content is ready for the voice revolution.
At VOICE 2019, President and Co-founder of Orbita, Nathan Treloar, will present Voice Assistants in Healthcare—Are We There Yet? on July 24th at 1PM on the main stage. He will moderate the panel session The State of Voice and Chatbots in Healthcare on July 23rd with executives from Deloitte, Brigham and Women’s Hospital, Merck, and Philips. You can also visit Orbita at Booth #317 on the VOICE 2019 Expo floor.