In collaboration with Levi Warvel and Mariano Rodriguez.
A lot goes into designing a good conversational UI: it needs a personality, it needs to adapt, it needs to engage the user, and it needs to feel natural to interact with. The challenge for designers is to make this UI as intuitive and simple as possible without a traditional on-screen interface. When designing a conversational UI experience, it is essential to capture the flow of a conversation so that the interaction between the device and the user is seamless. To do this, designers must understand the basics of human conversation.
In the past, interactions between user and device could be clearly seen on a screen, so the interaction did not need to mimic the flow of a conversation. Even Siri and Google Assistant leverage a visual interface. Conversational UIs, by contrast, rely completely on the voice and have to fit the flow and cadence of a real human conversation. To achieve that natural flow, designers have to use different tools than they traditionally would to create the user experience. A conversational UI needs to tell users what they can and can’t do. To do this, designers need to use the AI to lead the user down the correct path: reminding them where they are, limiting both the possible ways to answer and the amount of information given, and providing examples of the correct interaction when a user is lost.
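The guidance above (remind users where they are, limit the ways to answer, and offer an example when they are lost) can be illustrated with a minimal sketch. The `Step` structure, the `respond` function, and the pizza-ordering wording are all hypothetical names invented for this illustration, and the sketch assumes the speech has already been transcribed to plain lowercase-friendly text without punctuation.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One point in the dialog (hypothetical structure for illustration)."""
    name: str        # where the user is, e.g. "a pizza size"
    question: str    # what the system asks at this step
    options: tuple   # the small set of single-word answers accepted here
    example: str     # a sample reply offered when the user seems lost


def respond(step: Step, heard: str, failed_attempts: int = 0):
    """Return (prompt_to_speak, matched_option_or_None) for one user reply."""
    # Assumes `heard` is an already-transcribed utterance without punctuation.
    words = heard.lower().split()
    for option in step.options:
        if option in words:
            return f"Okay, {option}.", option

    # No match: remind the user where they are and restate the limited choices.
    choices = " or ".join(step.options)
    prompt = f"We're choosing {step.name}. {step.question} You can say {choices}."

    # After a repeated miss, add a concrete example of a valid reply.
    if failed_attempts >= 1:
        prompt += f' For example, say "{step.example}".'
    return prompt, None
```

Keeping the accepted answers to a short list at each step is what makes the reprompt short enough to speak aloud. A real system would likely need fuzzier matching than the exact word lookup assumed here.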
Two aspects of human conversation in particular must be incorporated when designing a conversational UI experience: turn-taking and the ability to understand voices and inflections.
Turn-taking
Turn-taking is the most basic coordination mechanism of human language dynamics and one of the most important structures of human conversation. In conversation, people take turns speaking, guided by subtle signals passed back and forth between the parties. Without turn-taking, conversations would become jumbled and getting ideas or thoughts across would become very difficult. It is therefore essential to build this basic function of human conversation into the design of the conversational UI. The UI must be able to pick up on the cues that indicate when it is its turn to speak.
Ability to understand voices and inflections
Each language has its own unique lexical structure, including tonal qualities that indicate the end of an utterance. The gaps in speech flow, in turn, indicate an opening for potential respondents to engage with the speaker. However, the exact nature of such structure varies from language to language. Some research indicates that, across languages, the gap between the end of a speaker's utterance and the start of the respondent's reply averages roughly 200 ms (Stivers et al., 2009, DOI:10.1073/pnas.0903616106), which provides a good guideline around which to design engaging and satisfying conversational interfaces.
To build a conversational UI that is intuitively usable by the majority of international consumers, the system needs to be able to parse relevant information from the unique characteristics of any given language. It is critical that the conversational UI be able to distinguish different voices (for example, two people talking) and to handle the different inflections and accents that may be present. The best way to do this is to include a calibration check early in the setup process of the conversational UI. This gives the system adequate opportunity to identify relevant information and to acclimate to the user's speech characteristics. If possible, calibration checks should be integrated into standard operations as well.
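One way to picture how the roughly 200 ms figure could inform a design is a simple silence-based end-of-turn heuristic. The sketch below is an illustration only, not a description of how any particular assistant works: `TurnDetector`, `find_turn_ends`, the 20 ms frame size, and the use of 200 ms as the silence threshold are all assumptions, and the per-frame speech flags are assumed to come from some separate voice-activity detector. In practice the threshold would likely be tuned per language and per user, for example during the calibration check.

```python
from dataclasses import dataclass, field


@dataclass
class TurnDetector:
    """Flags the end of a user's turn after a run of silence (illustrative only)."""
    frame_ms: int = 20     # duration of one audio frame (assumed)
    silence_ms: int = 200  # silence treated as "turn finished" (assumed, tunable)
    _silent_run_ms: int = field(default=0, init=False)
    _heard_speech: bool = field(default=False, init=False)

    def feed(self, is_speech: bool) -> bool:
        """Consume one frame's speech flag; return True when the turn appears over."""
        if is_speech:
            self._heard_speech = True
            self._silent_run_ms = 0
            return False
        if not self._heard_speech:
            return False  # silence before the user has said anything
        self._silent_run_ms += self.frame_ms
        if self._silent_run_ms >= self.silence_ms:
            # Reset so the detector is ready for the user's next turn.
            self._heard_speech = False
            self._silent_run_ms = 0
            return True
        return False


def find_turn_ends(flags, detector=None):
    """Return the frame indices at which the detector reports an end of turn."""
    detector = detector or TurnDetector()
    return [i for i, flag in enumerate(flags) if detector.feed(flag)]
```

A pure silence threshold ignores the tonal and lexical cues described above, so it is best read as a baseline that richer cue detection would refine.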
When building a conversational UI there is a lot to consider, the most important point being that the experience should have the same natural flow as a human conversation. Often the focus is on building a powerful system with a lot of functionality, but the design and the UX of the conversational UI are just as important. By creating a structured conversation, a conversational UI can easily serve up the correct information or carry out the correct commands. It is important that a conversational UI feels natural and that users do not feel like they are talking to a robot, and that is where turn-taking and understanding voices and inflections become crucial.