Friday, July 7, 2017

Best practices for lively chatbots

TJBot as lively chatbot
More and more chatbots are being developed, and there are good reasons for it. Still, not all chatbot projects succeed. Often, a lack of user acceptance is given as the reason. The dialog system might not have struck a chord with users or might not have fit into the target environment. Would you talk with a friend who does not remember your name and keeps repeating the same five phrases over and over again? I would not. So what can be done to make chatbots more lively, more human-like? Here are some best practices and ideas on how to implement them.

Introduction

I started my series on chatbots with lessons and tips from a chatbot hackathon. In that blog I focused on general aspects of building dialogs and designing a conversation system. The language needs to fit the audience. It is something we will look at again today. In a recent blog post I shared tips and tricks for building chatbots. It is possible to carry context throughout a conversation and to embed conditions and advanced expressions into the dialog flow and into single responses. We will use that to implement some of the best practices found below. Building lively chatbots could also mean giving the bot a face. The open source project TJBot (pictured) is an example of that. The TJBot can listen, speak and see, give additional feedback and interact through its arm and its light. We won’t cover those aspects, e.g., hardware design or user interfaces, in this blog entry.


Best Practices for Lively Chatbots

You understand the basics of chatbots. You know about the advanced capabilities of the IBM Watson Conversation Service. Now, let’s take a look at some guiding principles (best practices) for designing lively chatbots. They help to make a conversation interesting to the user. They are critical for a successful dialog-based solution:
  • Freedom and Flexibility: Guide the users, but don’t restrain them.
  • Variety: Don’t repeat yourself and tailor the messages to the situation.
  • Flow and State: Have follow-up questions where needed. Let the conversation flow and remember what you already did and where you need to go.
  • Information Gathering: Keep track of what you need to know and what you already know. Gather the relevant information.
  • Mood and Tone: Try to “get” the user mood and adapt the conversation to it.
  • Learning: Analyse finished chats. Improve future conversation by training the system.
What do those recommendations mean in practice? In the following, I am going to provide some ideas of how the best practices can be implemented with the IBM Watson Conversation Service (WCS).

Freedom and Flexibility

The WCS has the concepts of intents, entities and dialogs. Developers create individual dialog nodes within a dialog. Each node can have multiple responses which Watson can pick from, either sequentially or randomly. In a conversation flow, conditions can be placed on nodes, so that they are only visited in specific situations. It is possible to jump to other nodes in a dialog to react to user input and a situation. Moreover, expressions to evaluate variables and the conversation metadata can be embedded into responses. Thus, a high degree of freedom and flexibility can be achieved.
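To make the difference between sequential and random response selection concrete, here is a minimal sketch in plain Python. It does not call the WCS API. The class name ResponsePicker and its parameters are hypothetical and only illustrate the idea of a node that holds several response variations.

```python
import random
from itertools import cycle


class ResponsePicker:
    """Hypothetical helper: holds the response variations of one dialog node."""

    def __init__(self, variations, mode="sequential", rng=None):
        if not variations:
            raise ValueError("at least one response variation is required")
        if mode not in ("sequential", "random"):
            raise ValueError("mode must be 'sequential' or 'random'")
        self._variations = list(variations)
        self._mode = mode
        # Sequential mode walks through the variations and then starts over.
        self._cycle = cycle(self._variations)
        self._rng = rng or random.Random()

    def pick(self):
        """Return the next response according to the selection mode."""
        if self._mode == "sequential":
            return next(self._cycle)
        return self._rng.choice(self._variations)


greetings = ResponsePicker(["Hello!", "Hi there!", "Good to see you!"])
first_three = [greetings.pick() for _ in range(3)]
```

In sequential mode the user sees each variation once before the first one comes back. Random mode trades that guarantee for less predictable answers.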

Variety

Using context variables and conditions, the techniques described in the recent blog, responses can be tailored to the current state of the conversation. A response could be a template, not a static text. User-specific and flow-related data is then filled into the response template. This can be done in WCS directly using embedded expressions. Another technique is to place special markers or keywords into a response, e.g., USER_NAME or ORDER_LIST. Then, the chatbot application replaces the markers with the actual data. A simple example is in the TJBot “Tell the time” recipe. In the tellTheTime function the marker “todays_date” is replaced by an app-generated, timezone-dependent string with the time.
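The marker technique can be sketched in a few lines of Python. This is an illustration under assumptions, not the code of the TJBot recipe: the function name fill_markers and the sample values are made up, only the marker names USER_NAME and ORDER_LIST come from the text above.

```python
import re


def fill_markers(template, values):
    """Replace every known marker in the template with its value (single pass)."""
    if not values:
        return template
    # Longer markers first, so that a marker is never shadowed by a shorter one.
    markers = sorted(values, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(marker) for marker in markers))
    return pattern.sub(lambda match: str(values[match.group(0)]), template)


# Text as it could be delivered by the conversation service.
response = "Welcome back, USER_NAME. Your order contains: ORDER_LIST."

# Data the application looked up itself (hypothetical sample values).
message = fill_markers(
    response,
    {"USER_NAME": "Alice", "ORDER_LIST": "2 coffees, 1 bagel"},
)
```

Replacing all markers in a single pass means that a filled-in value which happens to contain a marker name is not replaced a second time.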

Other applications use an indicator as the WCS-delivered response, e.g., answer_appointments or list_guests. The app then creates the actual response from strings fetched from a database, possibly based on templates. In that case, Watson responses help to compose context-specific messages. The chatbot application generates the final answer.
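A sketch of the indicator approach could look like the following. The indicator names are taken from the paragraph above. Everything else is an assumption for illustration: the handler functions, the compose_answer function and the in-memory dictionary that stands in for a database.

```python
# Stand-in for a database lookup (hypothetical sample data).
DATABASE = {
    "appointments": ["09:00 dentist", "14:30 team meeting"],
    "guests": ["Ann", "Ben"],
}


def answer_appointments(db):
    appointments = db.get("appointments", [])
    if not appointments:
        return "You have no appointments today."
    return "Your appointments today: " + ", ".join(appointments) + "."


def list_guests(db):
    guests = db.get("guests", [])
    if not guests:
        return "Nobody is on the guest list yet."
    return "On the guest list: " + ", ".join(guests) + "."


# Maps an indicator delivered by the service to the code that builds the answer.
HANDLERS = {
    "answer_appointments": answer_appointments,
    "list_guests": list_guests,
}


def compose_answer(service_text, db):
    """Build the final answer; pass ordinary text through unchanged."""
    handler = HANDLERS.get(service_text.strip())
    if handler is None:
        return service_text
    return handler(db)


final_answer = compose_answer("answer_appointments", DATABASE)
```

Text that is not a known indicator is returned as is, so regular responses and indicators can be mixed in the same dialog.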


Chatbot Architecture
It is important that answers vary to keep them interesting. Build answers based on the information the bot has, like a real person would do.

Flow and State

Watson Conversation maintains a conversation state and context. The metadata helps drive the conversation. Conditions on that metadata and user data should determine the dialog flow. Most users are not chatting for fun but for a purpose. Hence, with each turn the chat needs to move towards that goal.
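How conditions on the context can drive the flow is sketched below in plain Python. The nodes, the context keys greeted and intent, and the function visit_next_node are hypothetical. In WCS itself the conditions are defined on the dialog nodes, not in application code.

```python
# Each node: (name, condition on the context, response text).
# The first node whose condition holds is visited.
NODES = [
    ("greet", lambda ctx: not ctx.get("greeted"), "Hello! How can I help you?"),
    ("order_status", lambda ctx: ctx.get("intent") == "order_status",
     "Let me look up your order."),
    ("anything_else", lambda ctx: True, "Sorry, I did not get that."),
]


def visit_next_node(context):
    """Pick the first matching node and record in the context what was done."""
    for name, condition, text in NODES:
        if condition(context):
            if name == "greet":
                context["greeted"] = True  # remember it, so we greet only once
            context["last_node"] = name
            return text
    raise LookupError("no dialog node matched the context")


context = {}
turn_one = visit_next_node(context)
context["intent"] = "order_status"  # assumed to be detected from the user input
turn_two = visit_next_node(context)
```

Because the greeting is recorded in the context, the second turn skips it and moves on towards the user's goal.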

Information Gathering

Users provide data to reach a goal, e.g., booking a flight, obtaining status information or ordering goods. They want to provide that data only once. Thus, it is important to know which data is required to finish that process. Furthermore, each bit and byte obtained from the user needs to be remembered. IBM WCS has the relatively new feature of slots. First, slots make it possible to define what data is needed overall. Second, slots help to keep track of what information the user has already entered and to tailor questions so that they only ask for what is still missing. Thus, slots improve the user experience.
Data from slots is kept in the conversation context. The context is the “dialog memory”: it carries the user data throughout the dialog flow. See my previous blog with tips and tricks on using the chat context.
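The idea behind slots can be sketched as follows. This is plain Python and not the WCS slot configuration. The slot names, the prompts and the two functions are assumptions chosen for a flight booking example.

```python
# Required data and the question to ask when a value is still missing.
REQUIRED_SLOTS = {
    "destination": "Where would you like to fly to?",
    "date": "On which date do you want to travel?",
    "passengers": "How many passengers are travelling?",
}


def update_slots(context, entities):
    """Store recognized values in the context; a later value replaces an
    earlier one, so users can correct themselves."""
    for name, value in entities.items():
        if name in REQUIRED_SLOTS:
            context[name] = value
    return context


def next_prompt(context):
    """Return the question for the first missing slot, or None when done."""
    for name, prompt in REQUIRED_SLOTS.items():
        if name not in context:
            return prompt
    return None


context = {}
# "Two tickets to Berlin, please" fills two slots at once (assumed entities).
update_slots(context, {"destination": "Berlin", "passengers": 2})
question = next_prompt(context)
```

Here only the travel date is still missing, so the bot asks for it and nothing else.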

Mood and Tone

Do you know Marvin, the depressed and bored robot from The Hitchhiker’s Guide to the Galaxy? Marvin seems to be misplaced (on purpose). How would you feel about a cheery chatbot when you are discussing your credit score? Or about a very formal conversation for ordering fast food? So, be sure to design the dialog to match the situation. Moreover, consider dynamic analysis of the ongoing conversation. IBM offers the Tone Analyzer, Natural Language Understanding and Natural Language Classifier services to extract the current sentiment and to further annotate and analyse what the user is saying. Thereafter, the techniques discussed above can be used to react with an appropriate, varied, tailored answer that shows the bot understood and remembers what has been said so far.
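Once a tone has been detected, adapting the answer can be as simple as the sketch below. It assumes that the application already holds a dictionary of tone scores between 0 and 1, however they were obtained. The tone labels, the threshold and the response texts are hypothetical, and no call to any of the IBM services is shown.

```python
RESPONSES_BY_TONE = {
    "anger": "I am sorry for the trouble. Let me sort this out right away.",
    "sadness": "I understand this is not easy. Let us go through it step by step.",
    "joy": "Great! Let us keep going.",
}
NEUTRAL_RESPONSE = "Alright. What would you like to do next?"


def respond_to_tone(tone_scores, threshold=0.5):
    """Pick a response for the strongest tone, or stay neutral if unsure."""
    if not tone_scores:
        return NEUTRAL_RESPONSE
    tone, score = max(tone_scores.items(), key=lambda item: item[1])
    if score < threshold or tone not in RESPONSES_BY_TONE:
        return NEUTRAL_RESPONSE
    return RESPONSES_BY_TONE[tone]


# Hypothetical scores for a message like "This is the third time I ask!"
reply = respond_to_tone({"anger": 0.82, "joy": 0.03, "sadness": 0.21})
```

The threshold keeps the bot neutral when no tone stands out, which avoids overreacting to weak signals.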

Learning

The IBM Watson Conversation Service is a cognitive service. Hence, a core feature is to learn and to improve. Developers enhance the learning and training by analysing finished conversations and by providing feedback to Watson (how do you learn?). By reviewing logs and, if necessary, correcting the detected intents, developers help the service learn. Conversations get better and better. By looking over finished conversations, developers also learn how well the chatbots are working.

Conclusions

In this blog entry I provided some best practices to build lively chatbots. User-centric design (“how would you (want to) chat with a real person?”) is essential. The recommendations discussed above should give you ideas to design such bots. The IBM Watson Conversation Service provides the technical foundation.

If you have feedback, suggestions, or questions about this post, please reach out to me on Twitter (@data_henrik) or create an issue against the mentioned GitHub repository. Moreover, check out the tool I created to manage conversation workspaces.

(Note: Another copy of this blog entry will appear on the IBM Bluemix blog)