A guide for designing friendly, helpful, and well-behaved chatbots
What is friendly?
“Friendly” is a human, interpersonal quality; all other meanings are derivative. We perceive actions of others that make us feel good as “friendly”, and readily extend that definition to the actions of machines and computer interfaces. Unfortunately, we have all been indoctrinated over the last few decades into the false tradeoffs of “friendliness vs. efficiency” or “ease of use vs. power”. To a first-time Unix user, a “friendly” command to list a directory might be “would you please show me the contents of the current folder”, while the outlandish, esoterically cryptic “ls” probably looks like the epitome of coldness and hostility. At the same time, if seasoned sysadmins are ever forced to type that “friendly” version instead of ls, or to trade the latter’s wealth of options for “simplicity”, chances are they would view the situation as distinctly unfriendly. So the “friendliness” of tools is in the eye of the beholder.
When it comes to interpersonal relationships though, barring a small percentage of sick individuals, everybody’s definition of “friendly” is pretty much the same. The further we try to apply the adjective from its original context, the more divergent the definitions become, culminating in the military slang that can refer to a “friendly” 500 kT warhead used to exterminate millions of “hostiles”, who in turn are deeply unconvinced of the weapon’s “friendliness.”
Conversely, the closer a situation is to an interpersonal relationship, the easier it is to agree on what is “friendly”. And in the user interface world, it doesn’t get much closer than a natural language interaction. Front any system with an NLU, and suddenly it becomes easier not only to accommodate all kinds of users, but also to design a friendly interface with confidence: the designer, being human, has the innate ability to recognize a friendly dialog.
For example, would a human consider a person who ignores their attempts to strike up a conversation to be friendly? Certainly not, and a bot will be judged the same way. At the same time, an overly chatty bot that doesn’t wait its turn in a group conversation and blabbers in response to every phrase uttered in a chat room is sure to be treated the way human beings with similar behavior are – or worse.
Key points: Friendliness is the natural state of most humans and is an absolute must for any construct that strives to emulate them. Luckily, it’s easy to achieve and even easier to verify.
What is helpful?
Anybody who has ever called a technical support organization knows the difference between “friendly” and “helpful”. Apparently, it’s much easier to train human agents to be friendly than helpful (although I’ve seen examples of quite helpful, yet a bit, shall we say, emotionally reserved representatives). Is helpfulness similarly difficult to achieve with bot agents? In a way it is, because a bot is just a projection of the strengths and weaknesses of its human designers. Call center employees and bot designers can be equally lazy and inert, and they impact their customers in very similar ways: the former directly, the latter by projecting their shortcomings through bot behavior.
When writing dialogs, we do so based on our own experience of interacting with other humans. As such, we instinctively demand a certain amount of respect and dignity for the bot, as a partial instance of ourselves. Also (and this is even worse), we dictate a certain “fair” split of the work: I’ll do this if you do that. For example: “if you want me to create a lambda function for you, be so nice as to read the help on the subject and learn the exact way you’re supposed to ask me for that”. We demand a certain equality with the user – after all, we’re all humans. But “we” are not! A bot shouldn’t be overly obsequious and self-abasing, if only because that would make human users uncomfortable (they suffer from the same delusions of human-to-human interaction as the authors). However, a bot should always offer to do as much of any task as it’s capable of, and otherwise assist its human users. For example, if the user asks “how are my training pods doing”, a mediocre bot will detect a “get pod status” intent and respond with “Please specify a list of pods you want the status for.”
This is a classic example of the “I’ll do this if you do that” pattern of unhelpfulness: the bot is making its help conditional on the human user supplying a list of the precise names of the “training pods”. Assuming a user that is not exactly a Kubernetes wizard, we can easily see the following sequence:
- a web search for “how do I list all pods in a deployment”, and upon learning that:
- “how do I find out k8s deployment labels or selectors”, and after some more reading:
- “kubectl describe deployment rigd-train”,
- “kubectl get pods -l=service.name=rigd-train”, and finally, after much copying and pasting, the “correct” message that the bot demands:
- “get pods status for pods rigd-train-6b86c69d7f-nfpzb, rigd-train-6b86c69d7f-xcvkv”
The bot not only made the human jump through hoops to obtain the condition-for-help information; it also reduced its own role to irrelevance in the process, because by retrieving the names of the training pods, the user had already seen “how they were doing”.
What is the right behavior in this case? A helpful bot will also extract the value “training” from the “deployment name” entity, perform a fuzzy match against the deployments in the Kubernetes cluster, tentatively match “rigd-train”, and display:
rigd-train-6b86c69d7f-nfpzb 1/1 Running 0 2d
rigd-train-6b86c69d7f-xcvkv 1/1 Running 0 2d
But what if the bot gets it wrong? What if the actual name of the “training” deployment is “model-maintenance”, and it turns out that “rigd-train” is a service that tracks departure times at the Caltrain station nearest to the RigD headquarters? No problem – that’s why it’s called “machine learning”. The bot will ask “did I get it right?”, the user will say “not even remotely”, and that will be that. Humans understand perfectly well that mistakes are temporary while bad attitude is permanent, and are far more forgiving of the former than of the latter.
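Mechanically, the “did I get it right?” exchange just amounts to logging the user’s verdict as a training label. A minimal sketch, with the log format and retraining pipeline left as assumptions:

```python
import json

def record_feedback(log_path, utterance, predicted_intent, entities, confirmed):
    """Append the user's verdict to a feedback log. Each record later
    becomes a labeled example for the next retraining run."""
    record = {
        "utterance": utterance,         # what the user actually said
        "predicted": predicted_intent,  # what the bot guessed
        "entities": entities,           # e.g. {"deployment": "rigd-train"}
        "label_correct": confirmed,     # the human's "did I get it right?"
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")

# After the user answers "not even remotely":
record_feedback("feedback.jsonl", "how are my training pods doing",
                "get_pod_status", {"deployment": "rigd-train"}, confirmed=False)
```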
Key points: Always offer to do as much of the work as possible; never demand information that the bot could obtain by itself; do not be afraid to make a mistake: users can help fix mistakes over time, but not bad design. As Johann Wolfgang von Goethe would have said, had there been bots in his day, “Es irrt der Bot so lang er strebt” (“The bot errs as long as it strives”).
Training the bot vs. training the user
Not all bots are natural language capable. There is certainly nothing wrong with deploying services that require highly structured input, as long as the guidelines for composing that input are clear and effortlessly accessible. If your service is a chat-with-CLI, you’re the ultimate authority on the input format, and well within your rights to demand that users comply with the published rules. They will understand that training is a condition of using the service. Classical UIs (both GUI and CLI) are literal examples of human training: to use a system, a human user must be trained to follow the ways of the machine exactly and without deviation (hyperfitting, to coin a term). That’s why humans fluent in one system’s UI perform poorly when attempting to use another without retraining.
However, if your bot claims to feature a natural language interface, it will be judged on a completely different scale. Humans may be trainable and eager to learn new interfaces, but if a bot questions a skill they already have, that’s not going to go well – for the bot. Demanding a specific sentence structure (“must have a predicate and a subject”) or vocabulary (“must include one of our approved keywords” – extra hostility points for not telling the user what those keywords are) is a guaranteed turn-off for most humans. And even if your users are infinitely accommodating and willing to learn the specifics of your “natural” language, which is the more efficient approach: training a single bot to understand intents and entities the way humans prefer to state them, or training all possible users to shape their thoughts the way the bot prefers to see them?
AI “training” is not a one-way process. When a human talks to a bot, the bot gets trained by the labels assigned to its decisions. However, the human also gets trained: by the bot’s “help” system, and ultimately by the success or failure of the human’s endeavors with the bot. One of the main goals (and criteria for success) of bot UI is less training for humans and more for machines. The “more” and “less” in this context are, of course, subjective. After all, virtually all human users undergo a massive training session in early childhood to acquire those natural language skills. NLU merely frees humans from the need to learn each program’s specific “language.” In absolute terms, participating in a dialog with a bot may require a lot of training, but very little of it is specific to a particular bot.
Advice can be helpful, and there are bots that specialize in providing advice. However, unsolicited advice is as irritating coming from a bot as it is when coming from a human (sometimes even more so). Especially when the advice is to go and perform some task that the bot itself can do, or to format input in a way that is more palatable to the bot. Those types of advice are nothing short of user training and are to be avoided.
Key point: Human-bot interaction is a two-way training process, but designers of natural language bots should resist the temptation to train users how to interact with the bot and should instead focus on training the bot to understand users better. Human users may subconsciously elevate natural language bots to the status of a person, but that higher position comes with greater responsibility as well.
If the great Asimov were still alive, he could probably come up with a perfect set of rules for chatbot behavior. In his absence, though, that task is fair game for anybody to try. Here is my take on the “laws of chatbotics”:
- A bot is always friendly:
  - A bot learns from the users instead of forcing them to learn from it.
  - A bot that claims to understand natural language doesn’t restrict the grammar, syntax, or vocabulary of the user – it simply admits when it doesn’t understand the user’s intent, like a human would.
  - A bot never answers user input with silence in a 1:1 conversation.
- A bot is always helpful:
  - A bot never asks the user for information it can retrieve itself.
  - A bot is capable of guiding the user through every option of every function it can perform.
  - A bot never asks the user to choose from a list without providing access to that list.
- A bot is not afraid to make a mistake:
  - Mistakes are literally opportunities for improvement (they provide more training data).
  - Humans are far more forgiving of bot mistakes than of their ugly cousin – computer errors.
To be clear, those are not in lieu of, but in addition to Asimov’s original Laws. Legislation governing human behavior has proliferated a lot since the first “I, Robot” story. It’s only fair if 🤖 also get a few extra rules to live by.
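Several of the laws above (never answer with silence, admit misunderstanding, never hide a choice list) reduce to a simple dispatch pattern. A sketch with a hardwired stand-in classifier, since no specific NLU framework is implied here:

```python
def classify(utterance):
    """Stand-in for a real NLU intent classifier: returns (intent, confidence).
    Hardwired purely for illustration."""
    known = {"get pod status": "get_pod_status", "help": "help"}
    for phrase, intent in known.items():
        if phrase in utterance.lower():
            return intent, 0.9
    return None, 0.0

def respond(utterance, threshold=0.6):
    intent, confidence = classify(utterance)
    if intent is None or confidence < threshold:
        # Law: never answer with silence, and admit it when we don't understand.
        return "Sorry, I didn't understand that. Try 'help' to see what I can do."
    if intent == "help":
        # Law: never ask the user to choose from a list without showing it.
        return "I can: get pod status, ..."
    return f"(handling intent {intent})"

print(respond("frobnicate the blorp"))  # falls through to the admission
```

The names and threshold are invented; the point is only that the fallback branch always says *something*, and says it honestly.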