Voice and AI Chatbots: Using Intents and Entities in questions
When you start building AI-driven solutions with IBM Watson, Google Dialogflow, Amazon Lex, or similar platforms, the discussion always begins with “Intents and Entities”. The concepts are not overly difficult, and there are plenty of descriptions out there of how the two differ. The simplest and most common definition: intents are verbs, entities are nouns.
This all seems fine. Looking at a simple example such as “I want to buy a ticket to Paris”, it is easy to see that the intent is “buy a ticket” and the entity is “Paris”.
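To make the split concrete, here is a minimal sketch of what a parsed request might look like. It uses naive substring matching rather than a real NLU engine, and every name in it (`INTENT_PHRASES`, `DESTINATIONS`, `ParseResult`, `parse`) is hypothetical and not taken from Watson, Dialogflow, or Lex.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical sample data, for illustration only.
INTENT_PHRASES = {
    "buy_ticket": ["buy a ticket", "book a ticket", "purchase a ticket"],
}
DESTINATIONS = {
    "Paris": ["paris"],
    "London": ["london"],
}


@dataclass
class ParseResult:
    intent: Optional[str] = None                              # the "verb"
    entities: Dict[str, str] = field(default_factory=dict)    # the "nouns"


def parse(utterance: str) -> ParseResult:
    """Toy parser: substring matching stands in for a trained model."""
    text = utterance.lower()
    result = ParseResult()
    for name, phrases in INTENT_PHRASES.items():
        if any(phrase in text for phrase in phrases):
            result.intent = name
            break
    for value, synonyms in DESTINATIONS.items():
        if any(synonym in text for synonym in synonyms):
            result.entities["destination"] = value
            break
    return result


result = parse("I want to buy a ticket to Paris")
print(result.intent, result.entities)
```

A real platform would replace the substring checks with a trained classifier, but the shape of the result, one intent plus a set of named entity values, is the same idea.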
Then you actually have to build something non-trivial, which is what most real-world scenarios are. Choosing between intents and entities can become more “gray”, and reasonable people can disagree on the right approach for some use cases.
Most solutions start with an open-ended question from the bot, through either chat or voice. “In a few words, please tell me how I can help” is a typical opening prompt. A slew of intents is built, each with many sample request phrases, to identify the business action the user needs to perform.
Letting users provide input in their own words, instead of asking them to match choices the VUI designer came up with, is the key to making bot solutions easy for customers to use. Well-designed systems can also catch entities (nouns) that the user mentions as part of the request, so the user doesn’t have to repeat that information later. Even better systems proactively work out who the user is and offer context, such as “are you calling about your backorder?”, to help the user.
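One way to picture “not asking again” is to track which pieces of information an intent needs and prompt only for the ones the opening request did not supply. The sketch below assumes a hypothetical `buy_ticket` intent; the slot names and prompts are made up for illustration.

```python
# Hypothetical slot definitions and prompts, for illustration only.
REQUIRED_SLOTS = {
    "buy_ticket": ["destination", "travel_date"],
}
PROMPTS = {
    "destination": "Where would you like to go?",
    "travel_date": "What day would you like to travel?",
}


def follow_up_prompts(intent, captured):
    """Return prompts only for the slots the user has not already supplied."""
    return [
        PROMPTS[slot]
        for slot in REQUIRED_SLOTS.get(intent, [])
        if slot not in captured
    ]


# "I want to buy a ticket to Paris" already supplied the destination,
# so only the travel date still needs to be asked for.
print(follow_up_prompts("buy_ticket", {"destination": "Paris"}))
```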
When the system matches the input to an intent, great! We know what the caller wants to do and can work to fulfill the request, either through continued automation or by handing the call off to a human agent when appropriate. But non-trivial systems need a mechanism for better understanding the initial input. The industry calls this “disambiguation”, but really it’s just follow-up questioning. If you are a car dealership and the caller’s initial request is “I want to talk to someone in sales”, there is more work to do to understand which sales department is needed. It might be “new car”, “used car”, “parts”, “accessories”, “tires”, or something else.
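As a rough sketch of that follow-up step, the function below decides whether a matched intent can be routed directly or needs another question first. The intent names (`sales_general`, `tire_sales`) and the `next_step` helper are assumptions made for this example, not part of any vendor's API.

```python
# Department list taken from the dealership example; intent names are hypothetical.
SALES_DEPARTMENTS = ["new car", "used car", "parts", "accessories", "tires"]


def next_step(intent):
    """Return ("route", target) or ("ask", prompt) for a matched intent."""
    if intent is None:
        # Nothing matched: re-prompt with the open-ended question.
        return ("ask", "In a few words, please tell me how I can help.")
    if intent == "sales_general":
        # Too vague to route: disambiguate with a follow-up question.
        options = ", ".join(SALES_DEPARTMENTS)
        return ("ask", f"OK, sales. Which department would you like: {options}?")
    # Specific enough to act on.
    return ("route", intent)


print(next_step("sales_general"))
print(next_step("tire_sales"))
```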
There are a couple of ways to approach this problem. One is to collect the different sales departments into an Entity. While this may seem to make sense, my experience is that it is the wrong choice. But like most things with AI, there are no hard-and-fast rules that always hold.
Defining a separate intent for each type of sales activity, in this case, allows a much wider set of sample phrases to be attached to each intent. A phrase like “I need new tires” is easy to match to the “tire sales” department when that department is its own intent. As a bonus, you can attach the specific intents at the main menu and jump right to one when it’s recognized from the initial input. You can also collect the various sales-related intents behind a sub-menu (“ok, sales – which sales department would you like to talk to…”) without changing their definitions.
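The sketch below illustrates why separate intents help: each department carries its own sample phrases, so a request that never mentions “sales” can still land in the right place. The matcher is a toy that scores word overlap (Jaccard similarity) and is only a stand-in for a real NLU engine; the intent names, sample phrases, and the 0.3 threshold are all assumptions.

```python
import re

# Hypothetical intents and sample phrases, for illustration only.
SAMPLE_PHRASES = {
    "new_car_sales": ["I want to buy a new car", "looking for a new vehicle"],
    "used_car_sales": ["do you have used cars", "I'm looking for a pre-owned car"],
    "tire_sales": ["I need new tires", "my tires are worn out",
                   "how much for a set of tires"],
    "parts_sales": ["I need a replacement part", "do you sell brake pads"],
    "sales_general": ["I want to talk to someone in sales",
                      "sales department please"],
}


def tokens(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def match_intent(utterance, threshold=0.3):
    """Return the intent whose best sample phrase overlaps most, or None."""
    words = tokens(utterance)
    best_intent, best_score = None, 0.0
    for intent, phrases in SAMPLE_PHRASES.items():
        for phrase in phrases:
            sample = tokens(phrase)
            score = len(words & sample) / len(words | sample)  # Jaccard
            if score > best_score:
                best_intent, best_score = intent, score
    return best_intent if best_score >= threshold else None


print(match_intent("I think I need some new tires"))
print(match_intent("I want to talk to someone in sales"))
```

The first request jumps straight to the tire intent, while the second matches only the general sales intent, which is the case where a sub-menu question is still needed.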
The golden rule for choosing intents over entities: make the switch when the “synonyms” for your entity values stop making sense. Entities work best, and are easiest to understand and maintain, when they can be examined outside of the intents they are contained in. You should not have “words people might say” mixed in as synonyms for entity values.
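One way to apply this rule is to read an entity's synonym list on its own and ask whether each entry is really another name for the value. The sketch below automates a crude version of that review. The heuristic (flag long phrases or phrases containing request words) is my own illustration rather than an established rule, and both entity definitions are hypothetical.

```python
import re

# An entity whose synonyms make sense on their own (hypothetical data).
CAR_MAKE = {
    "Chevrolet": ["chevy", "chevrolet"],
    "Volkswagen": ["vw", "volkswagen"],
}

# An entity stuffed with "words people might say" (hypothetical data).
SALES_DEPARTMENT = {
    "tires": ["tires", "i need new tires", "my tread is worn"],
    "parts": ["parts", "do you sell brake pads"],
}

# Words that suggest a synonym is really a request phrase (assumed list).
REQUEST_WORDS = {"i", "my", "need", "want", "do", "you"}


def suspicious_synonyms(entity, max_words=3):
    """Flag synonyms that look like sample utterances rather than names."""
    flagged = []
    for value, synonyms in entity.items():
        for synonym in synonyms:
            words = re.findall(r"[a-z']+", synonym.lower())
            if len(words) > max_words or REQUEST_WORDS & set(words):
                flagged.append((value, synonym))
    return flagged


print(suspicious_synonyms(CAR_MAKE))
print(suspicious_synonyms(SALES_DEPARTMENT))
```

Anything flagged is a candidate for becoming a sample phrase on its own intent instead.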
Also, our dialog environments for defining the flow of an interaction handle messaging for matched intents with far less work than dynamically handling entities. It is much simpler, and easier to maintain, to write the verbiage for a matched intent than to program responses dynamically based on which entity was matched, or on no entity being matched at all.
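The difference in effort can be sketched as follows. With intents, the response is a fixed message per intent; with an entity, the code has to cover the matched value, an unexpected value, and the no-match case. All names and messages here are hypothetical, and real dialog tools express this in their own flow editors rather than in Python.

```python
# Intent-driven: one fixed, easily edited message per matched intent.
INTENT_RESPONSES = {
    "tire_sales": "Let me connect you with tire sales.",
    "parts_sales": "Let me connect you with the parts counter.",
}


def respond_by_intent(intent):
    return INTENT_RESPONSES[intent]


# Entity-driven: the response is assembled at run time and must
# handle every way the entity can be present, wrong, or absent.
KNOWN_DEPARTMENTS = {"new car", "used car", "parts", "accessories", "tires"}


def respond_by_entity(department):
    if department is None:
        return "Which sales department would you like to talk to?"
    if department not in KNOWN_DEPARTMENTS:
        return "Sorry, I don't know that department. Which one do you need?"
    return f"Let me connect you with {department} sales."


print(respond_by_intent("tire_sales"))
print(respond_by_entity("tires"))
print(respond_by_entity(None))
```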
So, get to work on those intents. Define specific intents at the finest level of granularity that you can. Use Entities when you have specific data elements to collect, but don’t drive “verbs” out of entities. Your intents can then be reused in different contexts and flows, they will match inputs more robustly, and, most importantly, the system will be easier to understand and maintain going forward.