Tutorial
Requirements - Local Install
Note: Currently, Plan4Dial only supports Linux/WSL due to the restrictions on the RBP planner. Note: If Rasa gives you issues when running, check the latest releases; it may be your version of Python.
It is recommended that you use a venv. Run
pip install -r requirements.txt
to install the necessary libraries before using. Then run
python -m spacy download en_core_web_md
to download the appropriate Spacy model. This is used for NLU along with Rasa.
Requirements - Docker
docker build -t plan4dial:latest .
docker run -it --name plan4dial --rm --volume $(pwd)/plan4dial:/root/app/plan4dial --net=host --privileged plan4dial:latest sh
python plan4dial/main.py gold_standard_bot
Note: You can make changes to Python files inside the subfolder /plan4dial without rebuilding the Docker container, as it is mounted as a simple volume.
Usage Steps
Create a YAML config file that defines your bot.
Let’s go through an example. Suppose we want to create a chatbot that helps you decide what to do on your day off. This will be a fairly simple bot that picks a restaurant and outing location based on the user’s preferences.
(For later reference, the full YAML file as well as the output files can be found here).
First, create a YAML file and pick a name for the domain:
---
name: day-planner
There are three main parts to the YAML file you will need to define: context variables, intents, and actions. We will examine each of these in turn.
1. Define the Context Variables
Context variables are variables that your bot will store throughout the course of the conversation to keep track of the context gathered so far. Without context variables, the bot would not be able to store information from one line of dialogue to the next.
One of the most common use cases of context variables is to store information gathered from the user. However, they can also be used internally to keep track of states in the conversation, the most common example being an indication of when the “goal” is reached.
In this case, we know we want to keep track of the user’s information and preferences, which include their location, phone number, cuisine, allergy/food restrictions, budget, and outing type. Let’s examine the context variables for location, phone number, cuisine, and food restrictions first.
context_variables:
  # user's location
  location:
    type: json
    extraction:
      method: spacy
      config_method: gpe
    known:
      type: fflag
      init: false
    options:
      Toronto:
        variations:
          - downtown
      Kingston:
        variations:
          - k-town
  # user's phone number
  phone_number:
    type: json
    extraction:
      method: regex
      pattern: \d{10}
    known:
      type: fflag
      init: false
    examples:
      - 1234567890
      - 2345678901
      - 3453452794
  # user's preferred cuisine; must map to one of the 4 options
  cuisine:
    type: enum
    known:
      type: fflag
      init: false
    options:
      - Mexican
      - Italian
      - Chinese
      - dessert
  # indicates if the user has an allergy
  have_allergy:
    type: flag
    init: false
    known:
      type: flag
      init: false
  # food restrictions/allergies that the bot can take into account
  food_restriction:
    type: enum
    known:
      type: flag
      init: false
    options:
      - dairy-free
      - gluten-free
We can see that each context variable has been assigned a type: json, enum, and flag respectively. We can also see that another type, fflag, exists under the known section of cuisine (more on this later).
These are the only four types that we can define in the YAML. They are defined as follows:
| type | definition |
| --- | --- |
| flag | A boolean value; can only be set to true or false. |
| fflag | “Fuzzy flag”; can only be set to true, false, or maybe. |
| enum | Can only be set to the values set under the options parameter. |
| json | Used if you want to use an alternate extraction method. NOTE: Currently, only Spacy and regexes are compatible with this option. For Spacy, you can optionally add an options parameter to restrict which extracted values are considered valid. |
So, “location” is of type json because we want to use Spacy GPE for location extraction. (In the case of location, it makes the most sense to use a model finely tuned to detect locations, instead of Rasa, which is trained only on the examples you provide.) You can see that under extraction, we specified both the method spacy and the configuration for NER (named entity recognition), in this case gpe for location. Note that since we specified cities under “options”, only those extracted locations would be viable. However, if we left those out, any city the user entered would be valid.
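For intuition, here is a rough sketch of the kind of extraction Spacy performs for GPE entities (an illustration only; Plan4Dial's actual extraction code may differ):

import spacy

# Requires the model downloaded during setup: python -m spacy download en_core_web_md
nlp = spacy.load("en_core_web_md")

doc = nlp("Can you help me find things to do in Toronto?")
# GPE = geopolitical entity (cities, provinces, countries)
locations = [ent.text for ent in doc.ents if ent.label_ == "GPE"]
print(locations)  # e.g. ['Toronto']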
We can see that the context variable for “phone_number” is configured similarly, although this one uses a simple regex, where the pattern is specified under pattern. Note that we still supply a few examples for Rasa’s training process.
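To see what that regex pattern is doing, here is a minimal sketch of matching a 10-digit phone number (an illustration only, not Plan4Dial's internals):

import re

# The same pattern used in the YAML above
PHONE_PATTERN = re.compile(r"\d{10}")

match = PHONE_PATTERN.search("Sure, my number is 1234567890.")
phone_number = match.group(0) if match else None
print(phone_number)  # 1234567890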
cuisine is of type enum because we only want it to have 4 valid values: Mexican, Italian, Chinese, and dessert. food_restriction is of type enum for the same reason.
have_allergy, which determines if the user has an allergy (in which case we need to get their food_restriction), is of type flag. That is, they either do or don’t have an allergy. For this variable, you can also see that it has an init option. This is only available to flag or fflag type variables, and it allows you to set an initial value for the variable and change the initial state of the conversation. In this case, we default the variable to false.
Each variable also has a known option which determines the knowledge we have about the variable. The known type can only be set to either flag or fflag, and functions in the same way. This parameter is extremely important as conversation navigation is often predicated on what context we know, maybe know, or don’t know so far.
In most cases, the known’s init setting is set to false, but the type setting depends on what makes the most sense for the variable. Often in the case of enum type variables like cuisine, it makes the most sense to allow for a little variance in user input. They may say something that somewhat resembles one of the available options, and it is helpful to store their answer, classify it as “maybe known”, and clarify the user’s intention. For simpler variables like have_allergy, a known type setting of flag should suffice.
With this in mind, let’s add the rest of the context variables.
  # possible budget options
  budget:
    type: enum
    known:
      type: flag
      init: false
    options:
      - low
      - high
  # user's outing preferences
  outing_type:
    type: enum
    known:
      type: fflag
      init: false
    options:
      high-energy:
        variations:
          - fun
          - exciting
          - social
      low-energy:
        variations:
          - chill
          - relaxing
          - laid-back
  # activated if there is a conflict between the user's cuisine preference and food restrictions
  conflict:
    type: flag
    init: false
    known:
      type: flag
      init: false
  # possible restaurant options
  restaurant:
    type: enum
    known:
      type: flag
      init: false
    options:
      - Guac Grill
      - Alfredo's Pizza Café
      - Mandarin
      - Geneva Crepes
  # possible outing options
  outing:
    type: enum
    known:
      type: flag
      init: false
    options:
      - Stages
      - Stauffer Library
      - Broadway Theater
      - Smith's Golfing Club
  # ends the conversation if true
  goal:
    type: flag
    init: false
    known:
      type: flag
      init: false
While most of this you’ve already seen, let’s draw attention to a couple of things.
In outing_type, we’ve supplied some variations under the options the user can provide. These indicate that if the user utters any of the variations, the bot will map the user’s utterance back to the original option. While I’ve only given a few examples for simplicity, it is extremely important to supply lots of training examples to make your model more robust.
There is an exception to this rule, though. In the case of outing, although the variable is of type enum, the variable value will be set internally based on the user’s preferences instead of through directly analyzing the user’s input. Since this will be completely in the control of the bot designer and not reliant on the NLU, no variations need to be provided there.
Also, a flag goal variable is mandatory for every bot as it determines when the conversation ends. When you want the outcome of an action to end the conversation, you should set goal to true.
You’re all set to define context variables for your bot! Let’s move on to the next step: intents.
2. Define the Intents
The next step is to define the intents. Intents are characterizations of what the user is trying to say. For example, if the user says “yes”, then their intent is to “confirm” the bot’s statement. Intents are parsed/analyzed using Rasa NLU. They are important as we need to be able to map arbitrary user input to tangible results that determine where to go next in the conversation. NOTE: We do not use Rasa for anything other than off-the-shelf NLU, to reduce dependency on the system.
An intent is made up of these parts:
1. utterances: Examples of utterances that constitute that intent. Similar to context variable variations, it is best to supply as many of these as you can, as these will be passed off to Rasa as training examples. Ideally, you shouldn’t have intents with utterances that are too similar to one another, as this will make it harder for the model to pinpoint what the user wants.
2. entities: (Optional) Any entities that are extracted with this intent. Entities are variables that are extracted from the user. Within the intent, each entity must be preceded with a $ symbol to indicate the location of the entity in the utterance (see the sketch below).
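For intuition only, here is a hypothetical sketch (not Plan4Dial's actual code) of how a $-prefixed placeholder in an utterance could be expanded into a Rasa-style annotated training example:

# Hypothetical illustration: expand "$location" into Rasa's [value](entity) annotation.
utterance = "I live in $location."
entity, example_value = "location", "Toronto"

annotated = utterance.replace(f"${entity}", f"[{example_value}]({entity})")
print(annotated)  # I live in [Toronto](location).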
Let’s see what the intents for our day-planner bot look like:
intents:
  confirm:
    utterances:
      - "yes"
      - yeah
      - that's it
      - "Y"
      - mhm
      - confirm
      - yes please
  deny:
    utterances:
      - "no"
      - not at all
      - that's not what i meant
      - absolutely not
      - i don't want that
      - nah
      - no thanks
      - no thank you
  share_phone_number:
    entities:
      - phone_number
    utterances:
      - My phone number is $phone_number.
      - My number is $phone_number.
      - $phone_number
  share_location:
    entities:
      - location
    utterances:
      - I live in $location.
      - I am located in $location.
      - Can you help me find things to do in $location?
  share_cuisine:
    entities:
      - cuisine
    utterances:
      - I want to eat $cuisine.
      - Do you have restaurants of type $cuisine?
      - Are there any $cuisine restaurants in the area?
  share_allergies:
    entities:
      - food_restriction
    utterances:
      - I have to eat $food_restriction.
      - I can only eat foods that are $food_restriction.
      - I am allergic to any foods that are not $food_restriction.
  share_all_outing_preferences:
    entities:
      - budget
      - outing_type
    utterances:
      - I have a $budget budget and I would prefer a $outing_type atmosphere today.
      - I am operating within a $budget budget and I want to go to a $outing_type place.
      - I can do activities with a $budget budget and I want to find the most $outing_type place in the city.
  share_budget:
    entities:
      - budget
    utterances:
      - I have a $budget budget.
      - I am operating within a $budget budget.
      - I can do activities with a $budget budget.
  share_outing_type:
    entities:
      - outing_type
    utterances:
      - I would prefer a $outing_type atmosphere today.
      - I want to go to a $outing_type place.
      - What is the most $outing_type place in the city?
NOTE: All utterances must include exactly the entities listed under entities; no more, no less. In practice, this does not mean that all entities will actually be extracted at runtime, but it needs to be indicated what the intent is actually trying to accomplish.
3. Define the Actions
actions are the core of dialogue agent design as they specify what your agent can do and when. We use a declarative specification powered by automated planning that allows you to treat actions as separate pieces of a puzzle. You won’t have to draw out complex dialogue trees that you will have to completely dismantle if you decide late in the game that you want to add a new action near the top. Instead, actions are chosen based on what is true in the state of the world. Only actions whose preconditions are satisfied are executed.
It is important to reiterate that actions refer only to the actions that the dialogue agent can take, and that chatbot creation is seen primarily through the lens of the agent’s perspective. User utterances are only handled by deciphering intents as described above.
There are four types of actions:
| type | definition |
| --- | --- |
| dialogue | Actions where the agent utters something to the user. Often the user’s intent is extracted, which is then used to determine the outcome. However, the agent can also utter a message without taking any user input. This happens if you only specify a single outcome for a dialogue action, as the agent knows it will end up in the same place regardless of what the user says, and so skips getting input entirely. |
| system | Actions that are completely internal to the agent, usually changing the value of some context variable based on logic. |
| api | Actions that make API calls, the status of which determines the outcome. NOTE: Still in development. |
| custom | Custom actions created by you, the bot designer. These allow you to create action templates, which speeds up action creation. These are written in Python and stored in the Plan4Dial codebase. These actions will end up being one of the above types, but can be configured in a custom way. |
There is also an important subtype you should know.
The Context dependent determination subtype can only be applied to system actions. Using this subtype indicates that you are going to have mini if-elif statements (called contexts) that determine which outcome is executed. This is different from “vanilla”/non-subtyped system actions, which don’t check any context when activated and simply execute their single outcome.
A context is one (or multiple) settings of context variables. For example, some outcome A could depend on location being “Toronto”, while outcome B could depend on time being “12 pm”.
We will see examples of every type (other than api) and subtype in our day-planner example.
Let’s start by examining a simple dialogue action. We’ll create an action get-have-allergy that asks the user if they have an allergy or not, which expects a simple yes/no response.
actions:
  get-have-allergy:
    type: dialogue
    message_variants:
      - Do you have any allergies? (Y/N)
    condition:
      have_allergy:
        known: false
    effect:
      set-allergy:
        oneof:
          outcomes:
            indicate_allergy:
              updates:
                have_allergy:
                  value: true
                  known: true
              intent: confirm
              follow_up: get-allergy
            indicate_no_allergy:
              updates:
                have_allergy:
                  known: true
                  value: false
                conflict:
                  known: true
                  value: false
              intent: deny
We can see that actions take a number of parameters, including type as discussed above.
message_variants are messages that the agent can utter when this action takes place. This parameter can only be supplied for dialogue actions. You can supply as many messages as you want, and one will be randomly selected at runtime.
The condition is what you would think of as a “precondition” in automated planning. Whatever you supply in the condition is what must be true for the action to take place. This offers a lot more flexibility than determining a hard-coded sequence of actions through a dialogue tree, as you don’t need to know all the details about where exactly in the conversation the action takes place; you only need to know in what states it’s allowed to trigger. This also allows for inserting new actions at any point in development with ease. In this case, the only condition is that we don’t yet know whether the user has an allergy.
The effect is what occurs when the action takes place. It consists of a name (in this case set-allergy), followed by oneof and a list of outcomes. As the names suggest, only one of the outcomes will be executed depending on the factors at play. Each outcome also consists of a name, in this case indicate_allergy and indicate_no_allergy.
Outcomes can take the following parameters; an outcome can use several of them, and needs at least one.
| parameters | definition |
| --- | --- |
| updates | Used in practically every outcome. Here you define the changing values of context variables. You also define how the known status of each variable changes. This is extremely important to do correctly as “knowing what you know” is a huge part of conversation navigation! NOTE: if you want to set the variable to the value taken from the user, precede the variable name with $. |
| intent | Used for dialogue actions with > 1 outcome, where the user’s input will be disambiguated. By specifying the intent, you are indicating that this outcome will be the course of action taken when the user’s input matches that intent. |
| follow_up | Forces a particular action to “follow up” this outcome. This is meant to be situational and not used for every single action, in which case you are essentially building a dialogue tree. |
| response_variants | A response, or message, that the bot will utter after the action has been executed. Any one of the variants will be picked at random at runtime. |
| context | Only used in system actions with the Context dependent determination subtype. Specifies what context must be true in order for the outcome to take place. |
With this in mind, we can see that the outcome indicate_allergy is triggered when the user answers with confirm. The updates indicate that have_allergy is set to a value of true and is now known. We also force a follow_up where we try to determine what the user’s allergy is.
In the outcome indicate_no_allergy, we can see that conflict is set to a value of false. This is because we know that if the user has no allergies, we will never come across a conflict between their allergies and their chosen cuisine.
Next, let’s take a look at the actions that actually extract information from the user. get_outing, the action where we try to extract both the user’s budget and outing preference, is the most comprehensive example:
  get_outing:
    type: custom
    subtype: slot_fill
    parameters:
      action_name: get_outing
      entities:
        - budget
        - outing_type
      overall_intent: share_all_outing_preferences
      message_variants:
        - What kind of outing would you like to go to? Please specify both your budget (high or low) and the type of atmosphere you're looking for (i.e. fun, relaxing, etc.)
      fallback_message_variants:
        - Sorry, that isn't a valid outing preference.
      config_entities:
        budget:
          fallback_message_variants:
            - Sorry, that isn't a valid budget option. Please select either high or low.
          single_slot_message_variants:
            - What is your budget preference? Please select either high or low.
          single_slot_intent: share_budget
        outing_type:
          fallback_message_variants:
            - Sorry, that isn't a valid outing type.
          single_slot_message_variants:
            - What is your preferred outing type? Use a descriptive adjective like fun, high-energy, relaxing, etc.
          single_slot_intent: share_outing_type
          clarify_message_variants:
            - Sorry, I wasn't quite sure about your outing type preference. Did you want a(n) $outing_type atmosphere?
      additional_updates:
        - outcome:
            budget:
              known: true
          response_variants:
            - Ok, I'll take that into account.
        - outcome:
            outing_type:
              known: true
          response_variants:
            - Great choice!
We can see that this action is configured quite differently than the rest - this is because it is a custom action. In this case, the action is built from the slot_fill template, which is provided by default in Plan4Dial. This template allows you to extract any number of entities, and even accounts for all the possible combinations of certainties, i.e. budget is known and outing_type is maybe known, vice versa, etc.
If you go to the source code of the function, you’ll see that the parameters of the custom action are provided under parameters of get_outing. A full explanation of what each parameter is can be seen in the documentation for slot_fill.
The values for location and cuisine are extracted with the same custom action:
  get-location:
    type: custom
    subtype: slot_fill
    parameters:
      action_name: get-location
      overall_intent: share_location
      entities:
        - location
      message_variants:
        - Where are you located?
      fallback_message_variants:
        - Sorry, that isn't a valid location.
      additional_updates:
        - outcome:
            location:
              known: true
          response_variants:
            - Tailoring your results to what's available in $location...
  get-cuisine:
    type: custom
    subtype: slot_fill
    parameters:
      action_name: get-cuisine
      entities:
        - cuisine
      overall_intent: share_cuisine
      message_variants:
        - What is your cuisine of choice? Mexican, Italian, Chinese, and dessert restaurants are in the area.
      fallback_message_variants:
        - Sorry, that isn't a valid cuisine.
      config_entities:
        cuisine:
          clarify_message_variants:
            - I didn't quite get your cuisine preference. Do you want to eat $cuisine?
      additional_updates:
        - outcome:
            cuisine:
              known: true
          response_variants:
            - Cuisine preference has been logged.
Next, let’s take a look at a simple system action our bot will use.
  reset-preferences:
    type: system
    condition:
      conflict:
        known: true
        value: true
    effect:
      reset:
        oneof:
          outcomes:
            reset-values:
              updates:
                have_allergy:
                  known: false
                food_restriction:
                  known: false
                cuisine:
                  known: false
                conflict:
                  known: false
              response_variants:
                - Sorry, but there are no restaurants that match your allergy and cuisine preferences. Try entering a different set of preferences.
We can see that a system action is only concerned with changing the values of some context variables when a given state holds.
The purpose of this action in particular is to reset the user’s inputs for allergies/food restriction as well as cuisine choice and the conflict flag when a conflict has been detected. The response variants indicate what the bot will tell the user after it has performed the action.
Note that since this is a “vanilla” system action, we have only specified one outcome, so the execution of this action is deterministic. We will now see an example where the special subtype of system action uses multiple outcomes.
Let’s take a look at the action check-conflicts:
  check-conflicts:
    type: system
    subtype: Context dependent determination
    condition:
      location:
        known: true
      have_allergy:
        known: true
        value: true
      food_restriction:
        known: true
      cuisine:
        known: true
      conflict:
        known: false
    effect:
      check-conflicts:
        oneof:
          outcomes:
            restriction-dessert:
              updates:
                conflict:
                  known: true
                  value: true
              context:
                cuisine:
                  value: dessert
                food_restriction:
                  value: dairy-free
            restriction-mexican:
              updates:
                conflict:
                  known: true
                  value: true
              context:
                cuisine:
                  value: Mexican
                food_restriction:
                  value: gluten-free
            no-restriction-1:
              updates:
                conflict:
                  known: true
                  value: false
              context:
                cuisine:
                  value: Italian
            no-restriction-2:
              updates:
                conflict:
                  known: true
                  value: false
              context:
                cuisine:
                  value: Chinese
            no-restriction-3:
              updates:
                conflict:
                  known: true
                  value: false
              context:
                cuisine:
                  value: dessert
                food_restriction:
                  value: gluten-free
            no-restriction-4:
              updates:
                conflict:
                  known: true
                  value: false
              context:
                cuisine:
                  value: Mexican
                food_restriction:
                  value: dairy-free
For the sake of making a good example, we have arbitrarily decided that there are two possible conflicts with the user’s choices: there are no gluten-free Mexican restaurants or dairy-free dessert places in the area. With this in mind, we need to check if there’s a conflict with the user’s responses.
The precondition of check-conflicts ensures we’ve gathered all the information on location, food restrictions, and cuisine that the user specified. It also ensures that we don’t know the conflict yet (so we don’t loop back on the same action).
Unlike the first system action example, this action has multiple outcomes. But without any input from the user (which is only taken in dialogue actions), how will the outcome be chosen? The answer lies in the context provided in each outcome. When this type of action is executed, the outcome determiner will run through each outcome and select the one whose context setting is a subset of the current state of the world. In this case, that means setting the value of conflict depending on what combination of input the user entered previously.
NOTE: This specification will become shorter and cleaner with the closing of #4.
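Conceptually, the outcome determiner does something like the following (a simplified sketch, not Plan4Dial's actual implementation): it checks each outcome and picks one whose context assignments all hold in the current state.

# Simplified sketch of context dependent determination (not the real Plan4Dial code).
# `state` maps context variable names to their current values; each outcome's
# "context" maps variable names to the value it requires.
def pick_outcome(state, outcomes):
    for name, spec in outcomes.items():
        context = spec.get("context", {})
        if all(state.get(var) == req["value"] for var, req in context.items()):
            return name
    return None

outcomes = {
    "restriction-dessert": {"context": {"cuisine": {"value": "dessert"},
                                        "food_restriction": {"value": "dairy-free"}}},
    "no-restriction-1": {"context": {"cuisine": {"value": "Italian"}}},
}
print(pick_outcome({"cuisine": "dessert", "food_restriction": "dairy-free"}, outcomes))
# -> restriction-dessert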
And that’s all the action types! Now you have every piece of the puzzle you need to specify your bot. There are a few actions we didn’t cover, but they are all more examples of the above.
Generate the files needed to test the bot with HOVOR.
Once you are satisfied with your specification, call generate_files.
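As a minimal sketch of what that might look like from Python (the import path and arguments here are assumptions; check the Plan4Dial documentation for the real signature):

# Hypothetical usage sketch; verify the actual import path and arguments
# against the Plan4Dial docs before using.
from plan4dial import generate_files  # assumed import location

# Assumed arguments: the bot's YAML config and an output directory for the generated files.
generate_files("day_planner.yml", "output/day_planner")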
Then, clone our extension of IBM’s Hovor.
See the Hovor README for a rundown on the different ways to run and deploy your chatbot.