Tutorial
Requirements - Local Install
Note: Currently, Plan4Dial only supports Linux/WSL due to the restrictions on the RBP planner. Note: If Rasa gives you issues when running, check the latest releases; it may be your version of Python.
It is recommended that you use a venv. Run
pip install -r requirements.txt
to install the necessary libraries before using. Then run
python -m spacy download en_core_web_md
to download the appropriate Spacy model. This is used for NLU along with Rasa.
Requirements - Docker
docker build -t plan4dial:latest .
docker run -it --name plan4dial --rm --volume $(pwd)/plan4dial:/root/app/plan4dial --net=host --privileged plan4dial:latest sh
python plan4dial/main.py gold_standard_bot
Note: You can make changes to Python files inside the subfolder /plan4dial without rebuilding the Docker container, as it is mounted as a simple volume.
Usage Steps
Create a YAML config file that defines your bot.
Let’s go through an example. Suppose we want to create a chatbot that helps you decide what to do on your day off. This will be a fairly simple bot that picks a restaurant and outing location based on the user’s preferences.
(For later reference, the full YAML file as well as the output files can be found here).
First, create a YAML file and pick a name for the domain:
---
name: day-planner
There are three main parts to the YAML file you will need to define: context variables, intents, and actions. We will examine each of these in turn.
1. Define the Context Variables
Context variables are variables that your bot will store throughout the course of the conversation to keep track of the context gathered so far. Without context variables, the bot would not be able to store information from one line of dialogue to the next.
One of the most common use cases of context variables is to store information gathered from the user. However, they can also be used internally to keep track of states in the conversation, the most common example being an indication of when the “goal” is reached.
In this case, we know we want to keep track of the user’s information and preferences, which include their location, phone number, cuisine, allergy/food restrictions, budget, and outing type. Let’s examine the context variables for location, phone number, cuisine, and food restrictions first.
context_variables:
  # user's location
  location:
    type: json
    extraction:
      method: spacy
      config_method: gpe
    known:
      type: fflag
      init: false
    options:
      Toronto:
        variations:
          - downtown
      Kingston:
        variations:
          - k-town
  # user's phone number
  phone_number:
    type: json
    extraction:
      method: regex
      pattern: \d{10}
    known:
      type: fflag
      init: false
    examples:
      - 1234567890
      - 2345678901
      - 3453452794
  # user's preferred cuisine; must map to one of the 4 options
  cuisine:
    type: enum
    known:
      type: fflag
      init: false
    options:
      - Mexican
      - Italian
      - Chinese
      - dessert
  # indicates if the user has an allergy
  have_allergy:
    type: flag
    init: false
    known:
      type: flag
      init: false
  # food restrictions/allergies that the bot can take into account
  food_restriction:
    type: enum
    known:
      type: flag
      init: false
    options:
      - dairy-free
      - gluten-free
We can see that each context variable has been assigned a type: json, enum, and flag respectively. We can also see that another type, fflag, exists under the known section of cuisine (more on this later).
These are the only four types that we can define in the YAML. They are defined as follows:
| type | definition |
| --- | --- |
| flag | A boolean value; can only be set to true or false. |
| fflag | “Fuzzy flag”; can only be set to true, false, or maybe. |
| enum | Can only be set to the values set under the options parameter. |
| json | Used if you want to use an alternate extraction method. NOTE: Currently, only Spacy and regexes are compatible with this option. For Spacy, you can optionally add an options parameter to restrict which extracted values are considered valid. |
So, “location” is of type json because we want to use Spacy GPE for location extraction. (In the case of location, it makes the most sense to use a model finely tuned to detect locations, instead of Rasa, which is trained only on the examples you provide.) You can see that under extraction, we specified both the method spacy and the configuration for NER (named entity recognition), in this case gpe for location. Note that since we specified cities under “options”, only those extracted locations would be viable. However, if we left those out, any city the user entered would be valid.
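For intuition, here is a rough sketch of the kind of extraction Spacy performs for GPE entities (an illustration only; Plan4Dial's actual extraction code may differ):

import spacy

# Requires the model downloaded during setup: python -m spacy download en_core_web_md
nlp = spacy.load("en_core_web_md")

doc = nlp("Can you help me find things to do in Toronto?")
# GPE = geopolitical entity (cities, provinces, countries)
locations = [ent.text for ent in doc.ents if ent.label_ == "GPE"]
print(locations)  # e.g. ['Toronto']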
We can see that the context variable for “phone_number” is configured similarly, although this one uses a simple regex, where the pattern is specified under pattern. Note that we still supply a few examples for Rasa’s training process.
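To see what that regex pattern is doing, here is a minimal sketch of matching a 10-digit phone number (an illustration only, not Plan4Dial's internals):

import re

# The same pattern used in the YAML above
PHONE_PATTERN = re.compile(r"\d{10}")

match = PHONE_PATTERN.search("Sure, my number is 1234567890.")
phone_number = match.group(0) if match else None
print(phone_number)  # 1234567890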
cuisine is of type enum because we only want it to have 4 valid values: Mexican, Italian, Chinese, and dessert. food_restriction is of type enum for the same reason.
have_allergy, which determines if the user has an allergy (in which case we need to get their food_restriction), is of type flag. That is, they either do or don’t have an allergy. For this variable, you can also see that it has an init option. This is only available to flag or fflag type variables, and it allows you to set an initial value for the variable and change the initial state of the conversation. In this case, we default the variable to false.
Each variable also has a known option which determines the knowledge we have about the variable. The known type can only be set to either flag or fflag, and functions in the same way. This parameter is extremely important as conversation navigation is often predicated on what context we know, maybe know, or don’t know so far.
In most cases, the known’s init setting is set to false, but the type setting depends on what makes the most sense for the variable. Often in the case of enum type variables like cuisine, it makes the most sense to allow for a little variance in user input. They may say something that somewhat resembles one of the available options, and it is helpful to store their answer, classify it as “maybe known”, and clarify the user’s intention. For simpler variables like have_allergy, a known type setting of flag should suffice.
With this in mind, let’s add the rest of the context variables.
  # possible budget options
  budget:
    type: enum
    known:
      type: flag
      init: false
    options:
      - low
      - high
  # user's outing preferences
  outing_type:
    type: enum
    known:
      type: fflag
      init: false
    options:
      high-energy:
        variations:
          - fun
          - exciting
          - social
      low-energy:
        variations:
          - chill
          - relaxing
          - laid-back
  # activated if there is a conflict between the user's cuisine preference and food restrictions
  conflict:
    type: flag
    init: false
    known:
      type: flag
      init: false
  # possible restaurant options
  restaurant:
    type: enum
    known:
      type: flag
      init: false
    options:
      - Guac Grill
      - Alfredo's Pizza Café
      - Mandarin
      - Geneva Crepes
  # possible outing options
  outing:
    type: enum
    known:
      type: flag
      init: false
    options:
      - Stages
      - Stauffer Library
      - Broadway Theater
      - Smith's Golfing Club
  # ends the conversation if true
  goal:
    type: flag
    init: false
    known:
      type: flag
      init: false
While most of this you’ve already seen, let’s draw attention to a couple of things.
In outing_type, we’ve supplied some variations under the options the user can provide. These indicate that if the user utters any of the variations, the bot will map the user’s utterance back to the original option. While I’ve only given a few examples for simplicity, it is extremely important to supply lots of training examples to make your model more robust.
There is an exception to this rule, though. In the case of outing, although the variable is of type enum, the variable value will be set internally based on the user’s preferences instead of through directly analyzing the user’s input. Since this will be completely in the control of the bot designer and not reliant on the NLU, no variations need to be provided there.
Also, a flag goal variable is mandatory for every bot as it determines when the conversation ends. When you want the outcome of an action to end the conversation, you should set goal to true.
You’re all set to define context variables for your bot! Let’s move on to the next step: intents.
2. Define the Intents
The next step is to define the intents. Intents are characterizations of what the user is trying to say. For example, if the user says “yes”, then their intent is to “confirm” the bot’s statement. Intents are parsed/analyzed using Rasa NLU. They are important as we need to be able to map arbitrary user input to tangible results that determine where to go next in the conversation. NOTE: We do not use Rasa for anything other than off-the-shelf NLU, to reduce dependency on the system.
An intent is made up of these parts:
1. utterances: Examples of utterances that constitute that intent. Similar to context variable variations, it is best to supply as many of these as you can, as these will be passed off to Rasa as training examples. Ideally, you shouldn’t have intents with utterances that are too similar to one another, as this will make it harder for the model to pinpoint what the user wants.
2. entities: (Optional) Any entities that are extracted with this intent. Entities are variables that are extracted from the user. Within the intent, each entity must be preceded with a $ symbol to indicate the location of the entity in the utterance (see the sketch below).
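For intuition only, here is a hypothetical sketch (not Plan4Dial's actual code) of how a $-prefixed placeholder in an utterance could be expanded into a Rasa-style annotated training example:

# Hypothetical illustration: expand "$location" into Rasa's [value](entity) annotation.
utterance = "I live in $location."
entity, example_value = "location", "Toronto"

annotated = utterance.replace(f"${entity}", f"[{example_value}]({entity})")
print(annotated)  # I live in [Toronto](location).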
Let’s see what the intents for our day-planner bot look like:
intents:
  confirm:
    utterances:
      - "yes"
      - yeah
      - that's it
      - "Y"
      - mhm
      - confirm
      - yes please
  deny:
    utterances:
      - "no"
      - not at all
      - that's not what i meant
      - absolutely not
      - i don't want that
      - nah
      - no thanks
      - no thank you
  share_phone_number:
    entities:
      - phone_number
    utterances:
      - My phone number is $phone_number.
      - My number is $phone_number.
      - $phone_number
  share_location:
    entities:
      - location
    utterances:
      - I live in $location.
      - I am located in $location.
      - Can you help me find things to do in $location?
  share_cuisine:
    entities:
      - cuisine
    utterances:
      - I want to eat $cuisine.
      - Do you have restaurants of type $cuisine?
      - Are there any $cuisine restaurants in the area?
  share_allergies:
    entities:
      - food_restriction
    utterances:
      - I have to eat $food_restriction.
      - I can only eat foods that are $food_restriction.
      - I am allergic to any foods that are not $food_restriction.
  share_all_outing_preferences:
    entities:
      - budget
      - outing_type
    utterances:
      - I have a $budget budget and I would prefer a $outing_type atmosphere today.
      - I am operating within a $budget budget and I want to go to a $outing_type place.
      - I can do activities with a $budget budget and I want to find the most $outing_type place in the city.
  share_budget:
    entities:
      - budget
    utterances:
      - I have a $budget budget.
      - I am operating within a $budget budget.
      - I can do activities with a $budget budget.
  share_outing_type:
    entities:
      - outing_type
    utterances:
      - I would prefer a $outing_type atmosphere today.
      - I want to go to a $outing_type place.
      - What is the most $outing_type place in the city?
NOTE: All utterances must include exactly the entities listed under entities; no more, no less. In practice, this does not mean that all entities will actually be extracted at runtime, but it needs to be indicated what the intent is actually trying to accomplish.
3. Define the Actions
actions are the core of dialogue agent design as they specify what your agent can do and when. We use a declarative specification powered by automated planning that allows you to treat actions as separate pieces of a puzzle. You won’t have to draw out complex dialogue trees that you will have to completely dismantle if you decide late in the game that you want to add a new action near the top. Instead, actions are chosen based on what is true in the state of the world. Only actions whose preconditions are satisfied are executed.
It is important to reiterate that actions refer only to the actions that the dialogue agent can take, and that chatbot creation is seen primarily through the lens of the agent’s perspective. User utterances are only handled by deciphering intents as described above.
There are four types of actions:
| type | definition |
| --- | --- |
| dialogue | Actions where the agent utters something to the user. Often the user’s intent is extracted, which is then used to determine the outcome. However, the agent can also utter a message without taking any user input. This happens if you only specify a single outcome for a dialogue action, as the agent knows it will end up in the same place regardless of what the user says, and so skips getting input entirely. |
| system | Actions that are completely internal to the agent, usually changing the value of some context variable based on logic. |
| api | Actions that make API calls, the status of which determines the outcome. NOTE: Still in development. |
| custom | Custom actions created by you, the bot designer. These allow you to create action templates, which speeds up action creation. These are written in Python and stored in the Plan4Dial codebase. These actions will end up being one of the above types, but can be configured in a custom way. |
There is also an important subtype you should know.
The Context dependent determination subtype can only be applied to system actions. Using this subtype indicates that you are going to have mini if-elif statements (called contexts) that determine which outcome is executed. This is different from “vanilla”/non-subtyped system actions, which don’t check any context when activated and simply execute their single outcome.
A context is one (or multiple) settings of context variables. For example, some outcome A could depend on location being “Toronto”, while outcome B could depend on time being “12 pm”.
We will see examples of every type (other than api) and subtype in our day-planner example.
Let’s start by examining a simple dialogue action. We’ll create an action get-have-allergy that asks the user if they have an allergy or not, which expects a simple yes/no response.
actions:
  get-have-allergy:
    type: dialogue
    message_variants:
      - Do you have any allergies? (Y/N)
    condition:
      have_allergy:
        known: false
    effect:
      set-allergy:
        oneof:
          outcomes:
            indicate_allergy:
              updates:
                have_allergy:
                  value: true
                  known: true
              intent: confirm
              follow_up: get-allergy
            indicate_no_allergy:
              updates:
                have_allergy:
                  known: true
                  value: false
                conflict:
                  known: true
                  value: false
              intent: deny
We can see that actions take a number of parameters, including type as discussed above.
message_variants are messages that the agent can utter when this action takes place. This parameter can only be supplied for dialogue actions. You can supply as many messages as you want, and one will be randomly selected at runtime.
The condition is what you would think of as a “precondition” in automated planning. Whatever you supply in the condition is what must be true for the action to take place. This offers a lot more flexibility than determining a hard-coded sequence of actions through a dialogue tree, as you don’t need to know all the details about where exactly in the conversation the action takes place; you only need to know in what states it’s allowed to trigger. This also allows for inserting new actions at any point in development with ease. In this case, the only condition is that we don’t yet know whether the user has an allergy.
The effect is what occurs when the action takes place. It consists of a name (in this case set-allergy), followed by oneof and a list of outcomes. As the names suggest, only one of the outcomes will be executed depending on the factors at play. Each outcome also consists of a name, in this case indicate_allergy and indicate_no_allergy.
Outcomes can take the following parameters; an outcome can use several of them, and needs at least one.
| parameters | definition |
| --- | --- |
| updates | Used in practically every outcome. Here you define the changing values of context variables. You also define how the known status of each variable changes. This is extremely important to do correctly as “knowing what you know” is a huge part of conversation navigation! NOTE: if you want to set the variable to the value taken from the user, precede the variable name with $. |
| intent | Used for dialogue actions with > 1 outcome, where the user’s input will be disambiguated. By specifying the intent, you are indicating that this outcome will be the course of action taken when the user’s input matches that intent. |
| follow_up | Forces a particular action to “follow up” this outcome. This is meant to be situational and not used for every single action, in which case you are essentially building a dialogue tree. |
| response_variants | A response, or message, that the bot will utter after the action has been executed. Any one of the variants will be picked at random at runtime. |
| context | Only used in system actions with the Context dependent determination subtype. Specifies what context must be true in order for the outcome to take place. |
With this in mind, we can see that the outcome indicate_allergy is triggered when the user answers with confirm. The updates indicate that have_allergy is set to a value of true and is now known. We also force a follow_up where we try to determine what the user’s allergy is.
In the outcome indicate_no_allergy, we can see that conflict is set to a value of false. This is because we know that if the user has no allergies, we will never come across a conflict between their allergies and their chosen cuisine.
Next, let’s take a look at the actions that actually extract information from the user. get_outing, the action where we try to extract both the user’s budget and outing preference, is the most comprehensive example:
  get_outing:
    type: custom
    subtype: slot_fill
    parameters:
      action_name: get_outing
      entities:
        - budget
        - outing_type
      overall_intent: share_all_outing_preferences
      message_variants:
        - What kind of outing would you like to go to? Please specify both your budget (high or low) and the type of atmosphere you're looking for (i.e. fun, relaxing, etc.)
      fallback_message_variants:
        - Sorry, that isn't a valid outing preference.
      config_entities:
        budget:
          fallback_message_variants:
            - Sorry, that isn't a valid budget option. Please select either high or low.
          single_slot_message_variants:
            - What is your budget preference? Please select either high or low.
          single_slot_intent: share_budget
        outing_type:
          fallback_message_variants:
            - Sorry, that isn't a valid outing type.
          single_slot_message_variants:
            - What is your preferred outing type? Use a descriptive adjective like fun, high-energy, relaxing, etc.
          single_slot_intent: share_outing_type
          clarify_message_variants:
            - Sorry, I wasn't quite sure about your outing type preference. Did you want a(n) $outing_type atmosphere?
      additional_updates:
        - outcome:
            budget:
              known: true
          response_variants:
            - Ok, I'll take that into account.
        - outcome:
            outing_type:
              known: true
          response_variants:
            - Great choice!
We can see that this action is configured quite differently than the rest - this is because it is a custom action. In this case, the action is built from the slot_fill template, which is provided by default in Plan4Dial. This template allows you to extract any number of entities, and even accounts for all the possible combinations of certainties, i.e. budget is known and outing_type is maybe known, vice versa, etc.
If you go to the source code of the function, you’ll see that the parameters of the custom action are provided under parameters of get_outing. A full explanation of what each parameter is can be seen in the documentation for slot_fill.
The values for location and cuisine are extracted with the same custom action:
  get-location:
    type: custom
    subtype: slot_fill
    parameters:
      action_name: get-location
      overall_intent: share_location
      entities:
        - location
      message_variants:
        - Where are you located?
      fallback_message_variants:
        - Sorry, that isn't a valid location.
      additional_updates:
        - outcome:
            location:
              known: true
          response_variants:
            - Tailoring your results to what's available in $location...
  get-cuisine:
    type: custom
    subtype: slot_fill
    parameters:
      action_name: get-cuisine
      entities:
        - cuisine
      overall_intent: share_cuisine
      message_variants:
        - What is your cuisine of choice? Mexican, Italian, Chinese, and dessert restaurants are in the area.
      fallback_message_variants:
        - Sorry, that isn't a valid cuisine.
      config_entities:
        cuisine:
          clarify_message_variants:
            - I didn't quite get your cuisine preference. Do you want to eat $cuisine?
      additional_updates:
        - outcome:
            cuisine:
              known: true
          response_variants:
            - Cuisine preference has been logged.
Next, let’s take a look at a simple system action our bot will use.
  reset-preferences:
    type: system
    condition:
      conflict:
        known: true
        value: true
    effect:
      reset:
        oneof:
          outcomes:
            reset-values:
              updates:
                have_allergy:
                  known: false
                food_restriction:
                  known: false
                cuisine:
                  known: false
                conflict:
                  known: false
              response_variants:
                - Sorry, but there are no restaurants that match your allergy and cuisine preferences. Try entering a different set of preferences.
We can see that a system action is only concerned with changing the values of some context variables when a given state holds.
The purpose of this action in particular is to reset the user’s inputs for allergies/food restriction as well as cuisine choice and the conflict flag when a conflict has been detected. The response variants indicate what the bot will tell the user after it has performed the action.
Note that since this is a “vanilla” system action, we have only specified one outcome, so the execution of this action is deterministic. We will now see an example where the special subtype of system action uses multiple outcomes.
Let’s take a look at the action check-conflicts:
  check-conflicts:
    type: system
    subtype: Context dependent determination
    condition:
      location:
        known: true
      have_allergy:
        known: true
        value: true
      food_restriction:
        known: true
      cuisine:
        known: true
      conflict:
        known: false
    effect:
      check-conflicts:
        oneof:
          outcomes:
            restriction-dessert:
              updates:
                conflict:
                  known: true
                  value: true
              context:
                cuisine:
                  value: dessert
                food_restriction:
                  value: dairy-free
            restriction-mexican:
              updates:
                conflict:
                  known: true
                  value: true
              context:
                cuisine:
                  value: Mexican
                food_restriction:
                  value: gluten-free
            no-restriction-1:
              updates:
                conflict:
                  known: true
                  value: false
              context:
                cuisine:
                  value: Italian
            no-restriction-2:
              updates:
                conflict:
                  known: true
                  value: false
              context:
                cuisine:
                  value: Chinese
            no-restriction-3:
              updates:
                conflict:
                  known: true
                  value: false
              context:
                cuisine:
                  value: dessert
                food_restriction:
                  value: gluten-free
            no-restriction-4:
              updates:
                conflict:
                  known: true
                  value: false
              context:
                cuisine:
                  value: Mexican
                food_restriction:
                  value: dairy-free
For the sake of making a good example, we have arbitrarily decided that there are two possible conflicts with the user’s choices: there are no gluten-free Mexican restaurants or dairy-free dessert places in the area. With this in mind, we need to check if there’s a conflict with the user’s responses.
The precondition of check-conflicts ensures we’ve gathered all the information on location, food restrictions, and cuisine that the user specified. It also ensures that we don’t know the conflict yet (so we don’t loop back on the same action).
Unlike the first system action example, this action has multiple outcomes. But without any input from the user (which is only taken in dialogue actions), how will the outcome be chosen? The answer lies in the context provided in each outcome. When this type of action is executed, the outcome determiner will run through each outcome and select the one whose context setting is a subset of the current state of the world. In this case, that means setting the value of conflict depending on what combination of input the user entered previously.
NOTE: This specification will become shorter and cleaner with the closing of #4.
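Conceptually, the outcome determiner does something like the following (a simplified sketch, not Plan4Dial's actual implementation): it checks each outcome and picks one whose context assignments all hold in the current state.

# Simplified sketch of context dependent determination (not the real Plan4Dial code).
# `state` maps context variable names to their current values; each outcome's
# "context" maps variable names to the value it requires.
def pick_outcome(state, outcomes):
    for name, spec in outcomes.items():
        context = spec.get("context", {})
        if all(state.get(var) == req["value"] for var, req in context.items()):
            return name
    return None

outcomes = {
    "restriction-dessert": {"context": {"cuisine": {"value": "dessert"},
                                        "food_restriction": {"value": "dairy-free"}}},
    "no-restriction-1": {"context": {"cuisine": {"value": "Italian"}}},
}
print(pick_outcome({"cuisine": "dessert", "food_restriction": "dairy-free"}, outcomes))
# -> restriction-dessert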
And that’s all the action types! Now you have every piece of the puzzle you need to specify your bot. There are a few actions we didn’t cover, but they are all more examples of the above.
Generate the files needed to test the bot with HOVOR.
Once you are satisfied with your specification, call generate_files.
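As a minimal sketch of what that might look like from Python (the import path and arguments here are assumptions; check the Plan4Dial documentation for the real signature):

# Hypothetical usage sketch; verify the actual import path and arguments
# against the Plan4Dial docs before using.
from plan4dial import generate_files  # assumed import location

# Assumed arguments: the bot's YAML config and an output directory for the generated files.
generate_files("day_planner.yml", "output/day_planner")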
Then, clone our extension of IBM’s Hovor.
See the Hovor README for a rundown on the different ways to run and deploy your chatbot.