This post will help you get started with Rasa, an open source framework for building contextual virtual assistants. I take an opinionated standpoint, sharing what I found worked best for cutting down on time and effort. Among other things, I’ll show you how to work around some of the limitations and use this awesome framework to the max.

What follows is meant to get you started with Rasa as quickly as possible, so that you can start building virtual assistants on your own in no time.

TL;DR

Thinking of you, busy developers, here is what you are looking at in terms of contents.

All of the examples can be found on GitLab.

Be Smart about Installing Rasa

I am strongly in favour of using Docker, mainly to avoid setup issues, but there are other benefits too. The first is consistency: your Docker image can be deployed to someone else’s machine and will function independently of their local environment. Another advantage is freezing library dependencies. As Rasa evolves, there will be breaking changes, but you might still want to preserve an older version of your chatbot. Hence you could have different Docker images for different Rasa versions.

Without further ado, here is how to install the latest stable version of Rasa with Docker. Note how I use an alias to make it easy to use.

Note that explicitly passing --user 1000:1000 runs the container as the current user. That’s necessary because files and directories created in the Docker container are owned by root by default. To avoid running into permission issues when modifying the content, you want to ensure non-root ownership.

I’ve created an installation script that allows you to automate the whole process.

If you are not convinced Docker would serve you well, you can follow YouTube tutorials on how to install Rasa on different operating systems.

First Steps – Don’t Start from Scratch!

So far, you’ve installed Rasa and initialised the default project. Let’s have a closer look at the init command.

rasa init without any additional parameters will ask you to confirm every single step.

Now let's start! 👇🏽

? Please enter a path where the project will be created [default: current directory] .
Created project directory at '/app'.
Finished creating project structure.
? Do you want to train an initial model? 💪🏽  Yes
Training an initial model...

Since the default options are just fine most of the time, you might as well silently accept them by adding the --no-prompt option.

Another useful option is --init-dir. It comes in handy when you want to create a project elsewhere in your filesystem.

rasa init --init-dir /opt/rasa/myproject

Once you’ve trained your model, you probably want to quickly test a few interactions with the assistant. I like to use the --debug option, because it prints out info about recognised entities.

rasa shell --debug

Transition to HTTP API

Trying the assistant on the command line works for local testing, but ultimately you will want to talk to your bot over the wire, either because you want to build a web front-end or because you want to use other channels that also run over HTTP.

Rasa provides a turn-key solution:

rasa run -m models --enable-api --log-file out.log

Since we are using Docker, it’s important to set up port forwarding. I suggest taking it a step further and leveraging Docker Compose. Here is the docker-compose.yml:

version: '3.0'
services:
  rasa:
    image: rasa/rasa:1.10.1-full
    ports:
      - 5005:5005
    volumes:
      - ./:/app
    command: ["run", "-m", "models", "--enable-api", "--log-file", "out.log"]

Starting the service boils down to simply running:

docker compose up

More importantly though, it’s going to be much easier to add new services and integrations as you develop and grow your virtual assistant.

Now you can test your model by exchanging messages with the Rasa agent over HTTP like this:

curl --location --request POST 'http://localhost:5005/model/parse' \
--header 'Content-Type: application/json' \
--data-raw '{
  "sender": "Rasa",
  "text": "Hi!"
}'

The response is the same as what you’d see in the shell, i.e. a summary of recognised intents and entities.

{
    "intent": {
        "name": "greet",
        "confidence": 0.9991136193275452
    },
    "entities": [],
    "intent_ranking": [
        {
            "name": "greet",
            "confidence": 0.9991136193275452
        },
        {
            "name": "mood_great",
            "confidence": 0.00038153654895722866
        }
        // etc. 
    ],
    "text": "Hi!"
}
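If you would rather call the endpoint from Python than from curl, here is a minimal sketch using only the standard library. It mirrors the curl payload above and assumes the server from the Compose file is listening on localhost:5005. The helper names build_parse_request, parse_message and top_intent are my own and are not part of Rasa.

```python
import json
import urllib.request

# Assumption: the Rasa server from the Compose file is reachable here.
PARSE_URL = "http://localhost:5005/model/parse"


def build_parse_request(text, url=PARSE_URL):
    """Build a POST request with the same JSON body as the curl example."""
    body = json.dumps({"sender": "Rasa", "text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_message(text):
    """Send the text to the running server and return the decoded JSON."""
    with urllib.request.urlopen(build_parse_request(text)) as response:
        return json.load(response)


def top_intent(parse_result):
    """Return (name, confidence) of the winning intent from a parse result."""
    intent = parse_result["intent"]
    return intent["name"], intent["confidence"]


# Offline demonstration on a response shaped like the one shown above.
sample = {
    "intent": {"name": "greet", "confidence": 0.9991136193275452},
    "entities": [],
    "text": "Hi!",
}
print(top_intent(sample))
```

With the server running, top_intent(parse_message("Hi!")) gives you the winning intent in one call.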

Loading Multiple Models from the Same Project

Imagine you wanted to have a bilingual shop assistant. You could keep your English and Spanish conversation scripts separate and distinguish the intents only by a suffix: “intent: greet” vs “intent: greet_es”. Then you compile a single model and voilà, your bot speaks two languages!

But hang on! Using suffixes can get in the way. Imagine you are running two businesses, a hotel and a restaurant, like in this discussion. Surely, there will be overlapping intents: “greet”, “place order” etc. Another problem could be the size of your model. Perhaps your model is huge and you only ever want to load the relevant data into the server’s memory. Even worse, when your model contains overlapping intents there will be collisions and your bot’s responses won’t make sense.

What if you genuinely need to load different models on demand?

Well, loading different models on demand does work when parsing a message (credit: Rasa forum):

http://localhost:5005/model/parse?q=hi&project=hotel
http://localhost:5005/model/parse?q=hi&project=restaurant

However ..

No messages will be stored to a conversation and no action will be run. This will just retrieve the NLU parse results.

Source: Rasa Docs

So, how do you make your bot intelligently choose the model according to what the user is saying? You can’t. At least not for now. The core framework doesn’t support loading different models on demand out of the box.

That’s when Rasa’s API comes into the picture. With a little bit of code you can spin up multiple agents and let them each handle a different model.

I’ve prepared a complete example – go check it out.

I tried to come up with the simplest possible solution. There is a single endpoint for exchanging messages back and forth. In production, you’d want to leverage async processing and maintain the conversation state over a web socket. That’s a topic for another time. Let me know in the comments section below; I will answer your questions and follow up with another blog post.

Nevertheless, the gist of serving different models in the simplest case is as follows.

First of all, you stop relying on Rasa’s out-of-the-box server. Instead, you use Rasa’s API internally in your own web server.

version: '3.0'
services:
  web:
    build: .
    ports:
      - "5000:5000"
    command: ["python", "app.py"]
  
  ####
  # THIS IS NO LONGER NEEDED. YOU CAN DELETE THIS ENTIRE SECTION.
  ####
  # rasa:
  #   image: rasa/rasa:1.10.1-full
  #   ports:
  #     - 5005:5005
  #   volumes:
  #     - ./:/app
  #   command: ["run", "-m", "models", "--enable-api", "--log-file", "out.log"]

This approach gives you greater flexibility and allows you to swap Rasa for a different tool, should there be such a need in the future.

Suppose you are building a bilingual chatbot. Once you have written the conversation scripts, you want to be able to easily tell the compiled models apart. Model names are based on a timestamp by default, but here is how you can enforce your own naming strategy.

rasa train \
--config config/en/config.yml \
--domain config/en/domain.yml \
--data data/en \
--out models \
--fixed-model-name model-en

Result, once the command has been run for both languages:

/app/models
├── model-en.tar.gz
└── model-es.tar.gz
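To get both archives you run the same command once per language. Here is a small sketch that builds the command for a given language code and runs it. It assumes the Spanish files mirror the English layout (config/es, data/es) and that the rasa command is available wherever the script runs. The function names train_command and train_all are mine.

```python
import subprocess


def train_command(lang):
    """Return the `rasa train` argument list for one language code."""
    return [
        "rasa", "train",
        "--config", "config/{}/config.yml".format(lang),
        "--domain", "config/{}/domain.yml".format(lang),
        "--data", "data/{}".format(lang),
        "--out", "models",
        "--fixed-model-name", "model-{}".format(lang),
    ]


def train_all(languages, run=subprocess.run):
    """Train one model per language; `run` is injectable for dry runs."""
    for lang in languages:
        run(train_command(lang), check=True)


# Dry run: print the commands instead of executing them.
train_all(["en", "es"], run=lambda cmd, check: print(" ".join(cmd)))
```

Calling train_all(["en", "es"]) without the run argument executes the commands for real and stops at the first failure.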

Now you are left with writing your custom piece of code leveraging Rasa’s agent API.

First, you import the agent and let it load a language model on demand.

from rasa.core.agent import Agent

def __create_agent(lang):
    model_path = '/app/models/model-{}.tar.gz'.format(lang)
    return Agent.load(model_path=model_path)

Next, when starting the web server, spin up two agents. Each of them deals with a single language model. Keeping the agents in a dictionary allows you to easily pick the right one by language code.

agents = {
    "en": __create_agent("en"),
    "es": __create_agent("es")
}
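The dictionary above loads every model at startup. If memory is a concern, you could defer loading until a language is first requested. The sketch below shows one way to do that. LazyAgents and fake_loader are hypothetical names, and fake_loader stands in for the real loading function so that the example runs without Rasa installed.

```python
class LazyAgents:
    """Load one agent per language on first use and cache it afterwards."""

    def __init__(self, loader):
        self._loader = loader  # callable: language code -> agent
        self._agents = {}

    def get(self, lang):
        # Load on first request only; later calls reuse the cached agent.
        if lang not in self._agents:
            self._agents[lang] = self._loader(lang)
        return self._agents[lang]

    def loaded(self):
        """Language codes whose agents are currently in memory."""
        return sorted(self._agents)


def fake_loader(lang):
    # Stand-in for the real model loading, which needs Rasa and a model file.
    return "agent for /app/models/model-{}.tar.gz".format(lang)


lazy_agents = LazyAgents(fake_loader)
print(lazy_agents.loaded())  # nothing has been loaded yet
lazy_agents.get("en")
print(lazy_agents.loaded())  # only the English agent is in memory
```

In a real server you would pass your own loading function instead of fake_loader and decide how to handle a language code that has no model file.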

Now, you just need to expose an endpoint so that you can exchange messages with the bot like this:

http://localhost:5000?message=hi&lang=en
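When the message contains spaces or accented characters, it has to be URL-encoded. Here is a small client-side sketch that builds such a URL. It also passes the sid parameter, which the endpoint reads as the sender id. The helper name build_chat_url and the default sid value are made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: the web server from the Compose file listens on this address.
BASE_URL = "http://localhost:5000/"


def build_chat_url(message, lang="en", sid="demo-user"):
    """Return the endpoint URL with a properly encoded query string."""
    query = urlencode({"message": message, "lang": lang, "sid": sid})
    return "{}?{}".format(BASE_URL, query)


print(build_chat_url("hi"))
print(build_chat_url("¿qué tal?", lang="es", sid="user-42"))
```

Using a distinct sid per user keeps conversations apart, since the agent tracks conversation state per sender id.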

The final piece of code exposes the endpoint, extracts the client’s message and consults the right agent based on the lang parameter. There is one complication: the agent’s methods are coroutines, so messages are handled asynchronously and your server can keep dispatching other requests in the meantime. In a production system you want to embrace this pattern and establish a communication channel via a web socket, for example.

However, in our trivial example the communication is synchronous. That’s why we run the agent’s coroutine on an event loop and block until its response is available.

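import asyncio
from flask import Flask, request  # assuming Flask, as the route decorator suggests

app = Flask(__name__)
loop = asyncio.get_event_loop()  # used to run the agent's coroutines synchronously

@app.route('/')
def user_uttered():
    sid = str(request.args.get('sid'))
    message = str(request.args.get('message'))
    lang = str(request.args.get('lang', 'en'))
    agent = agents.get(lang)
    if agent is None:
        # Unknown language code: report it instead of calling a missing agent
        return 'Language {} is not supported'.format(lang), 400
    # handle_text is a coroutine; block until the agent has responded
    bot_response = loop.run_until_complete(
        agent.handle_text(text_message=message, sender_id=sid)
    )
    return ', '.join(map(str, bot_response))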
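To see the blocking call in isolation, here is a sketch that runs without Rasa or a web server. fake_handle_text is a hypothetical stand-in for the agent’s handle_text, and the shape of its return value (a list of dictionaries with recipient_id and text keys) is my assumption about what an agent returns. Unlike the handler above, this version joins only the text of each response instead of the whole dictionary.

```python
import asyncio


async def fake_handle_text(text_message, sender_id):
    """Hypothetical stand-in for agent.handle_text; echoes the message."""
    await asyncio.sleep(0)  # yield control once, as real async I/O would
    return [{"recipient_id": sender_id, "text": "You said: " + text_message}]


def ask_sync(loop, message, sid):
    """Block the caller until the coroutine has produced its result."""
    responses = loop.run_until_complete(
        fake_handle_text(text_message=message, sender_id=sid)
    )
    # Keep only the text of each response rather than the whole dict.
    return ", ".join(r["text"] for r in responses if "text" in r)


loop = asyncio.new_event_loop()
try:
    reply = ask_sync(loop, "hi", "user-1")
finally:
    loop.close()
print(reply)
```

The caller of ask_sync waits for the whole exchange to finish, which is fine for a demo but is exactly what you want to avoid under real load.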

That’s it. The code below is a complete example of what we have just discussed.

In production, you would want to use asynchronous communication over web sockets.

Summary

Thanks for reading to the end and definitely let me know your feedback in the comment section below. I promise I’ll address all of your questions.

So here are the takeaway points:

  • Don’t start from scratch, use rasa init
  • Avoid dependencies on your local environment, use Docker instead
  • Understand the key concepts: intents, entities, actions
  • Single model is supported out of the box
  • It’s not too hard to handle multiple models and load them on demand

Don’t forget to check out the repo and give it a star if you find the examples helpful. Thanks for reading and stay tuned for my next tips in the Rasa series.


Tomas Zezula
