You put all the effort into writing your own chatbot with Rasa. You spent hours or even days training and testing your model. Now you want to make it publicly accessible and keep the cost down. Where should you run the action server? What if your bot is multilingual and you need to spin up additional agents on demand? This post addresses these concerns and shows you how to package your solution as if it were a simple web application.

If you haven’t followed this series: a while ago, I wrote a coffee bot. Go check it out and explore the repo. It is a prototype and demo of a multilingual chatbot.

Supporting a single model is a piece of cake

Initially, I didn’t think much about the deployment. I was excited that the framework (Rasa) conveniently provided the essential components out of the box. Starting the agent and getting the custom actions going was as easy as running rasa run and rasa run actions. I only had to write a bit of routing logic to handle client-server communication via WebSocket. Obviously, I still had to implement the frontend, but that’s a topic for another post. Stitching it all together was fairly quick and easy. It looked like this:

Rasa chatbot as a web application

Multiple models threaten scalability

Multiple languages mean multiple trained models. That’s when leveraging ready-to-use components came at a cost. As a quick prototype, I simply started multiple instances of my backend, one per model, and it worked fine. Except it wouldn’t scale: every additional language meant another web server to run and pay for. A picture is worth a thousand words:

Convenient, but naive – a standalone web server per trained model. It won’t scale.

Programmatic approach – load your agents on demand!

This realisation made me delve into the Rasa Agent API, and soon enough I was able to spin up a new agent on demand. Programmatically. Within a single instance of a web server!

It doesn’t seem like much, but this subtle shift in my approach to instantiating multiple agents allowed me to come up with a fairly slick design shown below.

A scalable approach to having multiple language models.
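To make the idea more concrete, here is a minimal sketch of loading agents on demand inside a single process. It is not the code from the coffee bot repo. The AgentRegistry class, the load_rasa_agent helper and the model paths are hypothetical names made up for illustration. The sketch assumes that Agent.load and the asynchronous agent.handle_text from rasa.core.agent behave as they do in the Rasa versions I have used, so check them against your installed version. Wiring up the action server endpoint is left out.

```python
import asyncio
from typing import Any, Callable, Dict, List


def load_rasa_agent(model_path: str) -> Any:
    """Load a trained Rasa model from disk.

    Rasa is imported lazily, so the registry itself can be used (and tested)
    without Rasa installed. Assumes Agent.load accepts a path to a model.
    """
    from rasa.core.agent import Agent  # requires the rasa package

    return Agent.load(model_path)


class AgentRegistry:
    """Hypothetical helper: one agent per language, loaded on first use."""

    def __init__(
        self,
        model_paths: Dict[str, str],
        loader: Callable[[str], Any] = load_rasa_agent,
    ) -> None:
        self._model_paths = model_paths  # e.g. {"en": "models/en.tar.gz"}
        self._loader = loader
        self._agents: Dict[str, Any] = {}  # cache of already loaded agents

    def get(self, language: str) -> Any:
        """Return the agent for a language, loading its model only once."""
        if language not in self._model_paths:
            raise KeyError(f"No model configured for language '{language}'")
        if language not in self._agents:
            self._agents[language] = self._loader(self._model_paths[language])
        return self._agents[language]

    async def handle(self, language: str, text: str, sender_id: str) -> List[dict]:
        """Route a user message to the agent that speaks the given language."""
        agent = self.get(language)
        # Assumes the agent exposes an async handle_text(text, sender_id=...).
        return await agent.handle_text(text, sender_id=sender_id)
```

With something like this in place, the web server only needs one registry instance. A WebSocket handler can call await registry.handle("en", message, session_id), and a model is loaded into memory the first time somebody actually chats in that language.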

Having a single web server made my bot easily deployable. I looked into different platforms, including Google and Amazon. In the end, I decided to use Heroku for its cost efficiency and simplicity.

Have you struggled with getting your bot publicly accessible? Have you solved a similar challenge differently? Is there anything in particular you would like to know? Let me know your experience and questions in the comment section below.

Thanks for reading, get in touch and stay tuned for my next post, where I show you how to package and configure your Rasa bot for a deployment on Heroku’s free tier.


Tomas Zezula
