At Haptik we design, develop and test chatbots every day. These bots span multiple domains, a few examples being feedback bots, support bots, lead-generation bots, etc. Some of these bots are text-based, some voice-based, and some support a mix of both.
Our objective is not just to take bots live in production, but also to make them robust and fail-proof. The onus is on the QA (Quality Analyst) team to deliver live bots with minimal, or zero, errors in production. Hence, for QAs, shipping these chatbots with minimal errors while maintaining quality is a huge responsibility.
Now, you must be wondering what exactly chatbot testing means. To give you some background first, a chatbot is a machine built to help its users over text or voice. Since the medium of communication is very open-ended, users can say anything they like. At this point, the QA needs to figure out whether the bot should have been able to reply to the user’s query and, if so, whether the reply was the right one. A few things that go into this are:
- Testing end-to-end user flows for user queries without failures
- Validating the business logic the bot executed
- Checking general responses (greetings, positive feedback, negative feedback, etc.)
- Testing the functionality of individual tasks (tasks are simply the use cases supported by the chatbot)
- Checking the UI of chat elements, images, grammar, personality, etc.
Bots can be viewed or understood as a simple flow chart: the user sends a message, the message is processed on the basis of various conditions, and the best possible response is then sent back to the user.
To understand what a simple bot flow looks like, let’s consider the above example, where the user says “Set wake up reminder”. The following happens on receiving the message:
1. The sentence goes through our Machine Learning pipeline (the NLU engine, consisting of our backend ML microservices, which are responsible for detecting the intent and entities, correcting spelling and grammar, and understanding what the user wants).
2. Once we know that, the pipeline traverses through “set reminder” all the way to “wake up reminder”.
3. Having selected the most relevant response, the bot replies: “Help me with date and time” (the response is taken care of by our ML Response Engine).
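The three steps above can be sketched in miniature. This is purely an illustrative sketch, not Haptik’s actual pipeline: the intent tree and naive keyword matching stand in for the real ML/NLU engine, and the response copy is a placeholder.

```python
# Illustrative sketch of the flow above -- NOT the real ML/NLU engine.
# Intents are organised as a small tree; detection here is naive keyword
# matching standing in for intent/entity detection.

INTENT_TREE = {
    "set reminder": {
        "wake up reminder": "Help me with date and time",
        "exercise reminder": "Help me with date and time",
        "drink water reminder": "Help me with date and time",
    }
}

def detect_intent(message: str):
    """Walk the intent tree and return (parent, child, response)."""
    text = message.lower()
    for parent, children in INTENT_TREE.items():
        for child, response in children.items():
            # e.g. "wake up" matches "set wake up reminder"
            if child.replace(" reminder", "") in text and "reminder" in text:
                return parent, child, response
    return None, None, "Sorry, I didn't understand that."

print(detect_intent("Set wake up reminder"))
# ('set reminder', 'wake up reminder', 'Help me with date and time')
```

In the real pipeline the matching is done by ML models rather than string lookups, but the traversal from parent intent to leaf intent to response follows the same shape.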
Challenges While Testing a Bot
1. Wide Scope of testing
- Unlike other types of testing, chatbot testing depends on multiple factors: the domain for which the chatbot is built, the target users, the target age group, and the type of conversation the chatbot is supposed to have with the user (casual or professional). Keeping all these factors in mind, there is a large number of permutations and combinations to think of. Building test cases for each of these scenarios is again a tricky and tedious task.
- While testing the chatbot, the tester also has to focus on the replies the chatbot gives to various general queries (users typing anything in free form).
- The response given by the bot when it does not understand the user’s query also needs to be checked.
2. Time consumption and increased manpower
Let’s understand this with the help of the previous example. Suppose a bot builder made the following change:
separating the date and time intents, as shown below.
While testing a bot for a change made in a flow, the main objective to keep in mind is not to break the existing flows. With even the slightest change to a bot, as in the above case, the tester has to test all three reminder types (exercise reminder, wake up reminder and drink water reminder). This leads to testing the whole flow again, which consumes a lot of time and manpower.
3. API failures
Chatbot testing also involves testing third-party API integrations. This includes the chatbot’s responses for each API integration outcome, i.e. success, failure and timeout scenarios.
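To illustrate, here is a hedged sketch of how those three integration outcomes might be covered in a test. The function and exception names and the reply copy are all hypothetical, and the stubs stand in for a real third-party API:

```python
# Hypothetical sketch: mapping the three integration outcomes (success,
# failure, timeout) to bot responses, and covering each in a test.

class ApiTimeout(Exception):
    """Raised by the integration layer when the third-party API times out."""

def bot_reply(call_api):
    """Invoke a third-party API and map the outcome to a bot response."""
    try:
        status, payload = call_api()
    except ApiTimeout:
        return "The service is taking too long. Please try again later."
    if status == 200:
        return f"Here you go: {payload}"
    return "Something went wrong on our side. Please try again."

def test_integration_scenarios():
    # Each scenario is simulated with a stub standing in for the real API.
    assert "Here you go" in bot_reply(lambda: (200, "order #123"))   # success
    assert "went wrong" in bot_reply(lambda: (500, None))            # failure
    def timed_out():
        raise ApiTimeout()
    assert "taking too long" in bot_reply(timed_out)                 # timeout

test_integration_scenarios()
```

Stubbing the API like this lets the failure and timeout paths be exercised deterministically, without waiting on the real third-party service.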
4. Report generation
Tracking chatbot performance is one of the major challenges nowadays. There is no tool in the market that can track changes made to a bot, give summary and detailed reports about the test cases before and after the change, and also keep this data available at any time.
Introduction to Bot Testing Tool
To overcome these day-to-day challenges and deploy a high-quality bot in the shortest possible time, we knew that only automation could come to our rescue. And that was the birth of our Bot testing tool. The tool can be run on demand, and it also runs automatically every day through a cron script in our UAT environment, to make sure the bots are running as per requirements. So, let’s take a look at how exactly the Bot testing tool works and how it has made our lives easier.
There are 4 important steps to get the bot testing tool up and running. The following is an overview of what exactly the tool does:
- Perform the chatbot flow manually once on your app or website bot.
This includes sending messages to the bot and checking whether it responds correctly to all the chats and free-form queries. Remember that every bot is built to automate specific business-critical processes.
- The user input sheet gets created with all the user queries/messages sent while manually testing the bot. The input sheet is just a simple CSV with various columns related to the chatbot flow.
- The above input sheet is then passed as an input to the next step to get the messages from the system against the user queries, which in turn generates the System Messages sheet.
- Once the above System sheet is created, it is simply run every day to check if the correct responses are sent to the user and if the response generated is the most relevant one.
If the above is true, the test cases are marked as passed; otherwise, the bot QA tool raises an exception and sends an email with the failure scenarios.
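The comparison step above can be sketched as follows. This assumes a minimal two-column CSV system sheet (`user_message`, `expected_response`); the real sheet has more columns, and `send_to_bot` is a hypothetical stub standing in for the live bot API:

```python
# Minimal sketch of the sheet-replay step -- column names and the
# send_to_bot stub are illustrative assumptions, not the real tool.
import csv
import io

SYSTEM_SHEET = """user_message,expected_response
Set wake up reminder,Help me with date and time
Hi,Hello! How can I help you today?
"""

def send_to_bot(message: str) -> str:
    # Stand-in for the live bot; in practice this would call the bot API.
    canned = {
        "Set wake up reminder": "Help me with date and time",
        "Hi": "Hello! How can I help you today?",
    }
    return canned.get(message, "Sorry, I didn't understand that.")

def run_sheet(sheet_csv: str):
    """Replay every row and collect mismatches for the failure email."""
    failures = []
    for row in csv.DictReader(io.StringIO(sheet_csv)):
        actual = send_to_bot(row["user_message"])
        if actual != row["expected_response"]:
            failures.append((row["user_message"], row["expected_response"], actual))
    return failures

print(run_sheet(SYSTEM_SHEET))  # [] -> all test cases passed
```

An empty failure list means all test cases passed; a non-empty one would feed the exception and alert email described above.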
“The beauty of this tool is that it is independent of the copy of the messages.”
This tool can be executed through Jenkins at the click of a single button. We just have to take the system message sheet URL for a particular bot from the S3 folder, add it to the field called file URL, and provide the user_id (unique to a user) of the account on which we want to run the bot testing tool.
Thereafter, we just have to click the build button and the test cases start executing. The messages sent by the system can be seen on the devices of the user whose user_id is mentioned, so the user can tell whether the chatbot is working as intended. The Bot QA tool actually simulates a real user chatting on the platform.
This tool also runs automatically every night at 1 am with the help of a cron script. For each chatbot, the system message sheet gets executed, and if there are any failures, an email is triggered to the stakeholders of that particular chatbot. The email received is as shown below:
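As a hypothetical example, the nightly schedule could be expressed as a crontab entry like this (the script path, name and flag are illustrative, not Haptik’s actual setup):

```shell
# Hypothetical crontab entry: run the bot testing tool every night at 1 am
# and append its output to a log file.
0 1 * * * /usr/bin/python3 /opt/bot_qa/run_bot_tests.py --all-bots >> /var/log/bot_tests.log 2>&1
```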
All these reports are available on S3 and can be accessed whenever needed to check the previous state. This helps in debugging issues very easily.
“The Bot testing tool runs every night starting at 1 am and gives results for 25 bots in approximately 2 hours.”
Without this tool, testing and deploying 25 bots would take more than 3–4 days.
Learnings and Future Scope
- Bot builders / developers changing a flow breaks the tool, so communication is key whenever things change.
- Leverage this bot testing tool for voice bot testing.
- Build functionality around this tool to support multilingual chatbot testing.
- Optimize the tool’s run time from several hours to minutes.
Impact
- The time taken to test any chatbot has reduced from 1–2 hours to 7–8 minutes.
- The Bot testing tool is capable of testing multiple chatbots all at once.
- Testing a chatbot manually is a one-time effort.
- A single bot can run for multiple clients, which also allows testing the bot across clients.
We would love to hear from you, so do let us know your feedback on our Bot testing tool. Our team is working vigorously to improve it and build more features, and we will be back with updates soon. Haptik is hiring! Do check out our careers page.
Originally published at haptik.ai on March 15, 2019.
Automating Bot Testing at Haptik | Haptik Tech Blog was also published in Chatbots Magazine on Medium.