As a company, I am interested in giving my smart speaker or voice assistant the ability to talk about my business. My Amazon Echo or my Google Home already comes with some default features, which are basically what I can read on the box: “Alexa, what is the weather like today in Barcelona?”, “Alexa, tell me a joke!”, “Ok Google, who is Barack Obama?”… This default knowledge belongs to the device, but what happens if I expect my smart speaker to provide support to my customers? Or even to let them make purchases? Let’s say I am an airline and I want my customers to check the estimated time of departure before their flight; or I am an insurance company and I want Alexa to give health advice to my policyholders. Where does this knowledge come from, and how can my smart speaker use it?
How can I create my own application, a.k.a. skill or action?
For the smart speaker to handle knowledge beyond its defaults, I have to create my own application. The end-user can then access it through an invocation phrase, for instance “Order a Pizza at Pizzi-Pizzo-Pizza”. Depending on the device, this kind of application is called a skill on Amazon or an Action on Google. In short, we are talking about building a piece of code that uses a data set (the knowledge base) and a user interface (the device itself). This knowledge can be fed directly into the Amazon console or the Google console by creating a unit of knowledge for each answer I want the smart speaker to give and, for every one of those answers, as many utterances as possible. The aim of this training process is to teach the mapping between all the questions that might be asked and each unit of knowledge. Needless to say, this is a highly time-consuming and inefficient task. However, if I am already working with a Natural Language Processing (NLP) solution, I can skip the training process by linking my smart speaker application directly to my NLP technology through its APIs. This is where Inbenta enters the picture.
Let’s start again from the creation of my smart speaker application. I still have to build a piece of code, but instead of feeding my knowledge into Amazon or Google, my application sends any user input (i.e. any question asked to Amazon Echo or Google Home) to Inbenta through the chatbot API. Inbenta understands and processes this input (NLU & NLP capabilities) and finds semantic matches in the knowledge base. In this scenario, my knowledge base is therefore located in Inbenta’s platform instead of the Amazon console or the Google console. Once the matching process is complete, Inbenta returns the correct answer to my application, which allows the smart speaker to communicate it to the customer. To sum up each role: the smart speaker is in charge of speech recognition and text-to-speech, while Inbenta is responsible for NLP matching. The end-user benefits from an immediate and relevant answer. Last but not least, as a company, I save the time otherwise spent training the machine and can invest it in building a proper chatbot knowledge base that serves smart speakers as well as web, app, messaging channels, etc.
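To make this flow concrete, here is a minimal Python sketch of what such an application could look like on the Alexa side. It is an illustration under assumptions, not Inbenta's or Amazon's reference implementation: the request and response dictionaries follow the general shape of Alexa custom skill JSON, the slot name `query` is hypothetical, and `ask_nlp` stands in for whatever call my code makes to the chatbot API (its real endpoint, authentication and payload are not shown here).

```python
from typing import Any, Callable, Dict

# Hypothetical client type: takes the transcribed question, returns answer text.
# In a real skill this would wrap the HTTP call to the NLP provider's chatbot API.
NlpClient = Callable[[str], str]

FALLBACK = "Sorry, I did not catch that. Could you rephrase your question?"


def extract_user_input(event: Dict[str, Any], slot_name: str = "query") -> str:
    """Pull the user's transcribed question out of an intent request.

    Assumes the skill defines one free-text slot named `slot_name`.
    """
    request = event.get("request", {})
    if request.get("type") != "IntentRequest":
        return ""
    slots = request.get("intent", {}).get("slots") or {}
    return (slots.get(slot_name, {}).get("value") or "").strip()


def build_speech_response(answer_text: str) -> Dict[str, Any]:
    """Wrap plain text in the response shape the smart speaker reads aloud."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": answer_text},
            "shouldEndSession": False,
        },
    }


def handle_request(event: Dict[str, Any], ask_nlp: NlpClient) -> Dict[str, Any]:
    """Forward the question to the NLP platform and return its answer as speech."""
    question = extract_user_input(event)
    if not question:
        return build_speech_response(FALLBACK)
    answer = ask_nlp(question)
    return build_speech_response(answer or FALLBACK)
```

The division of roles is visible in the code: the device has already turned speech into text before `handle_request` runs, the NLP platform does the matching inside `ask_nlp`, and the device turns the returned text back into speech. No utterances or units of knowledge live in the skill itself.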
What do I have to bear in mind when creating a smart speaker application?
An important rule of digital projects is that whoever provides the user interface controls it. In this case, it means that if I want to deploy a skill on Alexa, I have to ask Amazon to validate it, and Amazon has the last word on whether my skill fulfills its requirements. We mentioned earlier the invocation phrase that allows the user to enter my application. These “magic words” have to be approved by the interface provider, which also makes sure that no invocation phrase is shared by several applications. Another example of Amazon’s policy regarding skills is that my application cannot pretend to be someone other than Alexa. It means that the knowledge unit “Hi, I am Jessica, your Pizzi-Pizzo-Pizza assistant, how can I help you?” would be rejected by Amazon, because Alexa has to be Alexa and cannot be Jessica. Once my application has been submitted and complies with their rules, Amazon or Google can approve it and make it publicly available to any end-user of the smart speaker.
The second key factor concerns the answers my application is going to give. As the guarantor of the knowledge base, and considering that this knowledge may be shared with another chatbot, I also have to keep in mind the specifics of a voice channel: among them, no rich media can be displayed, no hyperlink can be clicked, and my end-users are not willing to listen to very long answers to a simple question. However, if my NLP solution supports complex dialogs and decision trees, I can easily create step-by-step journeys through my knowledge base and offer my users a highly human-like conversation, with the added value of interacting through voice instead of text.
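When the same knowledge base also feeds a web chatbot, one way to respect these voice constraints is to adapt each answer just before it is spoken. The sketch below is only an illustration: the function name `to_voice_answer` and the 300-character limit are my own assumptions, not requirements from Amazon, Google or Inbenta, and the regular expressions are a deliberately rough way to strip markup rather than a full HTML parser.

```python
import re
from html import unescape

MAX_SPOKEN_CHARS = 300  # assumed limit; tune it to what users will listen to


def to_voice_answer(answer_html: str, max_chars: int = MAX_SPOKEN_CHARS) -> str:
    """Turn a web-oriented answer into short plain text suitable for speech."""
    # Rich media cannot be displayed on a voice channel: drop it entirely.
    text = re.sub(r"<(img|video|iframe)\b[^>]*>", " ", answer_html, flags=re.I)
    # Remove remaining tags but keep their inner text (e.g. the label of a link).
    text = re.sub(r"<[^>]+>", " ", text)
    text = unescape(text)
    # Bare URLs cannot be clicked and sound awful when read aloud.
    text = re.sub(r"https?://\S+", "", text)
    # Normalise whitespace, including stray spaces left before punctuation.
    text = re.sub(r"\s+", " ", text).strip()
    text = re.sub(r"\s+([.,!?;:])", r"\1", text)
    if len(text) <= max_chars:
        return text
    # Too long: keep only the complete sentences that fit within the limit.
    cut = text[:max_chars]
    end = max(cut.rfind(". "), cut.rfind("! "), cut.rfind("? "))
    if end != -1:
        return cut[: end + 1]
    # No sentence boundary found: fall back to cutting at the last word.
    return cut.rsplit(" ", 1)[0]
```

For example, `to_voice_answer("First sentence. Second sentence is longer.", max_chars=20)` returns `"First sentence."`. Truncating is a blunt tool, though: for answers that are long by nature, splitting them into a step-by-step dialog is usually the better option.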
At the end of the day, a smart speaker application available on Amazon Echo or Google Home is nothing but a chatbot, with the only difference that it uses a voice channel instead of the usual text interfaces. As for any chatbot, the key elements to pay attention to are: the knowledge base (the answers or the processes I make available to my customers), the use case (what I want to say, to whom, and why), and the user interface restrictions.