
Chatbot Tutorial

24 Apr 2019 | CPOL | 27 min read
Tutorial on making an artificial intelligence chatbot

Overview

This is a step-by-step guide to implementing your own artificial intelligence chatbot.

Table of Contents

  1. Introduction - Chatbot Description (First Example)
  2. Introducing Keywords and Stimulus Response
  3. Preprocessing the User's Input and Repetition Control
  4. A More Flexible Way for Matching the Inputs
  5. Using Classes for a Better Implementation
  6. Controlling Repetition Made by the User
  7. Using "States" to Represent Different Events
  8. Keyword Boundaries Concept
  9. Using Signon Messages
  10. "Keyword Ranking" Concept
  11. Keyword Equivalence Concept
  12. Transposition and Template Response
  13. Keyword Location Concept
  14. Handling Context
  15. Using Text to Speech
  16. Using a Flat File to Store the Database
  17. A Better Repetition Handling Algorithm
  18. Updating the Database With New Keywords
  19. Saving the Conversation Logs
  20. Learning Capability

Introduction

A chatterbot is a computer program that, when given some input in a natural language (English, French, ...), responds with something meaningful in that same language. The strength of a chatterbot can therefore be measured directly by the quality of the output it selects in response to the user. From this description, we can deduce that a very basic chatterbot can be written in a few lines of code in a given programming language. Let's write our first chatterbot. (All the code in this tutorial is written in C++, and it is assumed that the reader is familiar with the STL.) This tutorial is also available in the following languages: Java, Visual Basic, C#, Pascal, Prolog and Lisp.

C++
//
// Program Name: chatterbot1
// Description: This is a very basic example of a chatterbot program
//
// Author: Gonzales Cenelia
//

#include <iostream>
#include <string>
#include <cstdlib>  // srand, rand
#include <ctime>    // time

int main()
{
    std::string Response[] = {
        "I HEARD YOU!",
        "SO, YOU ARE TALKING TO ME.",
        "CONTINUE, I'M LISTENING.",
        "VERY INTERESTING CONVERSATION.",
        "TELL ME MORE..."
    };

    srand((unsigned) time(NULL));

    std::string sInput = "";
    std::string sResponse = "";

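    while(1) {
        std::cout << ">";
        // leave the loop when the input stream is closed (end of file)
        if(!std::getline(std::cin, sInput)) {
            break;
        }
        // pick one of the 5 canned responses at random
        int nSelection = rand() % 5;
        sResponse = Response[nSelection];
        std::cout << sResponse << std::endl;
    }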

    return 0;
}

As you can see, it doesn't take a lot of code to write a very basic program that can interact with a user. It would be much more difficult, however, to write a program that is truly capable of interpreting what the user is saying and then generating an appropriate response. This has been a long-term goal since the very first computers were created, and even before. In 1950, the British mathematician Alan Turing posed the question "Can machines think?" and proposed a test that is now known as the Turing Test. In this test, a computer program and a real person each talk to a third person (the judge), who has to decide which of the two is the real person. There is also a competition named the Loebner Prize, in which a bot that successfully fools most of the judges for at least 5 minutes would win a prize of $100,000. So far, no computer program has been able to pass this test. One of the major reasons is that programs written to compete in such contests tend to make a lot of mistakes (their replies are often out of the context of the conversation). This means that, generally, it isn't difficult for a judge to decide whether he or she is speaking to a computer program or to a real person. The direct ancestor of all the programs that try to mimic a conversation between human beings is Eliza. The first version of this program was written in 1966 by Joseph Weizenbaum, a professor at MIT.

Chatbots in general are considered to belong to the weak AI field (weak artificial intelligence), as opposed to strong AI, whose goal is to create programs that are as intelligent as humans or more intelligent. This doesn't mean that chatbots have no real potential. Being able to create a program that communicates the same way humans do would be a great advancement for the AI field. Chatbots are also the part of artificial intelligence that is most accessible to hobbyists (it only takes average programming skills to become a chatbot programmer). So, for programmers out there who want to create true AI or some kind of artificial intelligence, writing intelligent chatbots is a great place to start!

Now, Let's Get Back to Our Previous Program, What Are the Problems With It?

Well, there are a lot of them. First of all, we can clearly see that the program isn't really trying to understand what the user is saying; instead, it just selects a random response from its database each time the user types a sentence on the keyboard. We can also notice that the program repeats itself very often. One reason for this is the size of the database, which is very small (5 sentences). The second reason is that we haven't implemented any mechanism to control this unwanted behavior.

How do we move from a program that just selects a random response to whatever the user enters on the keyboard to a program that shows some understanding of the input?

The answer to that question is quite simple; we simply need to use keywords.

A keyword is just a sentence (not necessarily a complete one) or even a single word that the program can recognize in the user's input, which makes it possible for the program to react to it (for example, by printing a sentence on the screen). For the next program, we will write a knowledge base, or database, composed of keywords and the responses associated with each keyword.

So now we know what to do to improve "our first chatterbot" and make it more intelligent. Let's proceed to write "our second bot"; we will call it chatterbot2.

C++
//
// Program Name: chatterbot2
// Description: this is an improved version
// of the previous chatterbot program "chatterbot1"
// this one will try a little bit more to understand what the user is trying to say
//
// Author: Gonzales Cenelia
//

#pragma warning(disable: 4786) // old Visual C++ only: long STL symbol names

#include <iostream>
#include <string>
#include <vector>
#include <cstdlib>  // srand, rand
#include <ctime>    // time

const int MAX_RESP = 3;

typedef std::vector<std::string> vstring;

vstring find_match(const std::string &input);
void copy(const char *const array[], vstring &v);

// one entry of the knowledge base: a keyword and its possible responses
// (string literals are constant, so the pointers must be const char *)
struct record {
    const char *input;
    const char *responses[MAX_RESP];
};

record KnowledgeBase[] = {
    {"WHAT IS YOUR NAME", 
    {"MY NAME IS CHATTERBOT2.",
     "YOU CAN CALL ME CHATTERBOT2.",
     "WHY DO YOU WANT TO KNOW MY NAME?"}
    },

    {"HI", 
    {"HI THERE!",
     "HOW ARE YOU?",
     "HI!"}
    },
    
    {"HOW ARE YOU",
    {"I'M DOING FINE!",
     "I'M DOING WELL AND YOU?",
     "WHY DO YOU WANT TO KNOW HOW I AM DOING?"}
    },

    {"WHO ARE YOU",
    {"I'M AN A.I. PROGRAM.",
     "I THINK THAT YOU KNOW WHO I AM.",
     "WHY ARE YOU ASKING?"}
    },

    {"ARE YOU INTELLIGENT",
    {"YES, OF COURSE.",
     "WHAT DO YOU THINK?",
     "ACTUALLY, I'M VERY INTELLIGENT!"}
    },

    {"ARE YOU REAL",
    {"DOES THAT QUESTION REALLY MATTER TO YOU?",
     "WHAT DO YOU MEAN BY THAT?",
     "I'M AS REAL AS I CAN BE."}
    }
};

const size_t nKnowledgeBaseSize = sizeof(KnowledgeBase)/sizeof(KnowledgeBase[0]);

int main() {
    srand((unsigned) time(NULL));

    std::string sInput = "";
    std::string sResponse = "";

    while(1) {
        std::cout << ">";
        // leave the loop when the input stream is closed (end of file)
        if(!std::getline(std::cin, sInput)) {
            break;
        }
        if(sInput == "BYE") {
            std::cout << "IT WAS NICE TALKING TO YOU USER, SEE YOU NEXT TIME!" << std::endl;
            break;
        }
        vstring responses = find_match(sInput);
        if(responses.empty()) {
            std::cout << "I'M NOT SURE IF I UNDERSTAND WHAT YOU ARE TALKING ABOUT."
                      << std::endl;
        }
        else {
            // choose one of the responses linked to the matched keyword
            size_t nSelection = rand() % responses.size();
            sResponse = responses[nSelection];
            std::cout << sResponse << std::endl;
        }
    }

    return 0;
}

// search for the user's input
// inside the database of the program
vstring find_match(const std::string &input) {
    vstring result;
    for(size_t i = 0; i < nKnowledgeBaseSize; ++i) {
        if(std::string(KnowledgeBase[i].input) == input) {
            copy(KnowledgeBase[i].responses, result);
            return result;
        }
    }
    return result;
}

// copy the responses of one record into a vector of strings
void copy(const char *const array[], vstring &v) {
    for(int i = 0; i < MAX_RESP; ++i) {
        v.push_back(array[i]);
    }
}

Now the program can understand some sentences like "what is your name", "are you intelligent", etc. (typed in upper case, exactly as they are stored in the database). It can also choose an appropriate response from its list of responses for the given sentence and display it on the screen. Unlike the previous version of the program (chatterbot1), chatterbot2 selects a response that suits the user's input instead of choosing a random response that doesn't take into account what the user is actually trying to say.

We've also added a couple of new techniques to this program. For example, when the program is unable to find a matching keyword for the current user input, it simply answers that it doesn't understand, which is quite human-like.

What Can We Improve on these Previous Chatbots to Make It Even Better?

There are quite a few things that we can improve. The first is that, since the chatterbot tends to be very repetitive, we can create a mechanism to control these repetitions. We could simply store the previous response of the chatbot in a string, sPrevResponse, and check when selecting the next response that it is not equal to the previous one. If it is, we select a different response from the available ones.
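The check against sPrevResponse could look like the following. This is only a minimal sketch: the function name select_response and its signature are assumptions made for illustration, not the code of the downloadable chatterbots.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Pick a random response, but avoid returning the same response as last time.
// select_response and its parameters are illustrative names (assumptions).
std::string select_response(const std::vector<std::string> &responses,
                            const std::string &sPrevResponse) {
    if (responses.empty()) {
        return "";
    }
    const std::size_t nSelection = std::rand() % responses.size();
    if (responses[nSelection] == sPrevResponse) {
        // walk through the other responses until a different one is found
        for (std::size_t i = 1; i < responses.size(); ++i) {
            const std::size_t nNext = (nSelection + i) % responses.size();
            if (responses[nNext] != sPrevResponse) {
                return responses[nNext];
            }
        }
    }
    // either the first pick was fine or there is no alternative
    return responses[nSelection];
}
```

The caller would store the returned string in sPrevResponse after printing it, so that the next call can compare against it.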

The other thing that we could improve is the way the chatbot handles the user's input. Currently, if you enter an input in lower case, the chatbot will not understand it, even if there is a match for that input inside the bot's database. Also, if the input contains extra spaces or punctuation characters (!;,.), this too prevents the chatbot from understanding the input. That's why we will introduce a new mechanism to preprocess the user's input before it is searched for in the chatbot's database. We could have one function that converts the user's input to upper case, since the keywords inside the database are in upper case, and another procedure that removes all the punctuation and extra spaces found in the input. That said, we now have enough material to write our next chatterbot: "Chatterbot3".
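As a rough illustration of this preprocessing step, the sketch below does all three clean-ups in a single pass. It is a free function written for this example (the name and signature are assumptions), not necessarily how Chatterbot3 implements it.

```cpp
#include <cctype>
#include <string>

// Convert the input to upper case, drop punctuation and collapse
// runs of whitespace into single spaces (no leading/trailing space).
// Illustrative helper; name and signature are assumptions.
std::string preprocess_input(const std::string &input) {
    std::string result;
    bool bPendingSpace = false;
    for (std::string::size_type i = 0; i < input.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (std::isspace(c)) {
            // remember the space, but only emit it before the next word
            bPendingSpace = !result.empty();
        } else if (!std::ispunct(c)) {
            if (bPendingSpace) {
                result += ' ';
                bPendingSpace = false;
            }
            result += static_cast<char>(std::toupper(c));
        }
    }
    return result;
}
```

Note that stripping every punctuation character also removes apostrophes, so an input such as "I'm" becomes "IM". If the database contains keywords with apostrophes, they would have to be treated as a special case.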

What Are the Weaknesses With the Current Version of the Program?

Clearly, there are still many limitations in this version of the program. The most obvious one is that the program uses "exact sentence matching" to find a response to the user's input. This means that if you ask it "what is your name again", the program will simply not understand what you are trying to say, because it is unable to find a match for this input. This is a little surprising, considering that the program can understand the sentence "what is your name".

How Do We Overcome This Problem?

There are at least two ways to solve this problem. The most obvious one is to use a slightly more flexible way of matching the keywords in the database against the user's input: all we have to do is allow keywords to be found anywhere within the input, which removes the previous limitation.
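A minimal sketch of this more flexible matching is shown below. The Entry structure and the name find_keyword_match are assumptions made for the example; the input is expected to be preprocessed already (upper case, no punctuation).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical database entry: one keyword and its responses.
struct Entry {
    std::string keyword;
    std::vector<std::string> responses;
};

// Return the responses of the first entry whose keyword appears
// anywhere inside the input (substring search instead of equality).
std::vector<std::string> find_keyword_match(const std::vector<Entry> &database,
                                            const std::string &input) {
    for (std::size_t i = 0; i < database.size(); ++i) {
        if (input.find(database[i].keyword) != std::string::npos) {
            return database[i].responses;
        }
    }
    return std::vector<std::string>();  // no keyword found
}
```

With this version, "WHAT IS YOUR NAME AGAIN" matches the keyword "WHAT IS YOUR NAME". The price is that short keywords can now match inside longer words, so some care is needed when choosing them.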

The other possibility is much more complex; it uses the concept of fuzzy string search. To apply this method, it is useful to first break the input and the current keyword into separate words. We can then create two vectors: the first stores the words of the input and the second stores the words of the current keyword. Once we have done this, we can use the Levenshtein distance to measure the distance between the two word vectors. (Notice that in order for this method to be effective, we would also need an extra keyword that represents the subject of the current keyword.)
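To make the idea more concrete, here is a small sketch that splits two sentences into words and computes the Levenshtein distance between the two word vectors (each inserted, deleted or replaced word costs 1). The function names are assumptions, and the "subject keyword" mentioned above is not covered here.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Break a sentence into its words (separated by whitespace).
std::vector<std::string> split_words(const std::string &text) {
    std::istringstream stream(text);
    std::vector<std::string> words;
    std::string word;
    while (stream >> word) {
        words.push_back(word);
    }
    return words;
}

// Levenshtein distance computed on whole words instead of characters.
// Only two rows of the usual dynamic programming table are kept.
std::size_t word_distance(const std::vector<std::string> &a,
                          const std::vector<std::string> &b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) {
        prev[j] = j;  // distance from an empty sentence
    }
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t nCost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            cur[j] = std::min(std::min(prev[j] + 1,      // delete a word
                                       cur[j - 1] + 1),  // insert a word
                              prev[j - 1] + nCost);      // replace a word
        }
        prev.swap(cur);
    }
    return prev[b.size()];
}
```

A chatbot could then accept a keyword when the distance to the input stays below some small threshold. The right threshold is a tuning decision that this sketch does not try to answer.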

So there you have it: two different methods for improving the chatterbot. We could also combine both methods and select which one to use in each situation.

Finally, there is another problem that you may have noticed with the previous chatterbot: you could repeat the same sentence over and over, and the program wouldn't react to this. We also need to correct this problem.

So, we are now ready to write our fourth chatterbot; we will simply call it chatterbot4. View the code for Chatterbot4.

As you have probably seen, the code for "chatterbot4" is very similar to the code for "chatterbot3", but there are some key changes in it. In particular, the function that searches for keywords inside the database is now a little more flexible. So, what next? Don't worry; there are still a lot of things to be covered.

What Can We Improve in Chatterbot4 to Make It Better?

Here Are Some Ideas

  • Since the code of the chatterbots has started to grow, it would be a good idea to encapsulate the implementation of the next chatterbot in a class.
  • The database is still much too small to handle a real conversation with users, so we will need to add some more entries to it.
  • It may happen that the user presses the Enter key without typing anything; we need to handle this situation as well.
  • The user might also try to trick the chatterbot by repeating the previous sentence with some slight modification; we need to count this as a repetition by the user.
  • Finally, you will soon notice that we need a way of ranking keywords: when several keywords match a given input, we need a way of choosing the best one among them.

That said, we will now start to write the implementation of chatterbot5. Download Chatterbot5.

Before proceeding to the next part of this tutorial, you are encouraged to compile and run the code for "chatterbot5" so that you can understand how it works and verify the changes that have been made to it. As you may have seen, the implementation of the current chatterbot is now encapsulated in a class, and some new functions have been added to the new version of the program.

We Will Now Try to Discuss the Implementation of "Chatterbot5"

  • select_response(): This function selects a response from a list of responses. A new helper function, shuffle, was added to the program; it shuffles a list of strings randomly after seed_random_generator() has been called.
  • save_prev_input(): This function simply saves the current user input into a variable (m_sPrevInput) before getting new input from the user.
  • void save_prev_response(): This function saves the current response of the chatterbot before the bot starts to search for responses to the current input. The current response is saved in the variable m_sPrevResponse.
  • void save_prev_event(): This function simply saves the current event (m_sEvent) into the variable m_sPrevEvent. An event occurs, for example, when the program detects a null input from the user, when the user repeats himself or herself, or when the chatterbot repeats itself.
  • void set_event(std::string str): Sets the current event (m_sEvent).
  • void save_input(): Makes a backup of the current input (m_sInput) into the variable m_sInputBackup.
  • void set_input(std::string str): Sets the current input (m_sInput).
  • void restore_input(): Restores the value of the current input (m_sInput) that was previously saved into the variable m_sInputBackup.
  • void print_response(): Prints the response that has been selected by the chat robot on the screen.
  • void preprocess_input(): This function does some preprocessing on the input, such as removing punctuation and redundant space characters and converting the input to upper case.
  • bool bot_repeat(): Verifies if the chatterbot has started to repeat itself.
  • bool user_repeat(): Verifies if the user has repeated himself/herself.
  • bool bot_understand(): Verifies that the bot understands the current user input (m_sInput).
  • bool null_input(): Verifies if the current user input (m_sInput) is null.
  • bool null_input_repetition(): Verifies if the user has repeated some null inputs.
  • bool user_want_to_quit(): Checks to see if the user wants to quit the current session with the chatterbot.
  • bool same_event(): Verifies if the current event (m_sEvent) is the same as the previous one (m_sPrevEvent).
  • bool no_response(): Checks to see if the program has no response for the current input.
  • bool same_input(): Verifies if the current input (m_sInput) is the same as the previous one (m_sPrevInput).
  • bool similar_input(): Checks to see if the current and previous inputs are similar. Two inputs are considered similar if one of them is a substring of the other one (e.g.: how are you and how are you doing would be considered similar because how are you is a substring of how are you doing).
  • void get_input(): Gets inputs from the user.
  • void respond(): Handles all responses of the chat robot whether it is for events or simply the current user input. So, basically, this function controls the behaviour of the program.
  • find_match(): Finds responses for the current input.
  • void handle_repetition(): Handles repetitions made by the program.
  • handle_user_repetition(): Handles repetitions made by the user.
  • void handle_event(std::string str): This function handles events in general.
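To illustrate one of the checks in the list above, here is how the substring rule described for similar_input() could be written. In the tutorial it is a member function of CBot working on m_sInput and m_sPrevInput; the free function below, with both inputs passed as parameters, is an assumption made to keep the sketch self-contained.

```cpp
#include <string>

// Two inputs are "similar" when one of them is a substring of the other.
// Empty inputs are never considered similar here, since an empty string
// is trivially a substring of everything (an assumption of this sketch).
bool similar_input(const std::string &sInput, const std::string &sPrevInput) {
    if (sInput.empty() || sPrevInput.empty()) {
        return false;
    }
    return sInput.find(sPrevInput) != std::string::npos ||
           sPrevInput.find(sInput) != std::string::npos;
}
```

Identical inputs also satisfy this test, so a caller that wants to distinguish the two cases would check same_input() first.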

You can clearly see that "chatterbot5" has much more functionality than "chatterbot4", and each piece of functionality is encapsulated in a method (member function) of the class CBot. Still, there are many more improvements to be made.

Chatterbot5 introduces the concept of "state". In this new version of the chatterbot, we associate a different "state" with some of the events that can occur during a conversation. For example, when the user enters a null input, the chatterbot sets itself to the "NULL INPUT**" state; when the user repeats the same sentence, it goes into the "REPETITION T1**" state, etc.

Also, this new chatterbot uses a bigger database than the previous chatbots that we have seen so far (chatterbot1, chatterbot2, chatterbot3, ...). Still, this is quite insignificant, given that most chatterbots in use today (the very popular ones) have a database of 10,000 lines or more. So this would definitely be one of the major goals to achieve in the next versions of the chatterbot.

For now, however, we will concentrate on a small problem concerning the current chatterbot.

What Exactly Would Be This Problem?

Well, it's all about keyword boundaries. Suppose the user enters the sentence "I think not" during a conversation with the chatbot. Naturally, the program would look in its database for a keyword that matches the sentence, and it might find the keyword "HI", which is also a substring of the word "THINK" once the input has been converted to upper case. Clearly, this is unwanted behaviour.

How Do We Avoid It?

Simply by putting a space character before and after each keyword in the database, or by applying the same change during the matching process inside the find_match() function. The input needs the same padding, so that a keyword at the very beginning or end of the input can still be matched.
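The padding trick can be sketched as follows. The helper name contains_keyword is an assumption; it expects an input that has already been preprocessed (upper case, no punctuation, single spaces between words).

```cpp
#include <string>

// Match a keyword only on word boundaries by surrounding both the
// input and the keyword with spaces before searching.
bool contains_keyword(const std::string &input, const std::string &keyword) {
    const std::string sPaddedInput = " " + input + " ";
    const std::string sPaddedKeyword = " " + keyword + " ";
    return sPaddedInput.find(sPaddedKeyword) != std::string::npos;
}
```

With this check, "HI" is still found in "HI THERE" but no longer inside "I THINK NOT".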

Are There Other Things That We Can Improve in "Chatterbot5"?

Certainly there are. So far, the chatbot starts a "chatting session" with the user without saying anything at the beginning of the conversation. It would be good if the chatterbot could say something to start the conversation. This can easily be achieved by introducing "sign on messages" into the program. We can do this by creating a new state inside the chatbot's "knowledge base" and adding some appropriate messages linked to it. That new state could be called "SIGNON**".

Introducing the Concept of "Keyword Ranking"

As you can see, with each new version of the chatterbot, we are progressively adding new features in order to make the chatbot more realistic. In this section, we are going to introduce the concept of "keyword ranking" into the chatterbot. Keyword ranking is a way for the program to select the best keyword in its database when more than one keyword matches the user's input. For example, if the current user input is "What is your name again", then by looking in its database the chatbot would find two keywords that match this input: 'WHAT' and 'WHAT IS YOUR NAME'. Which one is the best? The answer is quite simple: it is 'WHAT IS YOUR NAME', because it is the longest keyword. This new feature has been implemented in the new version of the program: Chatterbot7.
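The "longest keyword wins" rule could be sketched like this. The function name best_keyword and the plain list of keywords are assumptions for the example; the actual Chatterbot7 code may rank keywords differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Among all keywords found inside the input, return the longest one.
// Returns an empty string when no keyword matches.
std::string best_keyword(const std::vector<std::string> &keywords,
                         const std::string &input) {
    std::string sBest;
    for (std::size_t i = 0; i < keywords.size(); ++i) {
        const std::string &sKeyword = keywords[i];
        if (input.find(sKeyword) != std::string::npos &&
            sKeyword.size() > sBest.size()) {
            sBest = sKeyword;  // longer match: better candidate
        }
    }
    return sBest;
}
```

For the input "WHAT IS YOUR NAME AGAIN" and the keywords 'WHAT' and 'WHAT IS YOUR NAME', the function returns 'WHAT IS YOUR NAME'.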

Equivalent Keywords

In all the previous chatterbots, the database record allowed only one keyword for each set of responses, but sometimes it is useful to have more than one keyword associated with a set of responses, especially when these keywords have the same meaning. For example, "What is your name" and "Can you please tell me your name" both have the same meaning, so there is no need to use different records for these keywords. Instead, we can modify the record structure so that it allows more than one keyword per record.
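One possible shape for such a record is sketched below, using vectors instead of fixed-size arrays. The names Record and find_record are assumptions, and exact matching is used only to keep the example short.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A record with several equivalent keywords sharing one set of responses.
struct Record {
    std::vector<std::string> keywords;
    std::vector<std::string> responses;
};

// Return the first record having a keyword equal to the input,
// or NULL when no record matches.
const Record *find_record(const std::vector<Record> &database,
                          const std::string &input) {
    for (std::size_t i = 0; i < database.size(); ++i) {
        const std::vector<std::string> &keywords = database[i].keywords;
        for (std::size_t k = 0; k < keywords.size(); ++k) {
            if (keywords[k] == input) {
                return &database[i];
            }
        }
    }
    return NULL;
}
```

Both "WHAT IS YOUR NAME" and "CAN YOU PLEASE TELL ME YOUR NAME" can then live in the same record and lead to the same responses.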

Keyword Transposition and Template Response

One of the well-known mechanisms of chatterbots is the capacity to reformulate the user's input by doing some basic verb conjugation. For example, if the user enters: YOU ARE A MACHINE, the chatterbot might respond: So, you think that I'm a machine.

How did we arrive at this transformation? We can do it in two steps:

  • We make sure that the chatterbot has a list of response templates linked to the corresponding keywords. Response templates are a sort of skeleton for building new responses. Usually, we use a wildcard in a response to indicate that it is a template. In the previous example, we used the template (so, you think that*) to construct our response. During the reassembly process, we simply replace the wildcard with some part of the original input. In that same example, we used: You are a machine, which is actually the complete original input from the user. After replacing the wildcard with the user's input, we have the following sentence: So, you think that you are a machine. We cannot use this sentence as it is; we first need to reverse some of the pronouns in it.

  • The transpositions that we use most often swap first-person and second-person pronouns, e.g.: you -> me, I'm -> you are, etc. In the previous example, we replace "YOU ARE" with "I'M" in the user's input. After applying this change, the original sentence becomes: I'm a machine. Now we can replace the wildcard in the template with this new sentence, which gives us the final response of the chatbot: So, you think that I'm a machine.

Notice that it's not a good idea to use transposition too much during a conversation: the mechanism would become too obvious, and it could create some repetition.
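The two steps can be sketched as follows. The transposition table is deliberately tiny: the first three pairs come from the examples above, while the MY/YOUR pairs, the function names and the single-pass replacement strategy are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <string>

struct Transposition {
    const char *from;
    const char *to;
};

// Longer phrases come first so that "YOU ARE" is tried before "YOU".
static const Transposition TranspositionTable[] = {
    {"YOU ARE", "I'M"},
    {"I'M", "YOU ARE"},
    {"YOU", "ME"},
    {"MY", "YOUR"},  // assumption, not taken from the tutorial
    {"YOUR", "MY"}   // assumption, not taken from the tutorial
};

// Replace whole words or phrases in one left-to-right pass, so that
// a replacement is never transposed a second time.
std::string transpose(const std::string &input) {
    const std::string text = " " + input + " ";
    const std::size_t nCount =
        sizeof(TranspositionTable) / sizeof(TranspositionTable[0]);
    std::string result;
    std::size_t pos = 0;
    while (pos < text.size()) {
        bool bReplaced = false;
        if (text[pos] == ' ') {
            for (std::size_t k = 0; k < nCount && !bReplaced; ++k) {
                const std::string from = TranspositionTable[k].from;
                const std::size_t end = pos + 1 + from.size();
                // the phrase must start after this space and end at a space
                if (end < text.size() && text[end] == ' ' &&
                    text.compare(pos + 1, from.size(), from) == 0) {
                    result += ' ';
                    result += TranspositionTable[k].to;
                    pos = end;
                    bReplaced = true;
                }
            }
        }
        if (!bReplaced) {
            result += text[pos];
            ++pos;
        }
    }
    // remove the padding spaces again
    const std::size_t first = result.find_first_not_of(' ');
    if (first == std::string::npos) {
        return "";
    }
    const std::size_t last = result.find_last_not_of(' ');
    return result.substr(first, last - first + 1);
}

// Replace the '*' wildcard of a template by the transposed input.
std::string build_response(const std::string &sTemplate,
                           const std::string &input) {
    const std::size_t pos = sTemplate.find('*');
    if (pos == std::string::npos) {
        return sTemplate;  // not a template
    }
    return sTemplate.substr(0, pos) + " " + transpose(input) +
           sTemplate.substr(pos + 1);
}
```

Calling build_response("SO, YOU THINK THAT*", "YOU ARE A MACHINE") yields "SO, YOU THINK THAT I'M A MACHINE". A simple table like this one cannot tell a subject "YOU" from an object "YOU", which is one reason a separate correction step after transposition can be useful.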

Keyword Location Concept

Some keywords can be located anywhere in a given input; others can only be found in specific places in the user's input, otherwise they wouldn't make any sense. A keyword like "Who are you" can be found anywhere in the user's input without creating any problem with its meaning.

Some examples of sentences using "WHO ARE YOU" would be:

  1. Who are you?
  2. By the way, who are you?
  3. So tell me, who are you exactly?

But a keyword such as "who is" can only be found at the beginning or in the middle of a given sentence; it cannot be found at the end of the sentence or alone.

Examples of sentences using the keyword: "who is":

  1. Who is your favorite singer?
  2. Do you know who is the greatest mathematician of all time?
  3. Tell me, do you know who is? (this clearly doesn't make any sense)

How do we make sure that the chatterbot is able to distinguish such keywords and the specific places where they are allowed to be found in a sentence? We will simply introduce some new notations for keywords:

  1. Keywords that can only be found at the beginning or in the middle of a sentence will be represented by: _KEYWORD (Ex: _WHO IS)
  2. Keywords that can only be found at the end or in the middle of a sentence will be denoted by: KEYWORD_ (Ex: WHAT ARE YOU_)
  3. Keywords that should only be found alone in a sentence will be represented by: _KEYWORD_ (Ex: _WHAT_)
  4. And finally, keywords that can be found anywhere in a sentence or even alone will simply be represented by: KEYWORD (Ex: I UNDERSTAND)

A keyword can have different meanings depending on its position in a given sentence.
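A matcher for this notation could be sketched as follows. The interpretation is an assumption of the example: _KEYWORD must be followed by something, KEYWORD_ must be preceded by something, and _KEYWORD_ must be the whole input. The function name is hypothetical as well, and the input is expected to be preprocessed.

```cpp
#include <string>

// Check a keyword written in the location notation against an input.
bool keyword_matches(const std::string &input, const std::string &sNotation) {
    std::string keyword = sNotation;
    bool bLeading = false;   // notation starts with '_'
    bool bTrailing = false;  // notation ends with '_'
    if (!keyword.empty() && keyword[0] == '_') {
        bLeading = true;
        keyword.erase(0, 1);
    }
    if (!keyword.empty() && keyword[keyword.size() - 1] == '_') {
        bTrailing = true;
        keyword.erase(keyword.size() - 1);
    }
    if (keyword.empty()) {
        return false;
    }
    if (bLeading && bTrailing) {
        return input == keyword;  // _KEYWORD_: must stand alone
    }
    // search on word boundaries, looking at every occurrence
    const std::string padded = " " + input + " ";
    const std::string target = " " + keyword + " ";
    for (std::string::size_type pos = padded.find(target);
         pos != std::string::npos; pos = padded.find(target, pos + 1)) {
        const bool bAtStart = (pos == 0);
        const bool bAtEnd = (pos + keyword.size() == input.size());
        if (bLeading && bAtEnd) {
            continue;  // _KEYWORD: nothing follows the keyword
        }
        if (bTrailing && bAtStart) {
            continue;  // KEYWORD_: nothing precedes the keyword
        }
        return true;
    }
    return false;
}
```

With this sketch, "_WHO IS" matches "WHO IS YOUR FAVORITE SINGER" but not "TELL ME DO YOU KNOW WHO IS".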

Handling Context

Context is a way for the chatterbot to keep track of what it has said previously and to take this into account when selecting its next response. So far, every response selected by the chatbot during a conversation has been chosen based only on the current user's input. But sometimes we need more data in order to respond properly to a given input; that's when we need to use context.

To illustrate this concept, we are going to look at the following conversation log:

USER: What is your favorite movie?
CHATTERBOT: IT IS TERMINATOR II.

USER: Why do you like this movie? (Now how are we supposed to answer that question if we knew nothing about the previous response of the Chatbot?)

So clearly, some inputs require the use of "context" in order to formulate a correct answer. In the previous example, the context would simply be: IT IS TERMINATOR II. Now that the bot knows what it was talking about previously, it can more easily formulate a good answer to the user's input.

We can now continue the previous conversation log:

(Context: IT IS TERMINATOR II)
CHATTERBOT: BECAUSE IT IS A SCIENCE-FICTION MOVIE AND I LOVE SCIENCE-FICTION.

Context also allows us to control improper reactions from the chatbot. For example, if the user enters the sentence "Why do you like this movie?" during a conversation in which the chatterbot has not even talked about this subject, it could simply respond by saying: WHAT ARE YOU TALKING ABOUT?
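The behaviour described in this section could be sketched like this. The ContextRecord structure and the function name are assumptions; here the context of a record is simply the previous bot response that must have been given for the record to apply.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a keyword, the context it requires (empty when
// the keyword can be answered at any time) and the response.
struct ContextRecord {
    std::string keyword;
    std::string context;
    std::string response;
};

// Answer the input, taking the previous bot response into account.
// Returns an empty string when no keyword matches at all.
std::string respond_with_context(const std::vector<ContextRecord> &database,
                                 const std::string &input,
                                 const std::string &sPrevResponse) {
    bool bKeywordSeen = false;
    for (std::size_t i = 0; i < database.size(); ++i) {
        if (input.find(database[i].keyword) == std::string::npos) {
            continue;
        }
        if (database[i].context.empty() ||
            database[i].context == sPrevResponse) {
            return database[i].response;
        }
        bKeywordSeen = true;  // keyword known, but the context is wrong
    }
    return bKeywordSeen ? "WHAT ARE YOU TALKING ABOUT?" : "";
}
```

With a record for "WHY DO YOU LIKE THIS MOVIE" whose context is "IT IS TERMINATOR II.", the bot only gives the science-fiction answer right after it has named the movie.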

The context feature has been implemented in Chatterbot11.

Another great feature that would be very interesting to implement in a chatterbot is the capacity to anticipate the user's next response; this would make the chatbot look even smarter during a conversation.

Using Text To Speech

Wouldn't it be great if your computer could speak back to you whenever you order it to do something? We've accomplished just that in "Chatterbot12", the latest version of the program. Now the program can speak out every answer that it has selected after examining the user's input. The SAPI library from Microsoft was used to add the "Text To Speech" feature to the program. For the implementation, three new functions were added to the program: Initialize_TTS_Engine(), speak(const std::string text) and Release_TTS_Engine().

  • Initialize_TTS_Engine(): This function, as the name suggests, initializes the "Text To Speech Engine". We first start by initializing COM, since SAPI is a COM-based API. If the initialization is successful, we then use the CoCreateInstance function to create an instance of the ISpVoice object, which controls the "Text To Speech" mechanism within the SAPI library. If that is also successful, it means that our "Text To Speech Engine" was initialized properly and we are ready for the next stage: speaking out the "response string".
  • speak(const std::string text): This is the main function used for implementing "Text To Speech" within the program. It takes the "response string", converts it to wide characters (WCHAR) and then passes it to the "Speak method" of the "ISpVoice" object, which then speaks out the "bot's response".
  • Release_TTS_Engine(): Once we are done using the "SAPI Text To Speech Engine", we just release all the resources that have been allocated during the procedure.

Using a Flat File to Store the Database

So far, the database has always been built into the program, which means that whenever you modify the database, you also have to recompile the program. This is not really convenient, because sometimes we only want to edit the database and keep the rest of the program as it is. For this reason and many others, it is a good idea to store the database in a separate file, which lets us edit the database without having to recompile all the files of the program. To store the database, we can use a simple text file with some specific notations to distinguish the different elements of the database (keywords, responses, transpositions, context, ...). In the current program, we will use the following notations, which have been used before in some implementations of the Eliza chatbot in Pascal.

  1. Lines that start with "K" represent keywords.
  2. Lines that start with "R" represent responses.
  3. Lines that start with "S" represent sign on messages.
  4. Lines that start with "T" represent transpositions.
  5. Lines that start with "E" represent possible corrections that can be made after transposing the user's input.
  6. Lines that start with "N" represent responses for empty input from the user.
  7. Lines that start with "X" represent responses for when the chatbot did not find any keyword matching the current user input.
  8. Lines that start with "W" represent responses for when the user repeats himself or herself.
  9. Lines that start with "C" represent the context of the chatbot's current response.
  10. Lines that start with "#" represent comments.

We now have a complete architecture for the database, we just need to implement these features into the next version of the chatbot (Chatterbot13).

A Better Repetition Handling Algorithm

In an effort to prevent the chatbot from repeating itself too much, we have previously used a very basic algorithm that consists of comparing the chatbot's current response to the previous one. If the currently selected response is equal to the previous one, we simply discard it and move on to the next candidate in the list of available responses. This algorithm is very efficient when it comes to controlling immediate repetitions by the chatbot. However, it is not that good at avoiding longer-term repetition: during a chatting session, the same response can still occur many times. With the new algorithm, we control how long it takes for the chatbot to reselect the same response. In fact, we make sure that it has used all the available responses for the corresponding keyword before it can repeat a response. This, in turn, can improve the quality of the conversation exchanges.

Here is a description of how the algorithm works. During the conversation between the chatbot and the user, we keep a list of all the responses previously selected by the chat robot. When selecting a new response, we search for the currently selected response in that list, starting from the end. If the current response candidate is found during that search, we compare its position (counted from the end of the list) with the total number of available responses. If the position plus one is less than the total number of available responses, we consider it a repetition, so we discard the current response and select another one.
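One way to read the description above is sketched below. The position is counted backwards from the most recent response (position 0), and the function names are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// A candidate is a repetition when it was used too recently: its
// position in the history, counted from the end, plus one is less
// than the number of available responses.
bool is_repetition(const std::vector<std::string> &history,
                   const std::string &sCandidate,
                   std::size_t nAvailable) {
    std::size_t nPos = 0;  // 0 = most recent response
    for (std::vector<std::string>::const_reverse_iterator it = history.rbegin();
         it != history.rend(); ++it, ++nPos) {
        if (*it == sCandidate) {
            return nPos + 1 < nAvailable;
        }
    }
    return false;  // never used before
}

// Select a response that is not a repetition and record it in the history.
std::string select_next_response(const std::vector<std::string> &responses,
                                 std::vector<std::string> &history) {
    if (responses.empty()) {
        return "";
    }
    const std::size_t nStart = std::rand() % responses.size();
    for (std::size_t i = 0; i < responses.size(); ++i) {
        const std::string &sCandidate = responses[(nStart + i) % responses.size()];
        if (!is_repetition(history, sCandidate, responses.size())) {
            history.push_back(sCandidate);
            return sCandidate;
        }
    }
    // every candidate was used recently (e.g. duplicated responses)
    history.push_back(responses[nStart]);
    return responses[nStart];
}
```

With three available responses, a response can only come back after the two others have been used, which is exactly the "use all responses before repeating" behaviour described above.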

Updating the Database With New Keywords

Sometimes, when it comes to adding new keywords to the database, it can be difficult to choose those that are really relevant. However, there is a very simple solution to that problem. When chatting with the chat robot, we make sure to store the user's input in a file (for example, unknown.txt) each time the chatbot is not able to find any matching keyword for the current input. Later on, when we need to update the keywords in the database, we just have to take a look at the file in which we saved the unknown sentences found during the previous conversations. By continuously adding new keywords using this procedure, we can create a very good database.

Download Chatterbot15

Saving the Conversation Logs

Why save the conversations between the users and the chatbot? Because it can help us find the weaknesses of the chatbot during a given conversation. We can then decide which modifications to make to the database in order to make future conversation exchanges more natural. We could also save the time and date of each conversation to help us track the progress of the chatbot after new updates have been applied to it. Saving the logs helps us determine how human-like the conversation skills of the chatbot are.

Learning Capability

So far, the chatbot has not been able to learn new data from the users while chatting; it would be very useful to have this feature. It basically means that whenever the chatbot encounters an input that has no corresponding keyword, it prompts the user about it. In return, the user is able to add a new keyword and the corresponding response to the database of the chat robot. Doing so can improve the database of the chatbot very significantly. Here is how the algorithm should go:

  1. NO KEYWORD WAS FOUND FOR THIS INPUT, PLEASE ENTER A KEYWORD
  2. SO THE KEYWORD IS: (key)
  3. (if response is no) PLEASE REENTER THE KEYWORD (go back to step #2)
  4. NO RESPONSE WAS FOUND FOR THIS KEYWORD: (key) , PLEASE ENTER A RESPONSE
  5. SO, THE RESPONSE IS: (resp)
  6. (if response is no) PLEASE REENTER THE RESPONSE (go back to step #4)
  7. KEYWORD AND RESPONSE LEARNED SUCCESSFULLY
  8. IS THERE ANY OTHER KEYWORD THAT I SHOULD LEARN
  9. (if response is yes, otherwise continue chatting): PLEASE ENTER THE KEYWORD (go back to step #2)


License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)