(1) By function, chatbots can be divided into question-answering chatbots, task-oriented chatbots, and chit-chat (small talk) chatbots.
Chatbots with different functions rely on different techniques. For example, a question-answering chatbot needs to extract the focus words from the question and use them to search a set of triples or a knowledge graph; to improve retrieval accuracy, it is often also necessary to classify the question and the relation it asks about. A chit-chat chatbot, by contrast, can be treated directly as a sequence-to-sequence problem: feed high-quality dialogue data into a deep learning model, train it, and obtain the target model.
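The question-answering flow described above (extract the focus word, classify the relation, look the pair up in the triples) can be sketched roughly as follows. This is only a toy illustration: the triples, the cue words in `RELATION_CUES`, and the function names are all assumptions made for the example, and a real system would use a trained classifier and a proper knowledge graph instead of keyword matching.

```python
import re

# Toy knowledge base of (subject, relation, object) triples; entries are illustrative.
TRIPLES = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("France", "currency", "euro"),
]

# Hypothetical cue words standing in for a trained relation classifier.
RELATION_CUES = {
    "capital_of": {"capital"},
    "currency": {"currency", "money"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def classify_relation(question):
    """Return the first relation whose cue word appears in the question, else None."""
    tokens = set(tokenize(question))
    for relation, cues in RELATION_CUES.items():
        if tokens & cues:
            return relation
    return None


def extract_focus_entity(question):
    """Return the first known entity mentioned in the question, else None."""
    tokens = set(tokenize(question))
    entities = {s for s, _, _ in TRIPLES} | {o for _, _, o in TRIPLES}
    for entity in sorted(entities):
        if entity.lower() in tokens:
            return entity
    return None


def answer(question):
    """Look up the triple matching the focus entity and relation; return its other end."""
    relation = classify_relation(question)
    entity = extract_focus_entity(question)
    if relation is None or entity is None:
        return None
    for subj, rel, obj in TRIPLES:
        if rel != relation:
            continue
        if subj == entity:
            return obj
        if obj == entity:
            return subj
    return None


print(answer("What is the capital of France?"))
print(answer("Which currency does France use?"))
```

Classifying the relation first narrows the search to triples of one type, which is why the text notes that it helps retrieval accuracy.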
(2) By mode, chatbots can be divided into retrieval-based models and generative models.
a. Retrieval-based model. This approach uses a repository of predefined responses and some heuristic to select an appropriate response based on the input and context. In other words, an FAQ of question-answer pairs is built and stored, and the answer to an incoming sentence is retrieved from it. These systems do not generate any new text; they only select a response from a fixed set. The approach has clear advantages and disadvantages. Because the repository is built by hand, retrieval-based methods do not produce grammatical errors. However, they may be unable to handle inputs for which no predefined response exists. For the same reason, these models cannot refer back to contextual entity information, such as a name mentioned earlier in the conversation.
b. Generative model. This approach is harder. It does not rely on predefined responses and instead generates new responses from scratch. Generative models are usually based on machine translation techniques, but instead of translating from one language to another, they "translate" from an input to an output (the response).
The seq2seq model is a typical example. This approach also has clear advantages and disadvantages. It can refer to entities in the input, giving the impression that you are talking to a person. However, these models are hard to train, are prone to grammatical errors (especially on longer sentences), and usually require a large amount of training data.
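The retrieval-based approach in (a) can be sketched as follows: store question-answer pairs, score the user's input against each stored question, and return the answer of the best match. This is a minimal sketch under stated assumptions: the FAQ entries, the bag-of-words cosine similarity, and the `threshold` value are all chosen for illustration, and a production system would more likely use TF-IDF or sentence embeddings.

```python
import math
import re
from collections import Counter

# Hypothetical FAQ of (question, answer) pairs; contents are illustrative only.
FAQ = [
    ("How do I reset my password?",
     "Open Settings, choose Account, then select Reset Password."),
    ("What are your opening hours?",
     "We are open from 9 am to 6 pm, Monday to Friday."),
    ("How can I track my order?",
     "Use the tracking link in your confirmation email."),
]


def bag_of_words(text):
    """Count lowercase word tokens in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, threshold=0.3):
    """Return the stored answer whose question best matches the query.

    Returns None when no stored question scores at least `threshold`,
    which is the "no predefined response" case described in the text.
    """
    query_vec = bag_of_words(query)
    best_score, best_answer = 0.0, None
    for question, answer in FAQ:
        score = cosine(query_vec, bag_of_words(question))
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score >= threshold else None


print(retrieve("how to reset password"))
print(retrieve("tell me a joke"))
```

The second call returns `None` because nothing in the FAQ overlaps with the query, which illustrates the main weakness of a fixed response set.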
(3) By domain, chatbots can be divided into open-domain and closed-domain chatbots.
a. Open-domain chatbots are harder to build, because users do not necessarily have a clear goal or intention. Conversations on social media sites such as Twitter and Reddit are usually open domain: they can go in any direction on any topic. The countless topics, and the amount of knowledge needed to generate reasonable responses, make open-domain chatbots quite difficult to implement. An open-domain chatbot also needs an open-domain knowledge base as its store of knowledge, which makes information retrieval harder.
b. Closed-domain chatbots are easier to build. The space of possible inputs and outputs is limited because the system is trying to achieve a very specific goal. Technical support bots and shopping assistants are examples of closed-domain problems. These systems do not need to talk about politics; they only need to complete specific tasks as efficiently as possible. Of course, users can still take the conversation wherever they want, but the system does not need to handle all of these cases, and users do not expect it to.
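The limited input and output space of a closed-domain chatbot can be illustrated with a small sketch: a fixed set of intents, each with its own reply, plus a fallback for everything else. The intents, keywords, and replies below are hypothetical and chosen only for the example; keyword matching stands in for a real intent classifier.

```python
import re

# Hypothetical intents for a shopping assistant: (keywords, reply) per intent.
INTENTS = {
    "order_status": ({"order", "delivery", "shipping"},
                     "Please give me your order number."),
    "refund": ({"refund", "return"},
               "I can start a return. Which item is it?"),
}

# Reply used for anything outside the supported domain.
FALLBACK = "Sorry, I can only help with orders and refunds."


def detect_intent(utterance):
    """Return the first intent whose keywords appear in the utterance, else None."""
    tokens = set(re.findall(r"[a-z]+", utterance.lower()))
    for name, (keywords, _) in INTENTS.items():
        if tokens & keywords:
            return name
    return None


def respond(utterance):
    """Answer in-domain requests; decline everything else with the fallback."""
    intent = detect_intent(utterance)
    if intent is None:
        return FALLBACK
    return INTENTS[intent][1]


print(respond("Where is my order?"))
print(respond("What do you think about politics?"))
```

Because the bot only has to cover the listed intents, an out-of-domain question simply gets the fallback reply rather than an attempted answer.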