Note: After this article was published, OpenAI released the ChatGPT API. The ChatGPT API maintains context across a conversation. However, the model behind it (gpt-3.5-turbo) does not support fine-tuning. Therefore, for fine-tuned models and advanced use cases, the instructions in this tutorial are still valid and helpful.
The Problem
GPT is a generative text model, which means that it produces new text by predicting what comes next based on the input it receives from the user. The model was trained on a large corpus of text (books, articles, and websites), and from this data it learned patterns and relationships between words and phrases.
By default, the model has no memory when you initiate a discussion with it: each input is treated independently, with no context or information carried over from previous user prompts. This is certainly not ideal for human-friendly interactions. While it seems like a limitation, it actually allows the model to generate more diverse and less repetitive text.
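To make this statelessness concrete, here is a toy illustration in plain Python (no API calls, no real model): it contrasts what the model actually receives each turn when no history is attached versus when earlier turns are prepended to the prompt.

```python
# Two turns of a conversation. With no memory, the model receives each
# prompt in isolation, so the second one is ambiguous on its own.
turn_1 = "What is the capital of France?"
turn_2 = "And what is its population?"

def stateless_request(prompt: str) -> str:
    # No history is attached: this string is all the model sees.
    return prompt

def contextual_request(history: list, prompt: str) -> str:
    # Carrying context over: earlier turns are prepended to the new prompt.
    return "\n".join(history + [prompt])

# "its" is unresolvable here -- the model has no idea what it refers to.
print(stateless_request(turn_2))

# With the first turn included, "its" clearly refers to France.
print(contextual_request([turn_1], turn_2))
```

This is the whole trick behind the approach we will use later: the model itself never remembers anything, so the application has to resend the relevant history with every request.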
In some cases, however, carrying over context is useful and even necessary. Techniques like fine-tuning on a specific topic can improve the quality of outputs, but the technique we are going to implement next is much easier to put in place.
No Context = Chaos of Randomness
Let’s start by building a simple chatbot. Initially, we will initiate a discussion, since our goal is to compare the model’s outputs now and later, once we add more context to the conversation.
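As a starting point, a minimal stateless version might look like the sketch below. It calls the OpenAI completions endpoint directly with the standard library (you could equally use the `openai` package); the model name and the `max_tokens`/`temperature` values are assumptions you can adjust, and an `OPENAI_API_KEY` environment variable is assumed to be set.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/completions"

def build_request(prompt: str, model: str = "text-davinci-003") -> dict:
    # Note: the payload carries only the current prompt -- no history,
    # so the model treats every call as a brand-new conversation.
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": 150,
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """Send a single, context-free prompt and return the completion text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["text"].strip()

if __name__ == "__main__":
    while True:
        user_input = input("You: ")
        # Each call sends only the latest prompt, so follow-up
        # questions lose whatever was said earlier.
        print("Bot:", ask(user_input))
```

Run it and ask a follow-up question ("And its population?") after a first one ("What is the capital of France?"): the model has no way to know what "its" refers to.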