Most people assume you can train or customise ChatGPT with your own data. The end result can look that way, but under the hood things are fundamentally different: technically, no one can train ChatGPT itself on your data. OpenAI doesn't offer that option.

At the root of this issue is that each ChatGPT thread or API request starts a fresh conversation, which means the model has no built-in memory of past interactions.
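Because the model is stateless, every request has to carry the conversation history itself. A minimal sketch of that, using the common chat-style role/content message format (the actual send step is stubbed out here):

```python
# Each request must resend all prior turns; the model remembers nothing
# between calls. The message dicts follow the usual chat-completion
# convention ({"role": ..., "content": ...}).

def build_request(history, user_message):
    """Return the full message list to send: every prior turn plus the new one."""
    return history + [{"role": "user", "content": user_message}]

history = [
    {"role": "user", "content": "What are your opening hours?"},
    {"role": "assistant", "content": "We're open 9am-5pm, Monday to Friday."},
]
messages = build_request(history, "Are you open on weekends?")
print(len(messages))  # all three turns travel with this single request
```

Note that `history` grows with every exchange, which is exactly what runs into the token limit described next.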

We can include previous interactions in our prompt to give context, but as the conversation gets longer there is a token limit on how many of them we can include. Tokens also cost money, so this becomes expensive. The result is that once we run out of capacity, GPT starts to lose its memory of older parts of the conversation.
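One common workaround is to trim the oldest turns once the history exceeds a token budget. A rough sketch, with the caveat that the token estimate here is a crude words-based approximation (a real implementation would use a proper tokenizer such as tiktoken):

```python
def trim_history(history, max_tokens=3000):
    """Drop the oldest turns until the (roughly estimated) token count fits.

    Assumes ~1.3 tokens per word plus a small per-message overhead; this is
    an approximation, not the model's real tokenizer.
    """
    def est_tokens(msgs):
        return sum(int(len(m["content"].split()) * 1.3) + 4 for m in msgs)

    trimmed = list(history)
    while trimmed and est_tokens(trimmed) > max_tokens:
        trimmed.pop(0)  # the oldest turn is forgotten first
    return trimmed
```

This is exactly the "memory loss" users observe: the oldest context silently drops off the front.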

All that those "chat with your PDFs" products do is embed your content and prompt ChatGPT with some sort of metaprompt that supplies the relevant context for your query, using a technique called embeddings.
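The flow can be sketched end to end in a few lines. Real products use model-generated embeddings (e.g. an embeddings API); here a toy bag-of-words vector stands in so the example runs self-contained, and the metaprompt wording is purely illustrative:

```python
import math

# Toy semantic search: embed chunks and query, rank by cosine similarity,
# then stuff the best-matching chunk into the prompt as context.

def embed(text, vocab):
    """Stand-in embedding: word counts over a fixed vocabulary."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, vocab):
    """Return the chunk most similar to the query."""
    qv = embed(query, vocab)
    return max(chunks, key=lambda c: cosine(qv, embed(c, vocab)))

chunks = [
    "Our refund policy allows returns within 30 days.",
    "The office is located in downtown Lisbon.",
]
vocab = sorted({w for c in chunks for w in c.lower().split()})
context = retrieve("refund policy", chunks, vocab)

# The "metaprompt": pin the model to the retrieved context.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: refund policy"
```

Swap `embed` for real embedding vectors and this is, in essence, what those products do.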

You can also do fine-tuning, but it's not a replacement for the combination of semantic search and prompt engineering with ChatGPT, which works surprisingly well. Fine-tuning isn't a hack, but it isn't "training ChatGPT on your data" either.

Optimal process:

1) Create a dataset of the questions and answers your customers are likely to ask about your business.
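A common way to package such a dataset is JSONL, one example per line. The chat-style `{"messages": [...]}` layout below is one widely used convention; check your provider's documentation for the exact schema it expects:

```python
import json

# Sketch: write a Q&A dataset as JSONL, one training example per line.
faq = [
    ("What are your opening hours?", "We're open 9am-5pm, Monday to Friday."),
    ("Do you ship internationally?", "Yes, we ship to most countries worldwide."),
]

with open("train.jsonl", "w") as f:
    for question, answer in faq:
        example = {"messages": [
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]}
        f.write(json.dumps(example) + "\n")
```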

2) Now you can fine-tune a GPT model on this data using a process called transfer learning: the model is trained on your dataset in addition to the large corpus it was already trained on. The result is a model that can generate responses specific to your business.

There are several open-source libraries you can use for fine-tuning, such as Hugging Face's Transformers library, as well as cloud-based services such as Google Cloud AI Platform or Amazon SageMaker for training and deploying the model.

3) Once you have a fine-tuned model, you can integrate it into your chatbot application. This can be done using APIs provided by the platform you are using to host your chatbot, or by integrating the model directly into your application code.
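For the integration step, it helps to hide the model behind a single callable so your chatbot code doesn't care whether it's a hosted API or a locally deployed model. A minimal sketch with a stub model (the `fake_model` function is a placeholder you'd replace with your real API or inference call):

```python
# The chatbot only depends on a `generate(messages) -> str` callable,
# so swapping the stub for a real fine-tuned model is a one-line change.

def make_chatbot(generate):
    history = []
    def chat(user_message):
        history.append({"role": "user", "content": user_message})
        reply = generate(history)
        history.append({"role": "assistant", "content": reply})
        return reply
    return chat

# Stub standing in for the fine-tuned model's API; hypothetical behaviour.
def fake_model(messages):
    return f"(model saw {len(messages)} message(s))"

bot = make_chatbot(fake_model)
print(bot("Hello"))   # (model saw 1 message(s))
print(bot("Again"))   # (model saw 3 message(s))
```

The same wrapper works whether `generate` calls a platform API or runs the model in-process.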