ChatGPT implementation series — 1

Dohyeong Kim
Feb 2, 2023


Image from a post by mobilespoon

The whole world is talking about the exceptional performance of ChatGPT, so analysing its dataset, model, and training methods is an interesting task. You can find the project that OpenAI used as a major reference for creating ChatGPT on GitHub: the code for the 'Learning to summarize from human feedback' paper.

However, the code is split across many subfiles and was written for a distributed training environment using MPI, which makes it impossible to run on a normal desktop like mine.

Therefore, I removed all the subfiles and the MPI dependency, and modified the code so that you can conveniently try sampling and reward evaluation of summaries, each in a single Jupyter Notebook.

Simplifying the code in this way should also be useful if you want to train the model from scratch, following the procedure of the corresponding paper.

You only need to install the torch and blobfile packages to run the code. It will help you grasp the essential parts of the 'Learning to summarize from human feedback' code and paper.
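
Since everything runs inside a Jupyter Notebook, both dependencies can be installed directly from a cell:

```python
# torch and blobfile are the only extra dependencies for the simplified Notebook.
!pip install torch blobfile
```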

Please note that it takes about 5 minutes and several gigabytes of storage to download the pre-trained model.
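
The original repository fetches its checkpoints through blobfile, so the download step looks roughly like the sketch below. The remote path here is a placeholder, not the actual checkpoint URL; the real one comes from the Notebook.

```python
import blobfile as bf

# Placeholder remote path -- substitute the actual blob URL from the Notebook.
REMOTE_CKPT = "https://openaipublic.blob.core.windows.net/summarize-from-feedback/<checkpoint>"
LOCAL_CKPT = "checkpoints/model.pt"

bf.makedirs("checkpoints")
# The checkpoint is several gigabytes, so this copy takes a few minutes.
if not bf.exists(LOCAL_CKPT):
    bf.copy(REMOTE_CKPT, LOCAL_CKPT)
```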

Summarize a Reddit post with the pre-trained model
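
The Notebook does this with the simplified summarize-from-feedback model itself. As an illustrative stand-in, the sampling step looks roughly like the sketch below, which uses Hugging Face's off-the-shelf gpt2 rather than the paper's fine-tuned checkpoint; the shared idea is appending "TL;DR:" to the post and sampling the continuation.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# gpt2 is a stand-in here, NOT the paper's fine-tuned summarization checkpoint.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

post = "SUBREDDIT: r/relationships\nPOST: My roommate keeps eating my food ..."
prompt = post + "\nTL;DR:"  # the model is prompted to continue after TL;DR:

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=48,                     # summaries are short
        do_sample=True,                        # temperature sampling
        temperature=0.7,
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, i.e. the summary.
summary = tokenizer.decode(output[0, inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(summary)
```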

Calculate the reward of a summary with the pre-trained reward model
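
Conceptually, the reward model is a transformer with a scalar head that scores a (post, summary) pair. Here is a minimal sketch of that idea; the head below is randomly initialized and only illustrates the shape of the computation, whereas the Notebook loads trained weights.

```python
import torch
import torch.nn as nn
from transformers import GPT2Model, GPT2Tokenizer

class RewardModel(nn.Module):
    """Transformer backbone plus a scalar head read off the last real token."""
    def __init__(self):
        super().__init__()
        self.backbone = GPT2Model.from_pretrained("gpt2")
        self.reward_head = nn.Linear(self.backbone.config.n_embd, 1)

    def forward(self, input_ids, attention_mask):
        hidden = self.backbone(input_ids, attention_mask=attention_mask).last_hidden_state
        last_idx = attention_mask.sum(dim=1) - 1             # position of the final real token
        pooled = hidden[torch.arange(hidden.size(0)), last_idx]
        return self.reward_head(pooled).squeeze(-1)          # one scalar reward per sequence

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
reward_model = RewardModel().eval()  # untrained head: scores are illustrative only

pair = ("POST: My roommate keeps eating my food ...\n"
        "TL;DR: Roommate keeps eating my food; how do I bring it up?")
enc = tokenizer(pair, return_tensors="pt")
with torch.no_grad():
    print(reward_model(enc["input_ids"], enc["attention_mask"]).item())
```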

Conclusion and Future Work

As you can see from the Notebook results, the summaries produced by the provided pre-trained model are quite accurate and almost human-like.

The reward model's score is also quite high, and it seems to evaluate the summarization results properly.

Now we have confirmed that everything works well. In the next post, we will learn how to train the model using Supervised Learning.


Written by Dohyeong Kim

I am a Deep Learning researcher. Currently, I am trying to build AI agents for various games such as MOBA, RTS, and soccer.
