I’ve had some questions recently about how ChatGPT actually works, so in today’s lesson I thought I’d teach everyone what ChatGPT is doing behind the scenes.
So let’s jump in. Here is an example of some text that we might feed ChatGPT: ‘The weather is…’. What it’s doing is predicting the next word, or token, that’s most likely to follow, based on all of the words it has seen so far. It then adds that word to the text and repeats the process, one word at a time.
But there isn’t just one possible next word. It could be any of several candidates, like ‘nice’, ‘hot’, ‘cold’, ‘windy’, ‘sunny’, ‘raining’, or ‘snowy’. From the patterns in its training data, the model has learned how likely each of these is to follow the text so far.
So the prediction is really a likelihood for each candidate word. For example, ‘nice’ might come up 60% of the time, ‘sunny’ 20% of the time, and so on. The temperature parameter determines how much randomness is introduced when picking among those candidates. A higher temperature value allows for more random choices.
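To make that concrete, here’s a minimal Python sketch of picking the next word as a weighted random draw. Only the 60% for ‘nice’ and 20% for ‘sunny’ come from the example above; the other percentages and the names `next_token_probs` and `sample_next_token` are made up for illustration, and this is not how ChatGPT is actually implemented.

```python
import random

# Hypothetical probabilities for the word after "The weather is".
# Only 'nice' (60%) and 'sunny' (20%) come from the lesson; the rest are invented.
next_token_probs = {
    "nice": 0.60,
    "sunny": 0.20,
    "hot": 0.08,
    "cold": 0.06,
    "windy": 0.03,
    "raining": 0.02,
    "snowy": 0.01,
}

def sample_next_token(probs, rng):
    """Pick one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Draw many times to see that the picks roughly follow the percentages.
rng = random.Random(0)  # fixed seed so the run is repeatable
counts = {token: 0 for token in next_token_probs}
for _ in range(10_000):
    counts[sample_next_token(next_token_probs, rng)] += 1

print(counts)
```

If you run it, ‘nice’ should come up most often, but the less likely words still get picked now and then.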
The training data for ChatGPT is obtained from various sources, such as internet documents and conversational data. The model is a large neural network, built on an architecture called the transformer, which is trained to predict the next most likely word or token and then fine-tuned using human feedback.
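As a rough illustration of what training to predict the next token means, here’s a small Python sketch of one common way to score a prediction: take the probability the model gave to the token that actually came next, and use its negative log as the loss (this is usually called cross-entropy). The function name `next_token_loss` and the numbers are assumptions for the example; the real training setup is far larger and more involved.

```python
import math

def next_token_loss(predicted_probs, actual_next_token):
    """Cross-entropy for one prediction: -log(probability given to the right token)."""
    return -math.log(predicted_probs[actual_next_token])

# Hypothetical model output for the text "The weather is".
predicted = {"nice": 0.60, "sunny": 0.20, "cold": 0.20}

# The loss is low when the model gave the real next token a high probability...
print(next_token_loss(predicted, "nice"))
# ...and higher when the real next token was one it considered less likely.
print(next_token_loss(predicted, "cold"))
```

Training nudges the model, over a huge number of examples, so that this loss gets smaller.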
When using ChatGPT, it’s important to understand that it operates on tokens rather than just words. A token can be a whole word or a part of a word, like ‘ing’ or ‘ed’. So what the model is really doing is looking at the tokens it has seen so far and predicting the next most likely token.
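Here’s a toy Python sketch of splitting text into tokens, just to show the idea of word pieces. The tiny vocabulary and the longest-match rule are assumptions for this example; ChatGPT’s real tokenizer learns its vocabulary from data using a byte-pair-encoding scheme and has many thousands of entries.

```python
# Hypothetical toy vocabulary, invented for this example.
VOCAB = {"rain", "snow", "wind", "ing", "ed", "y", "the", "weather", "is", " "}

def tokenize(text, vocab=VOCAB):
    """Split text into tokens by always taking the longest matching piece."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest piece first, shrinking until something in the vocab matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing matched: emit the single character as its own token.
            tokens.append(text[i])
            i += 1
    return tokens

print(tokenize("raining"))  # ['rain', 'ing']
print(tokenize("snowed"))   # ['snow', 'ed']
```

Notice that ‘raining’ becomes two tokens, so the model would predict ‘rain’ first and ‘ing’ second.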
The temperature parameter plays a crucial role in the prediction process. A temperature of 0 always selects the most likely token, while higher values give less likely tokens a better chance of being picked, which makes the output more varied.
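Here’s a minimal Python sketch of one standard way temperature is applied: divide the model’s raw scores (often called logits) by the temperature before turning them into probabilities with a softmax. The scores and the function name `apply_temperature` are made up for illustration, and treating a temperature of exactly 0 as always taking the top score is a convention assumed here.

```python
import math

def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature == 0:
        # Assumed convention: temperature 0 means always pick the top-scoring token.
        best = max(logits, key=logits.get)
        return {token: 1.0 if token == best else 0.0 for token in logits}
    scaled = {token: score / temperature for token, score in logits.items()}
    # Subtract the largest value before exponentiating to avoid overflow.
    top = max(scaled.values())
    exps = {token: math.exp(value - top) for token, value in scaled.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}

# Hypothetical raw scores for the word after "The weather is".
logits = {"nice": 2.0, "sunny": 0.9, "cold": 0.0}

for temperature in (0, 0.5, 1.0, 2.0):
    probs = apply_temperature(logits, temperature)
    print(temperature, {token: round(p, 3) for token, p in probs.items()})
```

As the temperature goes up, the probabilities move closer together, so the less likely words get picked more often.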
In conclusion, ChatGPT is a powerful predictive model that uses what it learned from its training data to generate text one token at a time. It can produce realistic and coherent responses, making it seem like a real person is writing the text. The future of ChatGPT looks promising, with potential advancements in multimodal capabilities, such as voice and image recognition.
Overall, understanding how ChatGPT works gives us insight into the capabilities and potential of AI-powered language models.