In early 2019, a learning-based technique appeared that could perform common natural language processing tasks, for instance question answering, text completion, reading comprehension, summarization, and more. The method was developed by scientists at OpenAI, who called it GPT-2.
The goal was to perform these tasks with as little supervision as possible. This means that they unleashed the algorithm to read a large chunk of the internet, and the question is, what would the AI learn during this process?
That is a tricky question.
And to answer it, let's have a look at a paper from 2017, where an AI was given a bunch of Amazon product reviews, and the goal was to teach it to generate new ones, or continue a review when given one. Then, something unexpected happened. The finished neural network used surprisingly few neurons to continue these reviews, and upon closer inspection, the researchers noticed that it had built up a knowledge of not only language, but had also built a sentiment detector. This means that the AI recognized that in order to continue a review, it not only needs to learn English, but also needs to detect whether the review seems positive or negative. If we know from a small snippet that the review we have to complete seems positive, we have a much easier time doing it well.
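As a rough, modern-day analogue of that finding, here is a small probing sketch: we never train a sentiment detector directly, we only check whether one already lives inside a language model's hidden states. Note that this uses GPT-2 through the Hugging Face transformers library as a stand-in for the character-level model from the 2017 paper, and the toy reviews are made up for illustration.

import torch
from sklearn.linear_model import LogisticRegression
from transformers import GPT2Model, GPT2Tokenizer

# GPT-2 stands in here for the 2017 model; same idea, different network.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")
model.eval()

# Toy reviews, invented for illustration: 1 = positive, 0 = negative.
reviews = [
    ("This blender is fantastic, it works perfectly.", 1),
    ("Absolutely love it, best purchase all year.", 1),
    ("Broke after two days, a complete waste of money.", 0),
    ("Terrible quality, I want a refund.", 0),
]

def embed(text):
    # Mean-pool the final hidden layer into a single feature vector.
    with torch.no_grad():
        hidden = model(**tokenizer(text, return_tensors="pt")).last_hidden_state
    return hidden.mean(dim=1).squeeze(0).numpy()

X = [embed(text) for text, _ in reviews]
y = [label for _, label in reviews]

# The probe is deliberately tiny: if a plain linear readout separates
# positive from negative, the sentiment signal was already sitting in
# the language model's representation.
probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict([embed("I am so happy with this, highly recommended!")]))

If the tiny linear readout gets the reviews right, the sentiment information was already present in the representation, which is the spirit of what the 2017 paper found.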
And now, back to GPT-2. As it was asked to predict the next word in text of any kind, not just reviews, we asked: what would this neural network learn? Well, now we know that, of course, it learns whatever it needs to learn to perform the sentence completion properly. And to do this, it needs to learn English by itself, and that's exactly what it did! It also learned about a lot of topics to be able to discuss them well.
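To make this "predict the next word" idea concrete, here is a minimal sketch of sentence completion with the publicly released GPT-2 weights. It assumes the Hugging Face transformers library; this is not OpenAI's training code, and the prompt is our own.

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prompt = "The Amazon review said the blender was"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedily extend the prompt; every step is one "predict the next token"
# decision, which is the entire training objective of the model.
output_ids = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))

Chaining many of those next-token decisions is all it takes to complete a review, a paragraph, or a whole article.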
And then, the next version was released, by the name GPT-3. This version is more than 100 times bigger, so our first question is: how much better can an AI get if we increase the size of the neural network?
Let’s have a look together.
These are the results on a challenging reading comprehension test as a function of the number of parameters. As you see, at around 1.5 billion parameters, which is roughly equivalent to GPT-2, it learned a great deal, but its understanding is nowhere near the level of human comprehension.
However, as we grow the network, something incredible happens. Non-trivial capabilities start to appear as we approach a hundred billion parameters.
Look!
It nearly matched the level of humans. My goodness!
This was possible before, but only with neural networks that are specifically designed for a narrow task. In comparison, GPT-3 is much more general. Let’s test that generality and have a look at 5 practical applications together!
5 Practical Applications of GPT-3
One, OpenAI made this AI accessible to a lucky few people, and it turns out that, since it has read a great deal of the internet, which contains a lot of code, it can generate website layouts from a written description.
Two, it also learned how to generate properly formatted plots from a tiny prompt written in plain English.
Not just one kind - many kinds!
Perhaps to the joy of technical PhD students around the world, three, it can properly typeset mathematical equations from a plain English description as well.
Four, it understands the kind of data we have in a spreadsheet, in this case, population, and fills the missing parts correctly.
And five, it can also translate complex legal text into plain language, or the other way around: it can generate legal text from our simple descriptions.
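All five of these applications boil down to the same trick: write a prompt, and let the model continue it. Here is a rough sketch of how the legal-text example might be driven through the GPT-3 API, assuming the original (pre-1.0) openai Python client and access to the davinci engine; the prompt wording, key, and parameter choices are our own assumptions, not from the paper.

import openai

openai.api_key = "YOUR_API_KEY"  # assumed: you are one of the lucky few with access

prompt = (
    "Translate the following legal text into plain English:\n\n"
    "The party of the first part shall indemnify and hold harmless the "
    "party of the second part against all claims arising hereunder.\n\n"
    "Plain English:"
)

response = openai.Completion.create(
    engine="davinci",   # the largest GPT-3 engine available at the time
    prompt=prompt,
    max_tokens=60,
    temperature=0.3,    # keep the paraphrase conservative
    stop=["\n\n"],      # stop once the answer block ends
)
print(response["choices"][0]["text"].strip())

No gradient updates, no retraining: the whole "program" is the prompt itself.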
However, of course, this iteration of GPT also has its limitations. For instance, we haven't seen the extent to which these examples are cherry-picked, or in other words, for every good output that we marvel at, there might have been one, or a dozen, tries that did not come out well. We don't exactly know.
But the main point is that working with GPT-3 is a really peculiar process where we know that a vast body of knowledge lies within, but it only emerges if we can bring it out with properly written prompts. It almost feels like a new kind of programming that is open to everyone, even people without any programming or technical knowledge. If a computer is a bicycle for the mind, then GPT-3 is a fighter jet.
Absolutely incredible.
And to say that the paper is vast would be an understatement - we only scratched the surface of what it can do here, so make sure to have a look if you wish to know more about it.
I can only imagine what we will be able to do with GPT-4 and GPT-5 in the near future!
What a time to be alive!
Thanks!