The US AI company OpenAI – originally founded as a non-profit organisation – has released several Generative Pre-trained Transformer (GPT) systems (GPT, GPT-2, GPT-3) over the past two years. These systems have received a lot of media attention and are often described as natural language generation (NLG) systems. However, GPT systems are very different from the NLG approach Retresco uses to develop its solutions. In this interview with our machine learning expert Tobias Günther, we try to unravel the mystery of GPT-3.
Question: OpenAI has been offering GPT-3, an AI-based text generation service, as a commercial product in the cloud since June 2020. The system’s performance is said to be impressive: given a short text sample as input, GPT-3 continues it in a way that is coherent in both content and grammar. As a machine learning expert, how do you rate the system’s capabilities?
Tobias Günther: Without question, GPT-3 has capabilities that no other system has demonstrated in this form before. There are indeed many examples that are impressive and that even many experts would not have thought feasible a short time ago. I am thinking, for example, of the ability to generate meaningful answers to questions that require a certain amount of world knowledge, or to turn a few key points into fully formulated e-mail replies. It is also exciting that the system can generate website designs or guitar tablature in addition to natural language. This shows the system’s ability to generalise.
However, one should not forget that many of these examples were cherry-picked from a larger pool of generated texts or were created with human assistance: the system generates the text, but a human chooses the best variant among several – sometimes meaningless – ones. This is not to say that such a system cannot be useful, but it remains to be seen for which applications GPT-3 can actually be used in practice. The potential, however, is definitely there.
Question: Can you briefly describe – in very simple terms – how such a system works?
Tobias Günther: GPT-3 is a neural network that has been trained as a language model. Language models are systems that are trained to continue a given input. The practical thing about language models is that nothing more than a large amount of text is needed to train them. For example, you feed the beginning of a sentence into the system; depending on whether or not it correctly predicts the next word, the parameters of the neural network are adjusted via a positive or negative signal. During training, this happens a few hundred billion times. The sketch below illustrates the principle.
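To make the next-word-prediction idea concrete, here is a minimal, illustrative sketch in PyTorch. It is emphatically not OpenAI’s code: GPT-3 is a vastly larger Transformer, whereas this toy uses a small recurrent network, a six-word vocabulary and a single training sentence – all assumptions made purely for illustration.

```python
# Toy next-word prediction: the network sees a prefix and is nudged
# towards predicting the following word. Everything here is illustrative;
# GPT-3 itself is a Transformer with 175 billion parameters.
import torch
import torch.nn as nn

vocab = ["the", "cat", "sat", "on", "mat"]
word_to_id = {w: i for i, w in enumerate(vocab)}

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)  # stand-in for a Transformer
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, token_ids):
        hidden, _ = self.rnn(self.embed(token_ids))
        return self.head(hidden)  # one logit vector per position

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# One training example: for every prefix, the target is the next word.
sentence = ["the", "cat", "sat", "on", "the", "mat"]
ids = torch.tensor([[word_to_id[w] for w in sentence]])
inputs, targets = ids[:, :-1], ids[:, 1:]  # shift by one position

for step in range(100):
    logits = model(inputs)
    # The 'positive or negative signal': the loss is small when the correct
    # next word receives high probability, and the gradient step adjusts
    # the parameters accordingly.
    loss = loss_fn(logits.reshape(-1, len(vocab)), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```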
What is new about GPT-3 is the unprecedented size of the neural network – 175 billion parameters – trained on huge amounts of text drawn from Wikipedia, books and web crawls. Even though the capabilities of a system at this scale are impressive, one should not forget that it is essentially ‘only’ statistics derived from the training data, which merely simulate an actual understanding of the facts.
Question: In what ways does Retresco’s NLG technology – and presumably that of all other NLG providers – differ from the technological approach OpenAI is taking with GPT-3?
Tobias Günther: At Retresco, we currently work with a template-based approach to NLG, which builds on text modules written by humans. Our linguistic AI components take care of problems such as the correct inflection of words, so that grammatically correct sentences emerge when the building blocks are put together; the sketch below shows the principle.
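For intuition, here is a deliberately simplified sketch of the template-based principle. It is not Retresco’s actual engine; the tiny inflection table and all function and field names are hypothetical stand-ins for the linguistic components mentioned above.

```python
# Simplified template-based NLG: human-written templates are filled with
# data, and a linguistic component chooses the correctly inflected form.
# This hand-made inflection table is a stand-in for full morphological
# resources.
INFLECTIONS = {
    ("goal", "singular"): "goal",
    ("goal", "plural"): "goals",
}

def inflect(lemma, number):
    return INFLECTIONS[(lemma, number)]

def render(template, data):
    number = "singular" if data["count"] == 1 else "plural"
    return template.format(
        team=data["team"],
        count=data["count"],
        noun=inflect(data["noun"], number),
    )

template = "{team} scored {count} {noun} in the second half."
print(render(template, {"team": "Berlin", "count": 1, "noun": "goal"}))
# -> Berlin scored 1 goal in the second half.
print(render(template, {"team": "Berlin", "count": 3, "noun": "goal"}))
# -> Berlin scored 3 goals in the second half.
```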
GPT-3, on the other hand, is an end-to-end system that generates texts completely on its own. There are no intermediate steps or components for subtasks; there is only the neural network, which produces an output for a given input, as the sketch below shows.
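By contrast, an end-to-end language model needs only a prompt. Since GPT-3 itself is reachable only through OpenAI’s cloud API, this sketch uses its openly available predecessor GPT-2 via the Hugging Face transformers library; the prompt and generation parameters are arbitrary examples.

```python
# End-to-end generation: one call, no intermediate components.
# Prompt in, continuation out.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "The advantages of automated text generation are"
result = generator(prompt, max_length=40, num_return_sequences=1)
print(result[0]["generated_text"])
```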
Question: In comparison to GPT-3, what are the advantages and disadvantages of the NLG approaches that Retresco follows? Is it at all possible to define them so simply? Or are there more advantageous areas of application for each system individually?
Tobias Günther: The great advantage of template-based text generation in a business context is its controllability. It can be guaranteed that the system does not generate abstruse or undesirable sentences, and if problems occur, there are ways to intervene. GPT-3, on the other hand, is a black box whose generation logic cannot be explained. In business use, these properties matter greatly, especially when large quantities of correct texts are to be generated and published without manual review.
I see the application areas of GPT-3 more in human-in-the-loop systems: use cases in which the generated texts are reviewed by a human, corrected where necessary and then released. A big advantage, of course, is that GPT-3 requires no initial manual work, such as writing templates. Many interesting applications that support creative tasks are also conceivable.
At this point, we do not yet know what costs will be incurred for the use of GPT-3. It is also unclear whether there will be on-premise solutions for privacy-sensitive application areas. The hardware requirements for operating a neural network of GPT-3’s size are in any case many times higher than those of a template-based approach. The one-time training of the model alone is estimated to have cost four to five million dollars.
Question: In February 2019, OpenAI decided against a full public release of its then-new GPT-2 model. The reason given? The system was too powerful and too dangerous, as it could be misused, for example, for the mass production of fake news. As an expert, how do you assess this risk?
Tobias Günther: I do see the risk that technology such as powerful language models can be misused for harmful purposes. That is why I think it is right and important to discuss such ethical aspects, including, for example, the handling of bias in the training data, which can lead to the reproduction of racist stereotypes. On the question of whether to publish the models, there are good arguments both for and against. With regard to OpenAI’s communication, however, I would have liked to see a more prudent choice of words.