GPT3

What is GPT3 #

GPT3 is a machine learning language model which has come to prominence in the last year due to its apparent ability to interpret written instructions and perform complex tasks with little or no task-specific training.

What does it do #

GPT3 is the third generation of the GPT (Generative Pre-trained Transformer) family of language models. It was presented by OpenAI in the paper “Language Models are Few-Shot Learners” (https://arxiv.org/abs/2005.14165). Like the earlier GPT models it uses the Transformer architecture, taking in a stream of text and generating a new stream of text in response. The core model is trained to continue the text it was given - like a phone’s predictive-text suggestions - but at such a scale that it effectively extracts meaning from the complex and ambiguous ways people use language. This ability can then be adapted to other tasks such as translation, answering freeform questions and document summarisation.
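
To make the “continue the text” objective concrete, the sketch below runs the same kind of next-token continuation using the small, openly released GPT2 model via the Hugging Face transformers library. This is an illustrative assumption of this note rather than anything from the GPT3 paper - GPT3 itself is only available behind an API - but the mechanism is the same, just at vastly larger scale.

```python
# Minimal sketch: next-token continuation with the small open GPT2 model.
# GPT3 applies the same "predict what comes next" mechanism at far larger scale.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The Transformer architecture takes in a stream of text and"
inputs = tokenizer(prompt, return_tensors="pt")

# Continue the prompt with up to 30 newly generated tokens.
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```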

Where to Access GPT3 #

[[OpenAI]] provides access to GPT3 through an API. The same API is also available through Microsoft Azure Cognitive Services, where stronger security and data privacy guarantees can be offered. There are several alternative models based on similar architectures, such as GPT-NeoX from EleutherAI; many of these can be accessed through the model repository managed by Hugging Face or through APIs such as GooseAI.
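
The hosted GPT3 API requires an OpenAI account and key, so as a runnable illustration the sketch below loads one of the open EleutherAI models through the Hugging Face transformers library. The specific checkpoint (gpt-neo-1.3B) is an assumption made here purely because it is small enough to run locally; the larger GPT-NeoX models are accessed in the same way.

```python
# Sketch: text generation with an open EleutherAI model from the Hugging Face hub.
# The 1.3B-parameter GPT-Neo checkpoint is used here only because it is small
# enough to download and run locally; larger models follow the same pattern.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")
result = generator(
    "GPT3 is a language model that",
    max_new_tokens=40,
    do_sample=True,
    temperature=0.8,
)
print(result[0]["generated_text"])
```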

Why is it important #

The original idea of generative pre-training was that if a machine learning model could be trained to reproduce realistic text from a very large quantity of training data, that model could then be adapted to more useful tasks without much further training. The second version, GPT2, scaled the model up and found that it could already perform well on several language processing tasks without any additional training. GPT3 scaled the model further, in terms of the volume of training data, the length of text used as input, and the model depth. The increased depth increases the complexity of the operations the model can perform, and the authors found that it performed well at many tasks and could learn some tasks by example, or from a description of the task, rather than by retraining the model itself.
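
The “learning by example” behaviour is driven entirely by the prompt: a handful of worked input/output pairs are written out as plain text and the model simply continues the pattern. The sketch below only builds such a few-shot prompt as a string - the translation examples follow the style used in the paper, but the exact wording is an assumption - and the result would be sent to the model through any of the APIs mentioned above.

```python
# Build a few-shot prompt: the task is "taught" purely through worked examples
# in the input text, with no retraining of the model itself.
examples = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
    ("peppermint", "menthe poivrée"),
]

prompt = "Translate English to French:\n"
for english, french in examples:
    prompt += f"{english} => {french}\n"
prompt += "plush giraffe =>"  # the model is expected to continue with the French translation

print(prompt)
```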

There have been many examples released showing how well the model can perform; one of the more impressive demos shows it generating simple JSX or React applications from descriptions of the desired behaviour. A descendant of the model fine-tuned on code, OpenAI Codex, now powers GitHub Copilot, a service which automatically completes code as you type.
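
Code completion works the same way: the “description of the task” is just a comment or function signature, and the model is asked to continue the file. The snippet below is an illustrative assumption of what such a prompt looks like, not the actual interface Copilot uses - in practice the editor plugin sends the surrounding code to the model and inserts the suggested continuation.

```python
# Sketch of a Copilot-style prompt: a comment plus a function signature is all
# the "specification" the model receives; it is expected to continue the file
# with a plausible implementation.
prompt = '''# Return the n-th Fibonacci number.
def fibonacci(n):
'''

# In a real editor integration this string would be sent to the model and the
# generated continuation inserted at the cursor.
print(prompt)
```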

Limitations #

A significant limitation is the lack of transparency into the model’s internals. Many of its abilities appear to have been discovered by accident, and its results remain unpredictable.

Risks and Safety #

Starting with GPT2, OpenAI stopped releasing its trained models alongside the papers describing them. The primary stated motivations were the risk of harmful use of the technology and a desire to set a standard for researchers in general to take greater responsibility for the consequences of releasing similar models.

The papers identify a number of risks in how the model could be used. These include making existing harmful activities such as spam, phishing and fraud more effective, because it is difficult for people to recognize artificially generated text, and enabling new techniques for “bad actors”. Many of these risks increase the scale of existing harms rather than creating risks unique to the technology, since humans are already capable of generating realistic text.

Significant issues have been identified around how the model can reflect bias from the training data - for example assuming gender based on occupation or reinforcing racial stereotypes.