
Computerphile explains the fascinating AI storyteller, GPT-2

by Mark Frauenfelder

GPT-2 is a language model trained on 40GB of text scraped from web pages that were linked from Reddit posts with a Karma score of at least two. As the developers at OpenAI describe it, GPT-2 is "a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization - all without task-specific training." Because the model generates text by sampling from a probability distribution, it returns a different response every time you give it the same prompt.
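To make the "different response every time" point concrete, here is a minimal sketch of what sampling looks like, using the Hugging Face transformers port of GPT-2 rather than OpenAI's original code (the "gpt2" checkpoint name and the example prompt are my assumptions, not from the article):

    # Sketch: GPT-2 assigns a probability to every possible next token and
    # produces text by sampling from that distribution, which is why repeated
    # runs on the same prompt give different continuations.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    prompt = "The scientists were shocked to discover"  # arbitrary example prompt
    input_ids = tokenizer.encode(prompt, return_tensors="pt")

    with torch.no_grad():
        logits = model(input_ids).logits          # shape: (1, seq_len, vocab_size)
    probs = torch.softmax(logits[0, -1], dim=-1)  # distribution over the next token

    # Show the five most likely next tokens and their probabilities.
    top = torch.topk(probs, 5)
    for p, idx in zip(top.values, top.indices):
        print(f"{tokenizer.decode([idx.item()]):>12}  {p.item():.3f}")

    # Sampling from the distribution (instead of always taking the most likely
    # token) is what makes each run come out differently.
    next_token = torch.multinomial(probs, num_samples=1)
    print("sampled next token:", tokenizer.decode(next_token))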

OpenAI decided not to release the full model trained on the 40GB corpus, citing "concerns about malicious applications of the technology," but it did release a smaller 345-million-parameter model, which you can install as a Python program and run from the command line. (The installation instructions are in the repository's DEVELOPERS.md file.) I installed it and was blown away by the human-quality output it produced from my text prompts. Here's an example - I prompted it with the first paragraph of Kafka's The Metamorphosis. And that's just the small 345M model. OpenAI published a story that the full GPT-2 wrote about unicorns, which shows how well the model performs.
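If you'd rather not set up OpenAI's original TensorFlow code, roughly the same command-line experience can be sketched with the transformers port. The snippet below assumes that the "gpt2-medium" checkpoint corresponds to the released 345M-parameter model (the article doesn't name it), and mimics the prompt-and-sample loop of the interactive script in OpenAI's repository:

    # Sketch of an interactive prompt loop in the spirit of OpenAI's
    # interactive sampling script; "gpt2-medium" is assumed to be the
    # ~345M-parameter checkpoint the article describes.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
    model = GPT2LMHeadModel.from_pretrained("gpt2-medium")
    model.eval()

    while True:
        prompt = input("Model prompt >>> ")  # e.g. the opening of The Metamorphosis
        if not prompt:
            break
        input_ids = tokenizer.encode(prompt, return_tensors="pt")
        output = model.generate(
            input_ids,
            do_sample=True,           # sample rather than pick the most likely token
            max_length=200,
            top_k=40,
            pad_token_id=tokenizer.eos_token_id,
        )
        print(tokenizer.decode(output[0], skip_special_tokens=True))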

In this Computerphile video, Rob Miles of the University of Nottingham explains how GPT-2 works.
