AIs can generate near-verbatim copies of novels from training data

by Melissa Heikkilä, Financial Times (via Ars Technica)

The world's top AI models can be prompted to generate near-verbatim copies of bestselling novels, raising fresh questions about the industry's claim that its systems do not store copyrighted works.

A series of recent studies has shown that large language models from OpenAI, Google, Meta, Anthropic, and xAI memorize far more of their training data than previously thought.

AI and legal experts told the FT this "memorization" ability could have serious ramifications for AI groups' battle against dozens of copyright lawsuits around the world, as it undermines their core defense that LLMs "learn" from copyrighted works but do not store copies.
