Nvidia sued over AI training data as copyright clashes continue

Ashley Belanger

from Ars Technica - All content on 2024-03-11 16:35 (#6K8RA)


Book authors are suing Nvidia, alleging that the chipmaker's AI platform NeMo, used to power customized chatbots, was trained on a controversial dataset that illegally copied and distributed their books without their consent.

In a proposed class action, novelists Abdi Nazemian (Like a Love Story), Brian Keene (Ghost Walk), and Stewart O'Nan (Last Night at the Lobster) argued that Nvidia should pay damages and destroy all copies of the Books3 dataset used to power NeMo large language models (LLMs).

The Books3 dataset, the novelists argued, copied "all of Bibliotik," a shadow library of approximately 196,640 pirated books. Initially shared through the AI community Hugging Face, the Books3 dataset today "is defunct and no longer accessible due to reported copyright infringement," the Hugging Face website says.

