Nvidia sued over AI training data as copyright clashes continue

By Ashley Belanger, Ars Technica
Image credit: Yurii Klymko | iStock / Getty Images Plus

Book authors are suing Nvidia, alleging that the chipmaker's AI platform NeMo, which is used to power customized chatbots, was trained on a controversial dataset that illegally copied and distributed their books without their consent.

In a proposed class action, novelists Abdi Nazemian (Like a Love Story), Brian Keene (Ghost Walk), and Stewart O'Nan (Last Night at the Lobster) argued that Nvidia should pay damages and destroy all copies of the Books3 dataset used to power NeMo large language models (LLMs).

The Books3 dataset, novelists argued, copied "all of Bibliotek," a shadow library of approximately 196,640 pirated books. Initially shared through the AI community Hugging Face, the Books3 dataset today "is defunct and no longer accessible due to reported copyright infringement," the Hugging Face website says.
