Nvidia Denies Pirate e-Book Sites Are 'Shadow Libraries' To Shut Down Lawsuit
An anonymous reader quotes a report from Ars Technica: Some of the most infamous so-called shadow libraries have increasingly faced legal pressure to either stop pirating books or risk being shut down or driven to the dark web. Among the biggest targets are Z-Library, which the US Department of Justice has charged with criminal copyright infringement, and Library Genesis (Libgen), which was sued by textbook publishers last fall for allegedly distributing digital copies of copyrighted works "on a massive scale in willful violation" of copyright laws. But now these shadow libraries and others accused of spurning copyrights have seemingly found an unlikely defender in Nvidia, the AI chipmaker among those profiting most from the recent AI boom. Nvidia seemed to defend the shadow libraries as a valid source of information online when responding to a lawsuit from book authors over the list of data repositories that were scraped to create the Books3 dataset used to train Nvidia's AI platform NeMo. That list includes some of the most "notorious" shadow libraries -- Bibliotik, Z-Library (Z-Lib), Libgen, Sci-Hub, and Anna's Archive, authors argued. However, Nvidia hopes to invalidate authors' copyright claims partly by denying that any of these controversial websites should even be considered shadow libraries. "Nvidia denies the characterization of the listed data repositories as 'shadow libraries' and denies that hosting data in or distributing data from the data repositories necessarily violates the US Copyright Act," Nvidia's court filing said. The chipmaker did not go into further detail to define what counts as a shadow library or what potentially absolves these controversial sites from key copyright concerns raised by various ongoing lawsuits. Instead, Nvidia kept its response brief while also curtly disputing authors' petition for class-action status and defending its AI training methods as fair use. 
"Nvidia denies that it has improperly used or copied the alleged works," the court filing said, arguing that "training is a highly transformative process that may include adjusting numerical parameters including 'weights,' and that outputs of an LLM may be based, at least in part, on such 'weights.'" "Nvidia's argument likely depends on the court agreeing that AI models ingesting published works in order to transform those works into weights governing AI outputs is fair use," notes Ars. "However, authors have argued that 'these weights are entirely and uniquely derived from the protected expression in the training dataset' that has been copied without getting authors' consent or providing authors with compensation." "Authors suing Nvidia have taken the next step, linking the chipmaker to shadow libraries by arguing that 'these shadow libraries have long been of interest to the AI-training community because they host and distribute vast quantities of unlicensed copyrighted material. For that reason, these shadow libraries also violate the US Copyright Act.'"