AI Companies Are Finally Being Forced To Cough Up For Training Data
Arthur T Knackerbracket has processed the following story:
The music industry's lawsuit sends the loudest message yet: High-quality training data is not free.
The generative AI boom is built on scale. The more training data, the more powerful the model.
But there's a problem. AI companies have pillaged the internet for training data, and many websites and data set owners have started restricting the ability to scrape their websites. We've also seen a backlash against the AI sector's practice of indiscriminately scraping online data, in the form of users opting out of making their data available for training and lawsuits from artists, writers, and the New York Times, claiming that AI companies have taken their intellectual property without consent or compensation.
Last week three major record labels (Sony Music, Warner Music Group, and Universal Music Group) announced they were suing the AI music companies Suno and Udio over alleged copyright infringement. The music labels claim the companies made use of copyrighted music in their training data "at an almost unimaginable scale," allowing the AI models to generate songs that imitate the qualities of genuine human sound recordings.
But this moment also sets an interesting precedent for all of generative AI development. Thanks to the scarcity of high-quality data and the immense pressure and demand to build even bigger and better models, we're in a rare moment where data owners actually have some leverage. The music industry's lawsuit sends the loudest message yet: High-quality training data is not free.
It will likely take a few years at least before we have legal clarity around copyright law, fair use, and AI training data. But the cases are already ushering in changes. OpenAI has been striking deals with news publishers such as Politico, the Atlantic, Time, the Financial Times, and others, and exchanging publishers' news archives for money and citations. And YouTube announced in late June that it will offer licensing deals to top record labels in exchange for music for training.