Article 6P8G1 Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos To Train AI

Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos To Train AI

by
msmash
from Slashdot on (#6P8G1)
AI companies are generally secretive about their sources of training data, but an investigation by Proof News found some of the wealthiest AI companies in the world have used material from thousands of YouTube videos to train AI. Companies did so despite YouTube's rules against harvesting materials from the platform without permission. From a report: Our investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce. The dataset, called YouTube Subtitles, contains video transcripts from educational and online learning channels like Khan Academy, MIT, and Harvard. The Wall Street Journal, NPR, and the BBC also had their videos used to train AI, as did The Late Show With Stephen Colbert, Last Week Tonight With John Oliver, and Jimmy Kimmel Live. Proof News also found material from YouTube megastars, including MrBeast (289 million subscribers, two videos taken for training), Marques Brownlee (19 million subscribers, seven videos taken), Jacksepticeye (nearly 31 million subscribers, 377 videos taken), and PewDiePie (111 million subscribers, 337 videos taken). Some of the material used to train AI also promoted conspiracies such as the "flat-earth theory." Further reading: YouTube Says OpenAI Training Sora With Its Videos Would Break Rules.

twitter_icon_large.pngfacebook_icon_large.png

Read more of this story at Slashdot.

External Content
Source RSS or Atom Feed
Feed Location https://rss.slashdot.org/Slashdot/slashdotMain
Feed Title Slashdot
Feed Link https://slashdot.org/
Feed Copyright Copyright Slashdot Media. All Rights Reserved.
Reply 0 comments