Reddit Bars Internet Archive From its Website, Sparking Access Concerns

hubie

from SoylentNews on 2025-08-14 08:22 (#6ZAE0)

The communication platform cited suspicions that AI companies were using the archiving site for AI training:

Reddit has announced that it will severely limit the access of the Internet Archive's Wayback Machine to the communication platform, following its accusation that AI companies have been scraping the archiving site for Reddit data. The platform will allow the Internet Archive to save only the home page of its website.
The limits on the Internet Archive's access were set to start "ramping up" on Monday, according to the Verge. Reddit apparently did not name any of the AI companies involved in these website data scrapes.
[...] Some Reddit users pointed out that this move is a far cry from Reddit co-founder Aaron Swartz's philosophy. Swartz committed suicide in the weeks before he was set to stand trial for allegedly breaking into an MIT closet to download the paid JSTOR archive, which hosts thousands of academic journals. He was committed to making online content free for the public.
[...] [Reddit spokesman Tim] Rathschmidt emphasized that the change was made in order to protect users: "Until they're able to defend their site and comply with platform policies (e.g., respecting user privacy, re: deleting removed content), we're limiting some of their access to Reddit data to protect redditors," he told Return.
However, it has been speculated that this more aggressive move was financially motivated, given that the platform has struck deals with some AI companies in the past but sued others for not paying its fees. Reddit announced a partnership with OpenAI in May 2024 but sued Anthropic in June of this year for not complying with its demands.

Original Submission
