NY Times Considering A Potentially Very Dumb Lawsuit Against OpenAI Because It Learned From NY Times Content

Mike Masnick

from Techdirt on 2023-08-17 20:36 (#6DYE8)

A few weeks ago, the NY Times published a very nice profile piece about me, which starts off with the story of how I recently got pulled into a group chat with a bunch of Hollywood writers, directors, and actors, who were trying to understand how to deal with the rise of generative AI tools. The article recounted how my basic message was that most of the legal routes they were considering weren't likely to be all that effective - especially thinking copyright will save them - but noting that they should be looking for ways to embrace the AI and do more with it themselves.

The NY Times itself appears to be going in the other direction. According to Bobby Allyn at NPR, the NY Times is considering legal action against OpenAI, claiming that training its models on NY Times content violated the NY Times' copyright.

Lawyers for the newspaper are exploring whether to sue OpenAI to protect the intellectual property rights associated with its reporting, according to two people with direct knowledge of the discussions.
For weeks, The Times and the maker of ChatGPT have been locked in tense negotiations over reaching a licensing deal in which OpenAI would pay The Times for incorporating its stories in the tech company's AI tools, but the discussions have become so contentious that the paper is now considering legal action.

This seems like complete nonsense. We've already highlighted how the batch of existing lawsuits in which copyright holders try to sue LLMs for training off their data are likely to fail. But this lawsuit in particular sounds wildly stupid:

A top concern for The Times is that ChatGPT is, in a sense, becoming a direct competitor with the paper by creating text that answers questions based on the original reporting and writing of the paper's staff.

Lol, wut? I mean, the NY Times is considered the top newspaper in the whole damn world, despite tons of competitors, and now it's scared of a bot that is famous for mid-level prose and making shit up? None of that makes sense.

If, when someone searches online, they are served a paragraph-long answer from an AI tool that refashions reporting from The Times, the need to visit the publisher's website is greatly diminished, said one person involved in the talks.

Again, that makes no sense. There are plenty of services out there that already summarize NYT articles and that doesn't violate copyright, because summarizing reporting is clearly fair use. There's no real "hot news" doctrine any more.

And, more to the point, if the NY Times is really that scared of ChatGPT, then it seems the NYT's lawyers and execs don't think too highly of all those reporters it has on staff.

Elsewhere, the Verge reports that the NY Times changed its terms to "ban" AI tools from training on its articles:

... the NYT updated its Terms of Service on August 3rd to prohibit its content - inclusive of text, photographs, images, audio/video clips, "look and feel," metadata, or compilations - from being used in the development of "any software program, including, but not limited to, training a machine learning or artificial intelligence (AI) system."

Though, it really sounds like this is more of the NY Times trying to set a trap for OpenAI so it has something to sue over, because the Verge also notes the following:

Despite introducing the new rules to its policy, the publication doesn't appear to have made any changes to its robots.txt - the file that informs search engine crawlers which URLs can be accessed.

OpenAI respects robots.txt. If you truly don't want your content scanned, you put a notation in robots.txt, which takes about 10 seconds tops. If, however, you want to lay a trap so that you can sue OpenAI, then you quietly change your terms of service, but do nothing to mitigate the "problem" of OpenAI scraping, even though you have all the power in your hands.
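
For a sense of how small that robots.txt notation is, here is a rough sketch using Python's standard urllib.robotparser. It assumes the crawler identifies itself with the user agent "GPTBot" (the name OpenAI has published for its crawler; the article itself doesn't name it). The robots.txt contents, the example.com URL, and the "SomeSearchBot" agent are made up for illustration and are not taken from the NY Times' actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (not the NY Times' real file).
# "GPTBot" is assumed to be the user agent OpenAI's crawler announces.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URL only.
    url = "https://www.example.com/2023/08/17/some-article.html"
    print("GPTBot allowed:", is_allowed(ROBOTS_TXT, "GPTBot", url))
    print("SomeSearchBot allowed:", is_allowed(ROBOTS_TXT, "SomeSearchBot", url))
```

The two-line GPTBot rule is the whole opt-out; the rest of the script just checks that a well-behaved crawler reading the file would stay away while other crawlers are still let in.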

There's another thing that happened recently in this space, as highlighted by Semafor: the NY Times recently dropped out of a coalition of news orgs trying to demand cash from AI companies.

The New York Times has decided not to join a group of media companies attempting to jointly negotiate with the major tech companies over use of their content to power artificial intelligence.

Again, all of this seems very, very silly. If you don't want AI to train on what you publish, use robots.txt. But AI training on content on the web should never be considered copyright infringing. Again, scanning the web has to be fair use, otherwise we no longer have search engines or a variety of other important tools that all rely on scanning.

I get that legacy news orgs have had a rough time embracing new technology and keep trying to use the law to beat back the tide. But, sooner or later you have to realize that this is just the wrong way to go about everything.
