The Most Famous Blunder Of Content Moderation: Do NOT Quote The Princess Bride
We've written stories about people having difficulty recognizing when someone is joking around by quoting a movie. Sometimes it ends ridiculously, like the guy who was arrested for quoting Fight Club and had to spend quite some time convincing people he wasn't actually looking to shoot up an Apple store. We've also talked a lot about the impossibility of doing content moderation well at scale. Here's a story where the two collide (though in a more amusing way).
Kel McClanahan is a well-known lawyer in the national security/FOIA realm, and the other day on Twitter he was lucky enough to have a discussion with Cary Elwes, the actor perhaps best known for his role as Westley in the best movie ever made, The Princess Bride. Kel did what one does after getting to have such a discussion, which was to celebrate it on Twitter.
Tonight I found myself unexpectedly in a thread with both @DevinCow and @Cary_Elwes and it's about to give me an aneurysm because I'm having to stifle my normal 1:4 ratio of Princess Bride references to normal sentences because I know he's heard them all.
- National Security Counselors (@NatlSecCnslrs) September 20, 2022
As one does once the discussion turns to The Princess Bride, people start quoting the movie, which remains one of the most quotable movies of all time.
At one point in the ensuing conversation, Kel had a chance to trot out one of those quotable lines:
I'll most likely kill you in the morning.
- National Security Counselors (@NatlSecCnslrs) September 20, 2022
That's Kel saying "I'll most likely kill you in the morning," the classic line (SPOILERS!) that Westley says was told to him each night by the Dread Pirate Roberts.
Take a guess what happened next? Yup.

That's Twitter telling Kel that his tweet, quoting The Princess Bride, violated its policies on "abuse and harassment" and asking him to delete it to get back into his account. Eventually Twitter reversed course and gave Kel his account back.
It's easy to laugh this off (because, well, it is funny). But, it's also a useful lesson in the impossibility of content moderation. In general, absent any context, "I'll most likely kill you in the morning" sure could come off as a threatening statement, one that could be seen as abusive or harassing. In many scenarios, that statement would be abusive or harassing and would make users on a social media platform feel unwelcome and threatened.
But, in context, it's quite clear that this is a joke, a quote from a funny movie.
The issue is that so much of content moderation involves context. This is something that critics of content moderation (both those who want more and those who want less) never seem to fully grasp. How does a content moderator (whether AI or human) have enough context to handle all sorts of issues like this? Do you need to train your AI on classic movies? Do you need to make sure that everyone you hire has seen every popular movie and knows them by heart and can recognize when someone is quoting them?
How do you deal with a situation where someone tries to hide behind the quote - but is actually threatening someone? (Not what Kel did here, but just noting, you can't just say "okay, leave this line if it's quoting a movie.")
The point is that it's ridiculously complicated.
Many people - especially policymakers and the media - seem to think that content moderation is obvious. You take down the bad stuff, and you leave up the good stuff.
But a ridiculous amount of content moderation involves trying to interpret statements where you don't (or, more often, can't) know the actual context. Is the comment between friends joking around? Is the comment made to be threatening? Is there a deeper meaning behind it? Is it quoting a movie? What if it's an inside joke between people?
These things are not easy because there is no easy answer.
And that includes "do nothing." Because if you do nothing at all you end up in a world in which the world's worst people embrace that to legitimately threaten and harass people.
This is why I keep saying content moderation is impossible to do well. It's not that people aren't trying hard enough. It's that it's literally impossible to fully understand all the context. And it's silly to expect that anyone can.
I asked Kel if he had any thoughts on all this, and here's his take on the whole thing:
I'm bemused that Twitter's AI suspended me for a comment that the purported target of the "threat" liked and responded to, but I guess if I'm going to go to Twitter Jail for posting a movie quote, the best way to do so is with the OG movie star in the thread.