Facebook's spam filter blocked the most popular articles about its 50m user breach
When news broke yesterday that Facebook had suffered a breach affecting at least 50,000,000 users, Facebook users (understandably) began to widely share links to articles about the breach.
The articles were so widely and quickly shared that they triggered Facebook's spam filters, which blocked the most popular stories about the breach, including an AP story and a Guardian story.
There's no reason to think that Facebook intentionally suppressed embarrassing news about its own business. Rather, this is a cautionary tale about the consequences of content filtering on big platforms.
Facebook's spam filter is concerned primarily with stopping spam, not with allowing through storm-of-the-century breaking news headlines that everyone wants to share. On a daily basis, Facebook gets millions of spam posts and (statistically) zero stories so salient that every Facebook user shares them at once. Any sanity check that let through things that appeared to be breaking news would be a crack in Facebook's spam defenses, and that crack would admit far more spam than legitimate everywhere-at-once stories, because those stories almost never occur, while spam happens every second of every minute of every hour of every day.
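To make the base-rate argument concrete, here is a rough back-of-the-envelope sketch. Every number in it is a made-up assumption for illustration (the daily spam volume, how often a genuine everywhere-at-once story occurs, and the fraction of spam that would manage to slip through a hypothetical "looks like breaking news" exception); none of them come from Facebook.

```python
def exception_outcome(spam_per_day, genuine_per_day, spam_leak_rate):
    """Estimate what a hypothetical breaking-news exception lets through.

    spam_per_day:    assumed number of spam posts per day
    genuine_per_day: assumed number of genuine everywhere-at-once stories per day
    spam_leak_rate:  assumed fraction of spam that fools the exception
    Returns (leaked spam per day, share of admitted items that are spam).
    """
    leaked_spam = spam_per_day * spam_leak_rate
    total_admitted = leaked_spam + genuine_per_day
    spam_share = leaked_spam / total_admitted if total_admitted else 0.0
    return leaked_spam, spam_share


# Illustrative assumptions only: 5 million spam posts a day, one genuine
# storm-of-the-century story a year, and an exception that only 0.1% of
# spam manages to exploit.
leaked, share = exception_outcome(5_000_000, 1 / 365, 0.001)
print(f"spam admitted per day: {leaked:,.0f}")
print(f"share of admitted items that are spam: {share:.6f}")
```

Even with an exception that spammers can only rarely exploit, nearly everything it admits is spam, simply because the genuine stories are so rare.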
And yet, storm-of-the-century stories are incredibly important (by definition) and losing our ability to discuss them -- or having that ability compromised by having to wait hours for Facebook to discover, diagnose and repair the problem -- is a very high price to pay.
It's a problem with the same underlying mechanics as the incident in which a man was sent an image of his mother's grave decorated with dancing cartoon characters and party balloons on the anniversary of her funeral. Facebook sends you these annual reminders a year after you post an image that attracts a lot of "likes," and images that attract a lot of likes are far more likely to be happy news than they are to be your mother's tombstone. You only bury your mother once, while you celebrate personal victories repeatedly.
So cartoon characters on your mother's grave are a corner case, an outlier, just like a spam filter suppressing a story about a breach of 50,000,000 Facebook accounts. But these are incredibly important outliers, ones that the system should never, ever get wrong.
It may not ever be possible to design a system with two billion users that doesn't involve these kinds of outliers: if each user does just one thing a day, a one-in-a-billion outlier in a system with two billion users will happen twice a day, on average. We don't really know how to design a system that can address the majority of cases and also every one-in-a-billion corner case.
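The arithmetic behind that "twice a day" figure is simple enough to sketch. The function name and the events-per-user parameter are assumptions added here for illustration; the two-billion and one-in-a-billion figures are the ones used above.

```python
def expected_daily_occurrences(users, events_per_user_per_day, one_in_n):
    """Expected number of one-in-N outliers per day across all users."""
    return users * events_per_user_per_day / one_in_n


# Two billion users, one event each per day, one-in-a-billion odds.
print(expected_daily_occurrences(2_000_000_000, 1, 1_000_000_000))  # 2.0
```

If users average more than one post, share or reminder a day, the expected count rises in proportion.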
But the answer shouldn't be to shrug our shoulders and give up. If it's impossible to run a system for two billion users without committing grave, unforgivable sins on a daily basis, then we shouldn't have systems with two billion users.
Unfortunately, the rising chorus of calls for the platforms to filter their users is trapped in the idea that the platforms can fix their problems -- not that the platforms are the problems. Filtering for harassment will inevitably end up filtering out many discussions of harassment itself, in which survivors of harassment are telling their stories and getting support. The same goes for filtering for copyright infringement, libel, "extremist content" and other "bad speech" (including a lot of speech that I personally find distasteful and never want to see in my own online sessions).
It's totally true that filtering doesn't scale up to billion-user platforms -- which isn't to say that we should abandon our attempts to have civil and civilized online discussions, but that the problem may never be solved until we cut the platforms down to manageable scales.
When going to share the story to their news feed, some users, including members of the staff here at TechCrunch who were able to replicate the bug, were met with an error message that prevented them from sharing the story.
According to the message, Facebook is flagging the stories as spam due to how widely they are being shared or, as the message puts it, the system's observation that "a lot of people are posting the same content."
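Facebook's actual filter is not public, but a rule of the kind the error message describes can be sketched as a simple sliding-window counter. Everything here is hypothetical: the class name, the threshold, the window length and the example URL are illustrative assumptions, not anything Facebook has documented.

```python
from collections import defaultdict, deque


class VelocityFilter:
    """Hypothetical rule: block a link once too many people post it too quickly."""

    def __init__(self, max_shares, window_seconds):
        self.max_shares = max_shares          # assumed threshold
        self.window_seconds = window_seconds  # assumed window length
        self._recent = defaultdict(deque)     # url -> timestamps of recent shares

    def allow(self, url, timestamp):
        recent = self._recent[url]
        # Forget shares that have fallen out of the window.
        while recent and timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        recent.append(timestamp)
        # The rule sees only volume, not whether the link is spam or news.
        return len(recent) <= self.max_shares


# Five people share the same story within five seconds; the limit is three a minute.
news_filter = VelocityFilter(max_shares=3, window_seconds=60)
results = [news_filter.allow("https://example.com/breach-story", t) for t in range(5)]
print(results)  # [True, True, True, False, False]
```

A rule like this has no way to tell a flood of spam from a flood of people sharing a genuinely important story, which is the failure described above.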
Facebook blocked users from posting some stories about its security breach [Taylor Hatmaker/Techcrunch]