
Facebook should crush fake news the way Google crushed spammy content farms

Facebook CEO Mark Zuckerberg
Photo by David Ramos/Getty Images

At this point, almost everyone acknowledges that fake news was a real problem in the 2016 election, and that Facebook ought to be doing something to combat the problem. Even Facebook itself — after a few weeks of downplaying the issue — now acknowledges that it has a responsibility for the stories distributed on the site.

“We had resisted having standards about whether something’s newsworthy because we did not consider ourselves a service that was predominantly for the distribution of news,” Facebook spokesperson Elliot Schrage said at a recent conference. “That was wrong.”

The big question is how Facebook should tackle the problem. It won’t be easy. Facebook has more than a billion users speaking dozens of languages in countries all over the world. Distinguishing fake news from real news — or low-quality news from high-quality reporting — is difficult. That’s especially true in a political environment where any misstep will be held up as evidence that Facebook has a partisan ax to grind.

In a recent blog post, publisher and technology visionary Tim O’Reilly makes a good suggestion: Facebook should study Google’s own experience trying to improve the quality of search results.

Six years ago, Google faced a problem a lot like the one Facebook faces today: The web was being flooded with “webspam,” web pages with little useful content that were created solely to manipulate Google’s algorithm in order to generate traffic and ad revenue. Google’s successful response to this crisis tells us a lot about how Facebook can deal with today’s fake news epidemic.

Google used both human judgment and software to fight webspam

Photo by David Paul Morris/Getty Images

During the 2000s, people got better and better at gaming Google’s search algorithm. Some were running quasi-media companies whose writers churned out dozens of extremely short, poorly researched articles based on popular search terms. Others just straight up copied content created by other sites and slapped ads on it. These articles cluttered up search results and frustrated users, but they generated ad revenue for their owners, so the problem kept growing.

In a January 2011 blog post, Google search quality czar Matt Cutts acknowledged that Google had a big problem with these “content farms.”

“We hear the feedback from the web loud and clear: people are asking for even stronger action on content farms and sites that consist primarily of spammy or low-quality content,” he wrote.

Later that year, Google brought down the hammer, releasing changes to its search algorithm that caused traffic at major content farms to plummet.

In one sense, this was just the latest of many changes to Google’s algorithm. But it wasn’t just another technical tweak — it represented Google making a deliberate value judgment that some kinds of content were worse than other kinds.

Early versions of Google took a naively data-driven approach, assuming that a link from one site to another was a sign of quality. But as Google grew, people put more and more effort into gaming the system. To prevent search results from being overwhelmed by garbage, Google had to make a deliberate effort to fight against people who tried to cheat the system. Ultimately, it made a value judgment that low-quality content farms were bad content, and deliberately changed its algorithms to demote them in search results.

Facebook now faces a similar situation. When the site first started allowing users to share news stories, it didn’t need to worry too much about whether some news stories were made up, because back then there weren’t any fake news sites in the sense that we use that term today.

There have always been shoddy journalism and accidental mistakes in news articles, of course. But until Facebook’s newsfeed came along, there was no real incentive for someone to create sites full of false but clickable stories, since these sites would get very little traffic. Facebook’s growing popularity created a market opportunity for fake news sites just as Google’s growing popularity created a market opportunity for content farms.

So Facebook faces the same kind of choice Google did six years ago. It has to decide whether to deliberately intervene to demote low-quality content (content that, in many cases, Facebook unintentionally called into existence) or whether to allow the newsfeed to continue to be a free-for-all that’s increasingly dominated by the 2016 equivalent of content farms.

Facebook doesn’t have to choose between human editors and algorithms

Matt Cutts, Google’s search quality czar until 2014.
Reuben Yau

O’Reilly argues that the key lesson to draw from Google’s experience fighting webspam is that technology companies should make these kinds of decisions using algorithms, not manual human judgment. He praises Google for using metadata — characteristics of a page like how many other sites link to it — instead of human beings to figure out whether a particular page is high-quality or not.

But that’s actually a misleading description of Google’s strategy. It’s obviously true that it wouldn’t be practical for Google to have a human being look at every single page on the web. But Google actually uses both human and software review in its fight against webspam.

On its webspam page, Google says that “Google's algorithms can detect the vast majority of spam and demote it automatically. For the rest, we have teams who manually review sites.”

When a Google reviewer manually marks a site as spammy, a message is sent to the site’s owner explaining the reason. Webmasters have the opportunity to appeal these manual decisions.

Google includes human reviewers in the mix because algorithms inevitably make mistakes and manual human review is needed to keep the algorithms on the right track. Previously reviewed pages can be fed back into Google’s software, allowing the algorithms to learn from human judgment and get better over time.

So Facebook doesn’t have to choose between fighting fake news with algorithms or human editors. An effective fight against fake news is going to require heavy use of both approaches.

Facebook needs a team of people constantly reviewing frequently shared articles and rating them for quality. An article’s rating can not only help determine how widely that particular article circulates on Facebook but can also provide the raw material that helps Facebook tune its algorithms to more effectively detect fake news and other problematic content in the future.
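To make that loop concrete, here is a minimal sketch in Python of how human ratings could feed both distribution and detection. Everything in it is an assumption made for illustration: the `Article` fields, the `DomainQualityModel` class (a toy stand-in that simply averages reviewer ratings per domain), the review budget, and the reviewer verdicts are all hypothetical and do not describe how Facebook or Google actually work.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Article:
    url: str
    domain: str
    shares: int
    # Hypothetical reviewer rating: 0.0 (fake) to 1.0 (reliable); None if unreviewed.
    human_rating: Optional[float] = None


class DomainQualityModel:
    """Toy stand-in for a learned classifier: averages human ratings per domain."""

    def __init__(self, default_score: float = 0.5):
        self.default_score = default_score
        self._totals = defaultdict(float)
        self._counts = defaultdict(int)

    def learn(self, article: Article) -> None:
        # Only human-reviewed articles are used as training data.
        if article.human_rating is not None:
            self._totals[article.domain] += article.human_rating
            self._counts[article.domain] += 1

    def predict(self, article: Article) -> float:
        count = self._counts.get(article.domain, 0)
        if count == 0:
            return self.default_score  # no reviews yet for this domain
        return self._totals[article.domain] / count


def review_queue(articles: List[Article], budget: int) -> List[Article]:
    """Send the most-shared unreviewed articles to human reviewers."""
    unreviewed = [a for a in articles if a.human_rating is None]
    return sorted(unreviewed, key=lambda a: a.shares, reverse=True)[:budget]


def quality_score(article: Article, model: DomainQualityModel) -> float:
    """A human rating, when present, overrides the model's estimate."""
    if article.human_rating is not None:
        return article.human_rating
    return model.predict(article)


def rank_feed(articles: List[Article], model: DomainQualityModel) -> List[Article]:
    """Weight share counts by quality so low-rated stories circulate less."""
    return sorted(
        articles,
        key=lambda a: a.shares * quality_score(a, model),
        reverse=True,
    )


articles = [
    Article("a.example/1", "a.example", shares=9000),
    Article("a.example/2", "a.example", shares=500),
    Article("b.example/1", "b.example", shares=4000),
]
model = DomainQualityModel()

for article in review_queue(articles, budget=2):
    # Hypothetical reviewer verdicts, hard-coded for the example.
    article.human_rating = 0.1 if article.domain == "a.example" else 0.9
    model.learn(article)

ranked = [a.url for a in rank_feed(articles, model)]
print(ranked)
```

In this sketch the reviewers only look at the two most-shared stories, yet the unreviewed story from the same low-rated domain is demoted as well, because the model has learned from the human verdict. That is the division of labor described above: people supply judgment on a small sample, and software extends it to everything else.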

A final lesson from the Google experience is that personnel decisions are important both operationally and for purposes of public relations. For a decade — until he took a sabbatical in 2014 — Matt Cutts was the public face of Google’s search quality efforts. He wrote blog posts that helped people across the web understand how to stay on the right side of Google’s anti-spam rules. He commented on Google’s policies in the media. And he generally helped reassure the public that Google took search quality seriously.

Facebook needs a Matt Cutts for news quality — someone who can not only lead Facebook’s fight against clearly fake news but also develop a strategy for shifting the newsfeed toward higher-quality news sources more generally. Ideally, this person would also be in close communication with the media industry, helping news organizations understand the rules of the newsfeed and how to ensure that their articles aren’t misclassified as fake or low-quality by Facebook’s algorithms.

Because this position inherently involves making editorial judgments — even more so than a search quality job at Google — it would be good for this person to have some experience in senior editorial jobs at a conventional media organization. Media experience would also allow Facebook’s news quality czar to be a more credible messenger to the news business.

And Facebook’s news quality czar needs to have a staff that includes both journalists to manually review some stories and programmers to write software to automatically detect fake and spammy news articles. Algorithms alone will be too easy to fool, while human review alone will never scale enough to process the vast number of articles Facebook users share every day.

Disclosure: My brother works at Google.