'Teach Me' Interview Series on Content Moderation

Thu, 04/15/2021

I’m Neil Rutledge, a Research Associate at the Nebraska Governance and Technology Center. Part of my work is engaging with the brilliant faculty we have here at the center and sharing their work and expertise with other scholars and the public.

As part of the Teach Me series, I sit down with NGTC scholars to ask them what they have been thinking about and engaging with in the arena of law and technology. In this installment of the Teach Me series, First Amendment scholar Professor Kyle Langvardt and I delve into the thornbush that is content moderation on social media platforms. 


There has been an enormous amount of discussion lately about content moderation, and what some people are calling “censorship,” of certain types of content on social media platforms. Just to establish a baseline, what do we mean when we talk about content moderation?

First off, people don’t always use the term “content moderation” in the same way.  But what it usually means is online content governance—governance of online speech, at scale, undertaken by administrators at a private company.


So in that sense, is content moderation something that is new and unique to social media platforms?

No. Media companies have always set rules about what’s fit to print, and they have always had procedures for enforcing those rules.  Newspaper editors enforce the rules that are set down in a newspaper’s editorial guide-sheet.  Broadcasters draw up lists of words they can’t say on the air and things they can’t show, and censors review content in advance before it airs to make sure it doesn’t violate those rules.  This is why, for example, there is a seven-second delay on live events. 

The big difference on an online platform is scale.  Take YouTube, for example.  In May 2019, YouTube stated that 500 hours of video were being uploaded to YouTube every minute.  So given that stat, let’s say you want to screen out inappropriate sexual content.  You can draw some clear lines here about certain types of nudity and so on, but there’s also going to be a lot of gray area.  At an old media organization, you could deal with the gray area by assigning the work to a human censor or editor, and you could hopefully trust them, based on past employment history, to make judgment calls based on accumulated wisdom.  There’s another important difference: no one ever expected a newspaper to run any given letter to the editor, so legacy media could err on the side of caution without really rocking any boats.  That’s not the case with social media platforms, where users often feel they have a right to publish their content on a platform, and sometimes feel their rights have been abridged if a platform decides it doesn’t want a user’s content to appear on its site.


Are there other important factors that set online platforms apart from traditional media?

The decision-making process is more mechanical. Say that you’re Facebook and you want to screen out inappropriate sexual content.  There are too many decisions, obviously, to assign to any small group of editors and censors.  Instead, you’ll want to automate as many of the easy cases as you can, and then assign the harder cases to human beings.  Even after automation, you’re likely to have enough tough cases left over that you will need thousands of human moderators. Facebook has about 15,000 human moderators, and some reports say they average about one determination every 150 seconds.  YouTube and Google have 10,000 moderators; Twitter has 1,500.  So it’s an industrial operation.  The old broadcast censor was an artisan; an online platform’s content moderation operation is a factory, and often an outsourced one.

When you scale up from a small workshop to a factory, standardization becomes super-important.  Rather than trusting a few people to do high-quality work according to an implicit standard, you have to be as explicit and granular as possible in describing how the work should be done.  This kind of standardization is the only way you’re going to keep thousands of laborers turning out a consistent product.  And because the set of rules that govern the whole operation will tend to be pretty elaborate, you will probably want to divide the workforce up into specialty groups assigned to different parts of the job.  It’s the only way to maintain quality control and efficiency simultaneously.

This means a content moderator might be assigned to evaluate a certain type of content – potential hate speech, for example – and to check off the appropriate boxes for each piece of content that comes down the pike.  On a big platform, you need to apply this kind of review to potentially billions of pieces of content, and you want to remove the most damaging stuff as early as possible.  So there are a range of techniques that can be used in combination.
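The triage logic described here can be sketched in a few lines. This is a minimal illustration, not any platform’s actual system: the classifier scores and thresholds are hypothetical, standing in for whatever automated scoring a real pipeline produces.

```python
# Hypothetical confidence scores from an automated classifier
# (0 = clearly benign, 1 = clearly violating).  Thresholds are
# illustrative, not any platform's actual policy.
AUTO_REMOVE = 0.95   # easy case: automated removal
AUTO_ALLOW = 0.05    # easy case: automated approval

def triage(post_id: str, score: float) -> str:
    """Route a post based on classifier confidence."""
    if score >= AUTO_REMOVE:
        return "removed"
    if score <= AUTO_ALLOW:
        return "published"
    return "human_review_queue"   # gray area: escalate to a moderator

# Simulate a small batch of posts.
for pid, score in [("p1", 0.99), ("p2", 0.01), ("p3", 0.60)]:
    print(pid, triage(pid, score))
```

The point of the thresholds is exactly the division of labor described above: the clear cases never reach a person, and the thousands of human moderators only see the residue in the middle.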


Can you give us a few examples of some of those techniques?

Sure. First, there’s a lot of preemptive moderation, which means some content gets intercepted before it even posts.  There are big databases that identify known pieces of terrorist material, copyrighted content, and child sexual abuse material.  If someone gets on a major platform and tries to post a piece of content that’s in the database, the post is probably dead on arrival.  And on some platforms, a human moderator might pre-moderate comments.  Automated and human moderators can work together, too.  For example, an algorithm might detect potentially violent language in a post, and send it to a team of human moderators who screen it before it goes up.
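The database lookup described here can be sketched as a simple hash match. One caveat: real systems of this kind (PhotoDNA, the industry hash-sharing database for terrorist content) use perceptual hashes that survive re-encoding and cropping; the exact cryptographic hash below is a simplification for illustration.

```python
import hashlib

# Hypothetical blocklist of known-bad content, keyed by SHA-256 digest.
# Production systems use perceptual hashes rather than exact ones.
BLOCKED_HASHES = {
    hashlib.sha256(b"bytes of a known prohibited video").hexdigest(),
}

def screen_upload(file_bytes: bytes) -> str:
    """Intercept an upload before it posts if it matches the database."""
    digest = hashlib.sha256(file_bytes).hexdigest()
    return "blocked" if digest in BLOCKED_HASHES else "accepted"

print(screen_upload(b"bytes of a known prohibited video"))  # blocked
print(screen_upload(b"a family vacation video"))            # accepted
```

Because the lookup is a set membership test, it runs in constant time per upload, which is what makes this kind of pre-moderation feasible at the scale of billions of posts.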

Pre-moderation by human beings can really slow a discussion down.  So in many instances a forum or platform will moderate content after the fact.  And there are a couple of ways to go about this.  First, you could have a roving moderation system where moderators review everything at some point after it goes up.  But employing all those moderators can get pretty expensive.  So a lot of platforms rely instead on users to report or “flag” material that should potentially be removed – that is, if the AI moderation system hasn’t already flagged it.  Human moderators then review all the flagged content and decide whether to take action.
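A user-flagging system of the kind described here can be sketched as a counter plus a review queue. The threshold below is a made-up number; real platforms weigh reports by reporter reliability, content type, and other signals.

```python
from collections import defaultdict

FLAG_THRESHOLD = 3  # illustrative: reports needed before human review

flag_counts: dict = defaultdict(int)
review_queue: list = []

def flag(post_id: str) -> None:
    """Record a user report; enqueue the post once reports hit the threshold."""
    flag_counts[post_id] += 1
    if flag_counts[post_id] == FLAG_THRESHOLD:
        review_queue.append(post_id)  # a human moderator decides from here

for _ in range(3):
    flag("post-42")
flag("post-7")

print(review_queue)  # ['post-42']
```

Note the economics this design reflects: the content stays up until reported, so the platform pays for human review only on the small fraction of posts that users object to.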


I take your point that employing all those content moderators could be prohibitively costly. Are there other ways that platforms have found to moderate content that keep costs down?

Absolutely.  One really simple way to do this is through upvoting and downvoting, like on Reddit.  Ideally, the more reliable or trustworthy content will make it to the top—though that’s not always how it works in practice.  Another way that Reddit relies on users to moderate content is by allowing volunteer moderators to set specific rules for individual user communities, known as subreddits.  Different subreddits might have stricter or more lenient rules for profanity or nudity, for example.  This user-driven approach frees up Reddit to focus its moderators on absolute rules that apply across the platform, such as its prohibitions on incitement or doxxing.  Doxxing, by the way, is where you gather a bunch of publicly available information about a user and publish it for intimidation.  This information could include things like a home address, the school someone’s kids attend, that kind of stuff.
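The voting mechanism can be sketched as a simple ranking function. This is a bare-bones illustration: Reddit’s actual “hot” ranking also decays scores by post age, which is omitted here.

```python
# Each post tracks upvotes and downvotes; ranking by net score pushes
# community-endorsed content to the top.  Data below is invented.
posts = {
    "helpful answer": {"up": 120, "down": 4},
    "spam link":      {"up": 2,   "down": 75},
    "decent take":    {"up": 30,  "down": 10},
}

def ranked(votes: dict) -> list:
    """Return post titles sorted by net score, highest first."""
    return sorted(votes, key=lambda p: votes[p]["up"] - votes[p]["down"],
                  reverse=True)

print(ranked(posts))  # ['helpful answer', 'decent take', 'spam link']
```

The appeal for a platform is that this ranking costs nothing to run: the users themselves supply the moderation signal with every vote.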


What options do platforms have when they identify content that they find objectionable?

Platforms have responded in a few different ways. The most common approach is just to remove the content. It’s also pretty common to suspend or terminate the account of the user who posted it.  But there are some subtler options, too.  In some cases, platforms have attached “fact checks” to suspect content, and in others, platforms have allowed content to stand while limiting its distribution.  People can still see it if they seek it out, but it’s not likely to show up in too many people’s news feeds.


So this is what it means to “moderate” content.  One final question, then.  Why don’t we just use the word “censor”?  Isn’t that more direct?

One answer is that the word “censorship” has an ugly connotation, in that people associate it with repressive political regimes. But the word doesn’t always have that connotation. When a TV station bleeps out a swear word, no one shies away from saying that the “network censors” did it.  Nothing dystopian about it. 


So why is it “content moderation” when the same thing happens online?

Here’s a theory: people who study this stuff know that content moderation is not an evil practice.  We need it to keep the internet usable, and often to protect people’s safety.  But what that means is that as a society, we now depend on an industrialized censorship operation the scale of which the world has never seen.  That’s a deeply uncomfortable position to be in – and coping with that kind of discomfort is what euphemisms are for.


Kyle Langvardt is a First Amendment scholar who focuses on the Internet’s implications for free expression both as a matter of constitutional doctrine and as a practical reality. His written work addresses new and confounding policy issues including tech addiction, the collapse of traditional gatekeepers in online media and 3D-printable weapons. Professor Langvardt’s most recent papers appear in the Georgetown Law Journal, the Fordham Law Review and the George Mason Law Review.

