Facebook’s internal “hate speech” guidelines appear to leave protected groups unprotected (June 2017)

Facebook’s global enforcement against hate speech results in seemingly harmful outcomes for potentially vulnerable groups.


Facebook has struggled for years to moderate “hate speech,” drawing steady criticism not only from users but also from government officials around the world. Part of this struggle stems from the term “hate speech” itself, which is often vaguely defined. Definitions vary from country to country, adding to the confusion and general difficulty of moderating user content.


Facebook’s application of local laws to moderate “hate speech” has resulted in collateral damage and the silencing of voices that such laws are meant to protect. In the United States, there is no law against “hate speech,” but Facebook is still trying to limit the amount of abusive content on its site as advertisers flee and politicians continue to apply pressure.

Facebook moderators use a set of internal guidelines to determine what is or isn’t hate speech. Unfortunately for many users, the guidelines — which they never saw before ProPublica published them — result in some unexpected moderation decisions.

Users wondered why hate speech targeting Black children was allowed while similar speech targeting, for instance, white men wasn’t. The internal guidelines spelled out the factors moderators consider, and those factors led directly to these seemingly inexplicable moderation decisions.

According to Facebook’s internal guidelines, the following categories are “protected,” which means moderators will remove “hateful” content targeting anything on this list:

  • Sex
  • Race
  • Religious affiliation
  • Ethnicity
  • National origin
  • Sexual orientation
  • Gender identity
  • Serious disability/disease

And this is the list of categories not considered “protected” by Facebook:

  • Social class
  • Occupation
  • Continental origin
  • Political ideology
  • Appearance
  • Religions
  • Age
  • Countries

Critics pointed out that the internal standards seemed to lead directly to harassment of supposedly protected groups (Black children) while shielding groups historically viewed, at least in the United States, as not needing additional protection (white men).

This seemingly incongruous outcome stems from how moderators apply the rules. If a “protected” category is modified by an “unprotected” one (“Black” [race/protected] + “children” [age/unprotected]), the resulting combination is treated as “unprotected.” In the case of white men, both categories are protected: race + sex. What looks like the shielding of an already well-protected group (white men) is actually just the proper application of Facebook’s internal moderation guidelines.
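
The combination rule described above can be illustrated with a minimal sketch. The category names are taken from the two lists in this case study; the function name, the data structures, and the handling of empty or unrecognized input are hypothetical and are not drawn from Facebook's guidelines or any actual moderation tooling.

```python
# Illustrative sketch only: category names come from the lists in this
# case study; everything else here is an assumption made for the example.
PROTECTED = frozenset({
    "sex", "race", "religious affiliation", "ethnicity", "national origin",
    "sexual orientation", "gender identity", "serious disability/disease",
})

UNPROTECTED = frozenset({
    "social class", "occupation", "continental origin", "political ideology",
    "appearance", "religions", "age", "countries",
})


def is_protected(categories):
    """Return True only if every category describing the target is protected.

    A single unprotected modifier makes the whole combination unprotected.
    """
    normalized = [c.lower() for c in categories]
    if not normalized:
        # Assumption: a target with no listed category gets no protection.
        return False
    unknown = [c for c in normalized if c not in PROTECTED | UNPROTECTED]
    if unknown:
        # Assumption: refuse to guess about categories on neither list.
        raise ValueError(f"Unrecognized categories: {unknown}")
    return all(c in PROTECTED for c in normalized)


# "White men": race + sex, both protected.
print(is_protected(["race", "sex"]))  # True
# "Black children": race (protected) + age (unprotected).
print(is_protected(["race", "age"]))  # False
```

Under this reading, adding any attribute from the second list to a group description is enough to remove its protection, which is the behavior critics objected to.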

In response to criticism about outcomes like these, Facebook pointed out that it operates globally. A moderation decision that might seem ridiculous (or even harmful) in the United States makes more sense in parts of the world where white men might not make up a large percentage of the population or might not have historically held many positions of power.

Decisions to be made by Facebook:

  • Should content be removed if it conveys hateful rhetoric against certain groups or individuals even if it doesn’t specifically violate the internal guidelines?
  • Should context be considered when moderating posts that violate the internal guidelines to ensure users who are spreading awareness/criticizing other users’ hateful speech aren’t subjected to the same moderation efforts or account limitations?
  • Which first principles should Facebook be operating on when creating anti-hate policies, and are these policies holding up those principles in practice?

Questions and policy implications to consider:

  • When moderating hate speech, should more discretion be used by moderators to ensure better protection of marginalized groups?
  • Would altering or expanding the scope of the internal guidelines result in users switching to other social media services?
  • Do seemingly inconsistent internal rules (e.g., moderation that protects white men while leaving Black children open to abuse) confuse users and/or result in a loss of advertising revenue?


Facebook moderators continue to use lists like these to make decisions about perceived “hate speech.” The company continues to consider all stakeholders, including foreign governments that have passed “hate speech” laws reaching beyond what the site’s internal guidelines already target for removal.

Written by The Copia Institute, August 2020
