Facebook’s parent company declined to answer our questions about how it moderates content in VR, so we created a test Horizon World filled with content banned from Facebook and Instagram. Content moderators said the world was fine — until we told Meta’s PR team about it.
Facebook said it would be different this time.
Announcing the company’s rebranding to Meta, CEO Mark Zuckerberg promised that the virtual worlds it believes to be the future of the internet would be protected from the malignancies that have plagued Facebook. “Privacy and safety need to be built into the metaverse from Day 1,” he said. “This is about designing for safety and privacy and inclusion before the products even exist.”
In some respects, it will be different this time because virtual reality is a radically different medium from Facebook or Instagram. But although the company’s virtual worlds are already available for users to create and explore, Meta has kept secret much of how it plans to enforce its safety protocols in VR, declining to answer detailed questions about them. (Disclosure: In a previous life, I held policy positions at Facebook and Spotify.)
Transparency around these rules is important because Meta has long struggled with how to moderate content on Facebook and Instagram. It has invested billions in machine learning tools to moderate content at scale, and it has confronted hard problems about what speech should be allowed. But content moderation will likely be more challenging in VR than on social platforms, not least because the tools for those older platforms do not easily transfer to a medium that requires a real-time understanding not only of content but also of how people behave. There is also a trade-off between privacy and safety at stake: Even if the company could track every conversation and interaction we had in VR, would we want it to?
Meta has said it recognizes this trade-off and has pledged to be transparent about its decision-making. So, to better understand how it is approaching VR moderation, BuzzFeed News sent Meta a list of 19 detailed questions about how it protects people from child abuse, harassment, misinformation, and other harms in virtual reality. The company declined to answer any of them. Instead, Meta spokesperson Johanna Peace provided BuzzFeed News a short statement: “We’re focused on giving people more control over their VR experiences through safety tools like the ability to report and block others. We’re also providing developers with further tools to moderate the experiences they create, and we’re still exploring the best use of AI for moderation in VR. We remain guided by our Responsible Innovation Principles to ensure privacy, security and safety are built into these experiences from the start.”
The very first entry in “Responsible Innovation Principles” enshrines the value of transparency: “We communicate clearly and candidly so people can understand the tradeoffs we considered, and make informed decisions about whether and how to use our products.”
We went back and asked again for Meta to consider our questions. The company declined.
So, to find out what we could on our own, we strapped on some Oculus headsets, opened Horizon Worlds, and ran a rudimentary experiment.
In a matter of hours, we built a private Horizon World festooned with massive misinformation slogans: “Stop the Steal!” “Stop the Plandemic!” “Trump won the 2020 election!” We called the world “The Qniverse,” and we gave it a soundtrack: an endless loop of Infowars founder Alex Jones calling Joe Biden a pedophile and claiming the election was rigged by reptilian overlords. We filled the skies with words and phrases that Meta has explicitly promised to remove from Facebook and Instagram — “vaccines cause autism,” “COVID is a hoax,” and the QAnon slogan “where we go one we go all.” Time and time again, Meta has removed and taken action on pages and groups, even private ones, that use these phrases.
We did not release this toxic material to the larger public. Only a handful of BuzzFeed News reporters were given access to the Qniverse, which was created using an account in the real name of a BuzzFeed News reporter and linked to her Facebook account. We kept the world “unpublished” — i.e., invitation only — to prevent unsuspecting users from happening upon it, and to mimic the way some Meta users seeking to share misinformation might actually do so: in private, invitation-only spaces.
The purpose of our test was to assess whether the content moderation systems that operate on Facebook and Instagram also operate on Horizon. At least in our case, it appears they did not.
After more than 36 hours, the Qniverse appeared to have gone undetected by Horizon, perhaps because the world was private, had only four people allowed to enter it, and got almost no engagement — all factors that would likely make it a low priority for content moderators.
Using Horizon’s user reporting function, a BuzzFeed News employee with access to the world used his own name and a linked Facebook account to flag the world to Meta. After more than 48 hours and no action, the employee reported the world again, followed quickly by another report from a different BuzzFeed News user with access to the world who also used her real name, which was linked to her Facebook and Oculus profiles.
Roughly four hours after the third report was filed, the employee who submitted it received a response from Meta: “Our trained safety specialist reviewed your report and determined that the content in the Qniverse doesn’t violate our Content in VR Policy.” Six hours after that, the original reporter received the same message. Perhaps the moderators left the Qniverse up because the world contained only violative content, and not violating behavior. Beyond the act of creating the misinformation slogans, we did not speak or otherwise interact with the content in the world. Without that context, maybe content moderators took it to be a parody.
We went to Meta’s comms department, a channel not available to ordinary people. We asked about its content moderators’ decisions: How could a world that shares misinformation that Meta has removed from its other platforms, under the same Community Guidelines, not violate Horizon’s policies?
The following afternoon, the experimental world disappeared. The company had reversed its original ruling.
“After further review, we have removed this from Horizon Worlds,” spokesperson Joe Osborne said. He declined to answer further questions about the decision.
Any news consumer over the past several years has seen this before: journalists calling out content on social platforms, especially those owned by Meta. That’s partly because, at times, things got really bad. Facebook evolved from a place where you poked your old high school friend to a place where your old high school friend could add you to a private group that grew to thousands of other people who talked about shooting protesters or overthrowing the democratically elected government. Horizon is a medium that is still very new. As Meta builds it, the company has promised to tackle again a complex conundrum that it and every other social media company have struggled with: balancing the competing interests of keeping its platform safe while allowing for free expression. Given the unique challenges of this new medium, Meta faces a task that is daunting, maybe even impossible.
Today, Meta appears to rely mainly on user blocks, mutes, and reports to notify it of Community Standards violations in VR. Andrew Bosworth, Meta’s chief technology officer, explained the reason why in a November 2021 blog post: “We can’t record everything that happens in VR indefinitely — it would be a violation of people’s privacy.” He then explained that Oculus devices do record (and then record over) users’ most recent experiences in VR, but that those recordings are only sent to Meta if a user files an abuse report.
This approach departs sharply from the one Meta has taken on Facebook and Instagram. Callum Hood, head of research for the Center for Countering Digital Hate, worries that without some proactive moderation system in place, users could form toxic communities around harmful content like racism and child abuse that would never be reported. “We’ve seen this before with Facebook Groups,” he said.
But the alternative raises concerns, too. Rory Mir, a grassroots advocacy organizer at the Electronic Frontier Foundation, cautioned that if Meta does use machine learning models to police its VR users’ behavior, it might “censor more people than it’s going to protect,” because it will require “far more intimate forms of surveillance: not just what you’re writing, but how you’re behaving.”
Since Horizon Worlds launched in December, Meta has yet to run into large-scale moderation problems like it has on Facebook, perhaps in part because Horizon has a tiny fraction of the users. But there have been problems: Women and people of color in Horizon have reported being harassed, abused, and targeted with hate speech — something VR has struggled with for its entire history. Horizon has a “safe zone” option that allows users to quickly exit a virtual space and block other users, and it just launched “personal boundaries” as another mechanism against harassment. Still, a reporter at the Washington Post recently encountered numerous apparent children in Horizon, which officially requires that users be at least 18, raising concerns that children could be targeted by predators in the app.
Many of Facebook and Instagram’s biggest content policy issues have been about how Meta’s algorithms promote and amplify content. Meta does promote public worlds in Horizon, and it promotes apps and other experiences for Oculus users — but it’s not clear how it does so and whether its algorithms can propel worlds in the same way as they boosted groups, some of which advocated extremism or violence. Meta declined to answer questions about whether its Recommendations Guidelines and Content Distribution Guidelines apply in VR.
Meta’s pivot to VR comes at a difficult time for the company. Last week, Facebook reported its first-ever decline in daily active users, sending Meta stock into a nosedive that lost the company approximately 25% of its worth in a day. The drop ramps up the pressure on VR to be the next big growth driver for a company that has pursued expansion at almost any cost. (In 2016, Bosworth, then the vice president of augmented and virtual reality, argued that Facebook should prioritize growth over curbing real-world harms like terrorist attacks because “anything that allows us to connect more people more often is *de facto* good.” He said later he was being provocative to force a debate and that the company never held that policy.)
According to Brittan Heller, an expert in virtual reality and human rights, technology to adequately moderate in VR doesn’t exist yet: Companies cannot reliably transcribe and assess users’ speech in real time, nor can they reliably recognize gestures. One can imagine an alternate Qniverse — one without words scrawled on its walls, but where users nonetheless congregate to share harmful misinformation. Without recording everything users say in VR, how can Meta know whether such a situation is happening? But recording everything users say and do, even in private groups, raises stark privacy questions.
Another new challenge is the complexity added by perspective: An action (or an object) will look different from every viewpoint. “Whose angle,” Heller asks, “should receive the most weight” in a moderation dispute?
Meta declined to answer far more basic questions, including whether VR content is subject to fact-checking, whether the company is capable of detecting praise and support of terrorist organizations in Horizon, and whether the company can prevent users in Horizon from consuming white supremacist content or porn. Meta also refused to say whether the machine learning models that do the vast majority of content moderation on Facebook and Instagram also play a role in moderating content on Horizon. Meta declined to tell us whether files uploaded to Horizon — like those uploaded to the company’s social platforms — are checked for child exploitation and terrorist material against hashed image banks maintained by organizations like the National Center for Missing and Exploited Children and the Global Internet Forum to Counter Terrorism.
Meta’s failure to detect our experiment troubled some experts. Paul Barrett, deputy director of the NYU Stern Center for Business and Human Rights, noted that people are always looking for new ways to share content while evading detection. Opportunistic users will always find and exploit the platform’s weakest points of enforcement, he said, and the company “ha[s] a responsibility to figure out beforehand how this new service could be misused.”
The risks posed by Meta’s VR moderation will ultimately depend on how many people use the company’s VR products. The Oculus app, meant for use with Meta’s VR headsets, was downloaded roughly 2 million times in the two weeks after this past Christmas. But unlike Facebook and Instagram, which billions of people worldwide use free of charge, Horizon requires users to spend hundreds of dollars on a headset and strap it to their face — hardly something that can be done in a grocery store checkout line or between tasks at work. According to Emerson Brooking, an expert on the weaponization of social media, the pool of users in Horizon may be small enough, for now, that Meta can effectively moderate it by having human employees monitor everything that users flag. But Heller nonetheless worried about extremists in VR, cautioning: “If we’re trying to make a metaverse for all people, we’d better be ready for all people.”