It’s no secret Call of Duty has toxic players. You can hear them trash talk just about anytime you turn on voice chat in the game. But Modulate teamed up with Activision to use AI voice moderation to address the problem, and the results were worth shouting about.
The companies noted that toxicity exposure was reduced 50% in voice chat in both Call of Duty: Modern Warfare II multiplayer and Call of Duty: Warzone in North America. And in the newest game, Call of Duty: Modern Warfare III, ToxMod found that (on a global basis, excluding Asia) there was an 8% reduction in repeat offenders month-over-month and a 25% reduction in exposure to toxicity.
On top of that, Activision confirmed that player retention improved, as did the overall experience for gamers in online multiplayer play. I interviewed Modulate CEO Mike Pappas about it and looked at the results in a case study on ToxMod’s real-time use in Call of Duty. Pappas has been eagerly awaiting the day when he could talk about these results.
“There are not many studios that have given this kind of transparency and been really active in working with us to get this story out there. And we’ve already seen a lot of positive reception to it,” Pappas said.
Call of Duty has been the titan of first-person shooter action games for two decades, with more than 425 million copies sold as of October 2023. But its popularity means that it draws all types, and some of them aren’t so nice, whether they’re cheating or talking trash in voice chat during Call of Duty multiplayer games.
To address the cheating, Activision launched its Ricochet anti-cheat initiative. And to combat toxic voice chat, it teamed with Modulate on implementing ToxMod’s AI screening technology. The testing for the case study took place during recent game launches. It covered two different periods, including the launch of Call of Duty: Modern Warfare II as well as the launch of Call of Duty: Modern Warfare III and a coinciding season of Call of Duty: Warzone.
“This has driven sort of a new upsurge of additional interest from gaming, and frankly, from some industries beyond gaming as well that are recognizing what we’re doing here is on the very cutting edge,” Pappas said.
The aim has been to work with the gaming safety coalition, moderators and others on how to combine AI and human intelligence into better moderation and safety.
ToxMod’s integration into Call of Duty
ToxMod is specifically designed to address the unique challenges of moderating in-game voice communication. By leveraging machine learning tuned with real gaming data, ToxMod can tell the difference between competitive banter and genuine harassment, Pappas said.
While the primary focus of the analysis was to understand and improve the player experience, working closely with the Call of Duty team and complementing additional related efforts, Modulate was able to analyze the impact the introduction of voice moderation was having on player engagement, and it found sizable positive effects.
In the case of Call of Duty: Modern Warfare III (globally excluding Asia), Activision was able to act on two million accounts that disrupted games by violating the Call of Duty Code of Conduct in voice chat, Modulate said.
ToxMod identified rates of toxicity and toxicity exposure in voice chats well above the rates that existing player reports alone identified. Player churn was reduced when ToxMod was enabled.
Thanks to the additional offenses identified by ToxMod, Activision was better able to take action against offenders – which in turn led to an increase in player engagement. ToxMod found that only about 23% of player-generated reports contained actionable evidence of a Code of Conduct violation.
I play a lot of Call of Duty every year and I’m at level 167 in multiplayer this year (I haven’t played as much as usual). That’s equivalent to about 33 hours of multiplayer alone. During the pandemic, I really enjoyed chatting with three other friends while in Warzone matches.
I still find players who leave the voice chat on and play loud music or some kind of sermon. But it seems like voice chat has gotten cleaner. As Modulate says, voice chat-enabled games, in particular, have taken the player experience to a whole new level, adding a more human and more immersive layer to gameplay, fostering a greater sense of community across the globe.
But it’s easy to ruin that.
Games like Call of Duty are popular because they foster connection, competition, skill and fun. Prior to the official launch of ToxMod in Call of Duty, a 2022 ADL report found that 77% of adult video game players had experienced some form of severe harassment, and Call of Duty is most definitely not immune. With a fanbase of this size, moderating that toxicity presents unique challenges.
How ToxMod works
Modulate’s ToxMod aims to reduce players’ exposure to harmful content through proactive, ML-driven voice moderation, thereby contributing toward improving player engagement and retention.
ToxMod allows moderation teams to deploy advanced, complementary efforts on often-unactionable, player-generated reports that lead toward a more proactive moderation strategy — a pivotal move in the ongoing battle against in-game toxicity.
“We have validated statistics here on user report coverage compared to proactive detection, as well as the impact on player engagement,” Pappas said. “These are probably the two types of statistics that we were most excited to have. There are profound things to show here.”
Pappas said the majority of the toxicity fell into racial or sexual harassment. The AI isn’t tuned to flag the occasional F-bomb; rather, it focuses on Activision’s Code of Conduct and its expectations of user behavior. Swearing on its own doesn’t count as toxic behavior, but hurling racial slurs alongside it could be a violation based on hate speech.
“We’re specifically looking for those more egregious things that graduate from just somewhat vulgar extreme language to really directed hostility,” Pappas said. “It’s based on the severity of how egregious the behavior is.”
Activision itself provided Modulate with guidelines on what to look for. And the companies wanted to combine the AI detection with human moderators. Much of the drudge work can be done by AI at a speed that can’t possibly be matched by humans. But humans can make the better judgment calls.
Since ToxMod detects conversations in real time and flags them, it can give the developers data on toxic behavior they weren’t even aware of.
“They now have visibility, which allows them to moderate,” Pappas said. “They can get a deeper understanding of when and why toxicity happens in the ecosystem.”
The other big takeaway is that players genuinely have a better experience after the moderation, Pappas said.
“More players came back into the ecosystem,” Pappas said. “That’s directly as a consequence of it being more pleasant to stick around and play longer because they’re having fun, and they’re not being harassed or terrorized in any way.”
What’s the problem?
Toxic behavior, ranging from derogatory remarks to harassment, not only tarnishes individual gameplay experiences, but also can erode the sense of camaraderie and respect that underpins healthy gaming communities.
The impact of such behavior extends beyond momentary discomfort; it can lead to players stepping away from the game for hours or days, or quitting altogether (also known as player churn), and to diminished community engagement. As Activision continued to fulfill its initiatives to support Call of Duty’s player community, the teams at Activision and Modulate developed a hypothesis: Shifting toward proactive voice moderation via ToxMod would materially improve player experience, while materially reducing toxicity exposure rates.
Next, it was time to put that hypothesis to the test by integrating ToxMod.
ToxMod’s integration into Call of Duty
Recognizing the limitations of traditional moderation methods and the unique challenges presented by real-time voice communication, Activision adopted ToxMod as part of its commitment to maintaining a positive and inclusive gaming environment for the Call of Duty community.
This partnership ensured that ToxMod’s advanced voice moderation capabilities were seamlessly woven into the existing game infrastructure, with minimal impact on game performance and user experience.
Key considerations included: careful tuning to adhere to Activision’s Call of Duty Code of Conduct, preserving the competitive and fast-paced spirit of gameplay, compatibility with the game’s diverse gameplay modes, adherence to privacy standards and privacy laws, scalability to accommodate the massive Call of Duty player base, and maintaining the lowest possible latency for toxicity detection.
How ToxMod works within Call of Duty
ToxMod operates within Call of Duty through a sophisticated, multi-stage process designed to proactively identify and prioritize toxic voice chat interactions for Activision’s human moderator team.
ToxMod is also designed to respect player privacy. To that end, it recognizes speech but does not engage in speaker identification and does not create a biometric voiceprint of any user. The moderation process can be broken down into three phases:
Triage
In the first stage, ToxMod analyzes voice communications in real time, looking for toxic speech as defined by Call of Duty’s Code of Conduct. This initial filtering determines which conversations warrant closer examination, ensuring that the system stays focused on the most likely problematic interactions.
Analyze
Interactions flagged in the triage stage then undergo a deeper analysis to understand context and intention. ToxMod evaluates nuances: slang, tone of voice, cultural references and the conversation between players. By doing so, it can distinguish between competitive banter, which is a natural part of the gaming experience, and genuinely harmful content. With this information, ToxMod can better uncover the key context of a voice interaction so a moderator can determine the next course of action.
ToxMod focuses on phrases and slurs that are unequivocally harmful, and it performs several types of analysis. It recognizes emotions, including anger, which can help differentiate between the banter typical (and welcome!) in Call of Duty and genuine hurt or aggression.
It also performs sentiment analysis. ToxMod analyzes the full utterance in context of the broader conversation (both before and after the utterance itself) to better understand the intent and sentiment with which it was spoken.
Escalate
After ToxMod prioritizes and analyzes a voice chat interaction that is very likely a violation of Call of Duty’s Code of Conduct, the issue is escalated to Activision for review. Rather than funneling all voice chat interactions to moderators, this tiered approach ensures that potential false positives are removed from the moderation flow. Moderator actions can range from issuing warnings to temporary or permanent communication bans, depending on the severity of the offense.
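To make the three-phase flow concrete, here is a minimal sketch of how such a triage-analyze-escalate pipeline could be structured. Everything in it — the data shapes, the keyword stand-ins, the scores and the threshold — is a hypothetical illustration, not Modulate’s actual models or API:

```python
# Hypothetical sketch of a triage -> analyze -> escalate pipeline.
# Keyword stand-ins, scores and the threshold are illustrative only,
# not Modulate's actual models or API.
from dataclasses import dataclass, field

# Placeholder tokens standing in for Code of Conduct categories
# (directed hostility such as hate speech, not mere vulgarity).
SEVERE_TERMS = {"<slur>", "<threat>"}

@dataclass
class Utterance:
    text: str                      # speech recognized from voice chat
    anger_score: float             # 0.0-1.0, from a hypothetical emotion model
    context: list = field(default_factory=list)  # surrounding conversation

def triage(u: Utterance) -> bool:
    """Phase 1: cheap real-time filter applied to all voice chat."""
    return any(term in u.text for term in SEVERE_TERMS)

def analyze(u: Utterance) -> float:
    """Phase 2: weigh emotion and conversational context to separate
    banter from directed harassment. Returns a severity in [0, 1]."""
    repeated = any(term in line for line in u.context for term in SEVERE_TERMS)
    # Hypothetical combination: hostile tone plus repeated targeting reads as severe.
    return min(1.0, u.anger_score + (0.3 if repeated else 0.0))

def process(u: Utterance, queue: list, threshold: float = 0.8) -> None:
    """Phase 3: escalate only likely violations to human moderators."""
    if triage(u) and analyze(u) >= threshold:
        queue.append(u)

# Usage: only high-severity hits reach the moderation queue.
queue = []
process(Utterance("<slur> you", 0.9, ["<slur>", "uninstall"]), queue)
print(len(queue))  # -> 1, escalated for human review
```

The design point the sketch captures is the funnel: a cheap filter runs over everything in real time, a costlier contextual pass runs only on flagged clips, and human moderators see only the highest-severity results.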
Initial analysis results
ToxMod’s impact was initially assessed within North America for English-speaking Modern Warfare II and Call of Duty: Warzone players. This first pass allowed Activision teams to gather initial insights into the scale and type of behavior happening in voice chats and to fine-tune ToxMod’s detection specifically for the Call of Duty player base. Activision tested manual moderation actioning based on ToxMod’s detection on a treatment group and maintained a control group where ToxMod would still detect likely Code of Conduct violations, but no moderator action would be taken.
Toxicity exposure
In the control group, ToxMod’s data showed at least 25% of the Modern Warfare II player base was exposed to severe gender/sexual harassment (~90% of detected offenses) and racial/cultural harassment (~10% of detected offenses).

Where was toxicity coming from?
Among all voice chat infractions in the treatment group, ToxMod data shows that about 50% of infractions were from first-time offenders. Analysis showed that of the total warnings issued to players for first-time detected offenses, the vast majority were issued to players who were already active in Call of Duty – that is to say, players who are already regularly playing Call of Duty titles. Only ~10% of first-time offense warnings were issued to new players or players returning to Call of Duty after some time.
During this analysis period, Activision adopted a three-tiered enforcement flow, with a 48-hour cooldown before players could be escalated into the next enforcement tier. For tier-one violations, the player is sent a warning that their voice chat behavior violates the Call of Duty Code of Conduct. For tier-two violations, the player is muted for three days and notified. For tier-three violations, the player is muted for 14 days and notified.

Breaking down the first-time offense warnings: 2.1% were given to new players of Call of Duty, about 4.7% were given to lapsed players who returned to Call of Duty after a 21-to-59-day absence, and 1.7% were given to players who returned after an absence of 60 or more days.

Repeat behavior accounted for a substantial share of what players heard: 19% of toxicity exposure was due to players violating the Code of Conduct while in a cooldown period following a moderator warning, and about 22% was due to players violating the Code of Conduct after a moderator penalty had been lifted. Within the repeat offenses, 13% occurred after a tier-one warning, 7% after a tier-two shadow mute (three days, with notification) and 2% after a tier-three shadow mute (14 days, with notification).
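The tiered flow maps naturally onto a small state machine. The sketch below is a hypothetical rendering of that escalation logic: the tier actions and the 48-hour cooldown come from the case study, while the data structures and the exact escalation rule are assumptions.

```python
# Hypothetical sketch of the three-tier enforcement flow with a 48-hour
# cooldown before escalation. Tier actions and the cooldown come from the
# case study; the data structures and exact escalation rule are assumed.
from datetime import datetime, timedelta
from typing import Optional

COOLDOWN = timedelta(hours=48)
ACTIONS = {1: "warning", 2: "3-day mute + notice", 3: "14-day mute + notice"}

class EnforcementState:
    def __init__(self) -> None:
        self.tier = 0
        self.last_action: Optional[datetime] = None

    def record_violation(self, now: datetime) -> str:
        """Escalate only if the 48-hour cooldown since the last action has
        elapsed; otherwise re-apply the current tier's action."""
        if self.tier == 0:
            self.tier = 1
        elif self.last_action is not None and now - self.last_action >= COOLDOWN:
            self.tier = min(self.tier + 1, 3)
        self.last_action = now
        return ACTIONS[self.tier]

# Usage: a second offense within 48 hours stays at tier one; later ones escalate.
state = EnforcementState()
t0 = datetime(2023, 9, 1, 12, 0)
print(state.record_violation(t0))                        # warning
print(state.record_violation(t0 + timedelta(hours=12)))  # warning (still in cooldown)
print(state.record_violation(t0 + timedelta(hours=72)))  # 3-day mute + notice
```

Note the assumption baked in: a repeat offense inside the cooldown window re-applies the current tier’s action rather than escalating, which matches the case study’s observation that players can, and do, reoffend while in cooldown.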
In periodic tests comparing exposure to toxicity in the treatment group and the control group, ToxMod was consistently found to reduce toxicity exposure by 25% to 33%.
Reactive player reports
Modulate and Activision also looked at the efficacy of reactive moderation in the form of player-generated reports. Data showed that such reactive approaches addressed only a small fraction of the violations.
For example, on average, approximately 79% of players violating the Code of Conduct and escalated by ToxMod each day have no associated player reports – these offenders might never have been found without ToxMod’s proactive detection.
Approximately 50% of player reports submitted had no associated audio from the reported players in voice chat in the 24 hours before the report was made.
Of the reports with associated audio, only an estimated 50% contained a Code of Conduct violation – this suggests that only about one quarter of player reports contained actionable evidence of toxicity in voice chat.
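That one-quarter figure is simply the two ratios compounding; a quick back-of-the-envelope check:

```python
# Back-of-the-envelope check of the report-actionability funnel described above.
reports_with_audio = 0.50       # ~50% of player reports had associated voice audio
violations_given_audio = 0.50   # ~50% of those contained a Code of Conduct violation
actionable = reports_with_audio * violations_given_audio
print(f"{actionable:.0%} of player reports were actionable")  # -> 25%
```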
Player engagement
Modulate and Activision also analyzed the impact of proactive voice moderation on player engagement. Proactive moderator actioning against Code of Conduct violations boosted the overall number of active players in the treatment group.
Comparing the treatment group to the control group in Modern Warfare II, the treatment group saw 3.9% more new players, 2.4% more players who were previously inactive for 21 to 59 days, and 2.8% more active players who were previously inactive for 60 or more days.
Notably, the longer moderation efforts went on, the larger the positive impact, with more players remaining active in the game. Modulate and Activision compared the total number of active players in the treatment group to the control group after three, seven and 21 days from the start of the testing period. The treatment group had 6.3% more active players on day three, 21.2% more on day seven and 27.9% more on day 21.
Global launch results
Using ToxMod data, Activision was able to report on the results of proactive moderation in Call of Duty: Modern Warfare III following the game’s launch in November 2023 in all regions across the globe except Asia. The key findings included:
A stronger reduction in exposure to toxic voice chat
Call of Duty saw a ~50% reduction in players exposed to severe instances of disruptive voice chat since Modern Warfare III’s launch. This decrease highlights the progress being made by Activision and Modulate since the trial period. Not only does it show that players are having a much better time online, it also speaks to improvements in overall player engagement.
A decrease in repeat offenders
ToxMod’s ability to identify and help moderators take action against toxic players led to an 8% reduction in repeat offenders month over month, contributing to a healthier community dynamic.
This 8% reduction in repeat offenders in Modern Warfare III shows that as ToxMod continues to run, more and more players recognize the ways in which their actions violate the Code of Conduct, and learn to adapt their behavior to something less exclusionary or offensive.
An increase in moderator enforcement of the Call of Duty Code of Conduct
More than two million accounts have seen in-game enforcement for disruptive voice chat, based on the Call of Duty Code of Conduct, between August and November 2023.
Of the severe toxicity that ToxMod flagged, only one in five instances was also reported by players, meaning that ToxMod enabled Activision to catch, and ultimately put a stop to, five times more harmful content without putting any extra burden on Call of Duty players themselves to submit a report.
Conclusion
The integration of ToxMod into the most popular video game franchise in the world represents a significant step in Activision’s ongoing efforts to reduce toxicity in Call of Duty titles. Beyond Call of Duty, Activision’s strong stance against toxicity demonstrates what is possible for other game franchises across the globe, redefining in-game communication standards and setting a new benchmark for proactive moderation in the multiplayer gaming industry.
By prioritizing real-time intervention and fostering a culture of respect and inclusivity, Call of Duty is not only enhancing the gaming experience for its players but also leading by example in the broader gaming industry.
Pappas said Modulate has been releasing its case study results and it has gotten a lot of inbound interest from other game studios, researchers and even industry regulators who pay attention to toxicity.
“This is really exciting. It’s so gratifying to have really concrete evidence that trust and safety not only is good for the player, but it also benefits the studio. It’s a win-win-win. Everyone’s really happy to have firmer evidence than has existed about that before,” Pappas said.
He said folks are also happy that Activision is sharing this information with other companies in the game industry.
“Players have been asking for a long time for improvements in this space. And this case study demonstrated that it’s not just a small contingent of them, but it’s really the whole broad player ecosystem. People who are diehard fans of games like Call of Duty are genuinely grateful and are coming back and spending more time playing the game,” Pappas said.