These LLMs are the best at resisting Russian propaganda
Estonian government benchmark shows how dozens of models combat Russia’s “strategic narratives.”
The Estonian Language Institute has developed a benchmark that evaluates how well large language models (LLMs) resist Russian propaganda, a concern for a nation with a history of Russian influence. Anthropic’s Claude models and several open-weight models performed strongly, while Google’s Gemini models showed particular weaknesses, especially when prompted in Russian. The benchmark highlights how LLMs’ ability to detect and refuse to spread disinformation is evolving, with newer models generally outperforming older ones.
- The Estonian Language Institute (ELI) created a “Propaganda Resistance” benchmark to assess LLMs’ ability to avoid promoting Russian strategic narratives.
- The benchmark tests models on 14 categories of perceived Russian influence operations, using questions that are phrased neutrally, built on false assumptions, or designed to elicit misinformation.
- Anthropic’s Claude models, particularly Opus 4.7, and open-weight models like Nvidia’s Nemotron and Alibaba’s Qwen, performed best.
- OpenAI’s GPT-5.4 also showed good results, while Google’s Gemini 2.5 Pro and 3.5 Flash were more susceptible to malicious and Russian-language prompts.
- Many models showed decreased resistance when questioned in Russian, with Google’s Gemini 3.5 Flash scoring significantly lower.
- There is an ongoing effort by governments, including Russia, to influence AI models with specific sociopolitical positions.

Continue reading: https://arstechnica.com/ai/2026/06/these-llms-are-the-best-at-resisting-russian-propaganda/