Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution
Abstract
STEAM, a back-translation-based detection method, enhances multilingual watermarking robustness across various languages by addressing semantic clustering failures.
Multilingual watermarking aims to make large language model (LLM) outputs traceable across languages, yet current methods still fall short. Despite claims of cross-lingual robustness, they are evaluated only on high-resource languages. We show that existing multilingual watermarking methods are not truly multilingual: they fail to remain robust under translation attacks in medium- and low-resource languages. We trace this failure to semantic clustering, which fails when the tokenizer vocabulary contains too few full-word tokens for a given language. To address this, we introduce STEAM, a back-translation-based detection method that restores watermark strength lost through translation. STEAM is compatible with any watermarking method, robust across different tokenizers and languages, non-invasive, and easily extendable to new languages. With average gains of +0.19 AUC and +40%p TPR@1% on 17 languages, STEAM provides a simple and robust path toward fairer watermarking across diverse languages.
Community
Some watermarking methods for large language models (LLMs) claim to be multilingual, yet they are almost always tested on high-resource languages like English, French, and German. This paper reveals that such claims do not hold up under scrutiny: multilingual watermarks collapse under translation attacks in medium- and low-resource languages.
This paper traces the issue to semantic clustering, the main technique behind multilingual watermarking, which groups semantically similar tokens across languages. When tokenizers have few full-word tokens, as is common in less-resourced languages, the clustering fails, weakening watermark detection.
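To make this failure mode concrete, below is a minimal sketch of vocabulary-level semantic clustering. The `embed` function is a hypothetical stand-in for a multilingual embedding model, and the greedy grouping is an illustration only; the actual clustering used by methods such as X-SIR may differ.

```python
# Minimal sketch: greedily group vocabulary tokens whose multilingual
# embeddings are cosine-similar. `embed` is a hypothetical stand-in for
# a multilingual embedding model; the clustering in X-SIR/X-KGW may
# differ from this illustration.
import numpy as np

def cluster_vocabulary(tokens, embed, threshold=0.8):
    clusters = []  # each entry: (unit-norm seed embedding, member tokens)
    for tok in tokens:
        v = embed(tok)
        v = v / np.linalg.norm(v)
        for seed, members in clusters:
            if float(v @ seed) >= threshold:
                members.append(tok)
                break
        else:  # no cluster was similar enough: start a new one
            clusters.append((v, [tok]))
    return clusters
```

The weak point is visible here: subword fragments and byte pieces carry little semantic content, so when a language's words are split into many fragments, most of its vocabulary never joins a meaningful cross-lingual cluster.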
To fix this, the paper introduces STEAM (Simple Translation-Enhanced Approach for Multilingual watermarking), a lightweight, detection-time method that restores watermark signals lost during translation. STEAM uses back-translation, translating a suspect text back into multiple supported languages, and then identifies the strongest watermark signal across these variants. Crucially, STEAM is model-agnostic, tokenizer-independent, and works with any existing watermarking method without modifying model outputs.
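A minimal sketch of this detection step follows, assuming a hypothetical `translate(text, lang)` wrapping any MT system and a `detect(text)` that returns the score of an existing watermark detector (e.g., a z-score); it illustrates the idea rather than the authors' exact implementation.

```python
# Minimal sketch of back-translation detection. `translate` and
# `detect` are hypothetical stand-ins for any MT system and any
# existing watermark detector's scoring function.
def steam_detect(suspect_text, detect, translate,
                 supported_langs=("en", "fr", "de")):
    # Score the original text plus its translation into each
    # supported language, and report the strongest signal.
    candidates = [suspect_text] + [
        translate(suspect_text, lang) for lang in supported_langs
    ]
    return max(detect(text) for text in candidates)
```

Taking the maximum means a single successful back-translation is enough to recover the signal; in practice the detection threshold would need to be calibrated for the extra comparisons to keep the false positive rate controlled.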
Results
- Evaluated on 17 languages spanning high-, medium-, and low-resource groups.
- Achieves +0.19 AUC and +40%p TPR@1%FPR average improvement over prior multilingual methods.
- Outperforms semantic clustering (X-SIR, X-KGW) by up to +0.33 AUC and +64.5%p TPR@1%.
- Remains robust under translator mismatches and adaptive multi-step translation attacks.
Key Insight:
Watermark robustness depends on how well the tokenizer vocabulary covers a language. By recovering lost watermark signals through back-translation, STEAM ensures fairer, more reliable detection across linguistic diversity.
In short, STEAM makes multilingual watermarking genuinely multilingual, offering a simple yet powerful step toward equitable content provenance for all languages.
The following papers were recommended by the Semantic Scholar API:
- Robustness Assessment and Enhancement of Text Watermarking for Google's SynthID (2025)
- SimKey: A Semantically Aware Key Module for Watermarking Language Models (2025)
- An Ensemble Framework for Unbiased Language Model Watermarking (2025)
- Analyzing and Evaluating Unbiased Language Model Watermark (2025)
- LLM Watermark Evasion via Bias Inversion (2025)
- Parallel Tokenizers: Rethinking Vocabulary Design for Cross-Lingual Transfer (2025)
- CATMark: A Context-Aware Thresholding Framework for Robust Cross-Task Watermarking in Large Language Models (2025)