Lost in the Mix: Evaluating LLM Understanding of Code-Switched Text
Abstract
LLM comprehension and reasoning are evaluated under code-switching conditions, revealing that embedding English into other languages can improve understanding, whereas prompting and fine-tuning mitigate degradation to different degrees.
Code-switching (CSW) is the act of alternating between two or more languages within a single discourse. This phenomenon is widespread in multilingual communities and increasingly prevalent in online content, where users naturally mix languages in everyday communication. As a result, Large Language Models (LLMs), now central to content processing and generation, are frequently exposed to code-switched inputs. Given their widespread use, it is crucial to understand how LLMs process and reason about such mixed-language text. This paper presents a systematic evaluation of LLM comprehension under code-switching by generating CSW variants of established reasoning and comprehension benchmarks. While degradation is evident when foreign tokens disrupt English text, even under linguistic constraints, embedding English into other languages often improves comprehension. Though prompting yields mixed results, fine-tuning offers a more stable path to degradation mitigation.
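The abstract describes generating CSW variants of existing benchmarks. As a rough illustration of how such a variant might be constructed (a minimal sketch, not the authors' actual pipeline), the snippet below swaps a fraction of English tokens for translations from a toy English-Spanish dictionary. TOY_LEXICON, code_switch, and switch_ratio are illustrative placeholders; a real setup would use a proper lexicon or MT system and enforce linguistic constraints on switching points.

```python
# Minimal sketch of token-level code-switching for a benchmark question.
# NOT the paper's method: the lexicon, function names, and switch_ratio
# are illustrative assumptions.
import random

TOY_LEXICON = {
    "the": "el", "answer": "respuesta", "question": "pregunta",
    "which": "cuál", "number": "número", "largest": "más grande",
}

def code_switch(text: str, switch_ratio: float = 0.3, seed: int = 0) -> str:
    """Replace roughly `switch_ratio` of translatable tokens with their
    dictionary translation, leaving the rest of the sentence in English."""
    rng = random.Random(seed)
    out = []
    for tok in text.split():
        key = tok.lower().strip(".,?!")
        if key in TOY_LEXICON and rng.random() < switch_ratio:
            # Preserve trailing punctuation when swapping the word.
            suffix = tok[len(tok.rstrip(".,?!")):]
            out.append(TOY_LEXICON[key] + suffix)
        else:
            out.append(tok)
    return " ".join(out)

if __name__ == "__main__":
    question = "Which number is the largest? Explain the answer."
    print(code_switch(question, switch_ratio=0.5))
```

The same substitution scheme can be applied in either direction, e.g. injecting foreign tokens into English items or English tokens into translated items, which is the contrast the abstract draws between degradation and improvement.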
Community
This paper investigates how LLMs handle code-switched text by generating mixed-language versions of benchmarks. It finds that disrupting English with foreign tokens degrades performance, while embedding English into other languages can enhance it, and that fine-tuning is a more reliable strategy than prompting for mitigating such degradation.
Related papers recommended by the Semantic Scholar API:
- CodeMixBench: Evaluating Large Language Models on Code Generation with Code-Mixed Prompts (2025)
- CS-Sum: A Benchmark for Code-Switching Dialogue Summarization and the Limits of Large Language Models (2025)
- Can we train ASR systems on Code-switch without real code-switch data? Case study for Singapore's languages (2025)
- Parsing the Switch: LLM-Based UD Annotation for Complex Code-Switched and Low-Resource Languages (2025)
- Token Constraint Decoding Improves Robustness on Question Answering for Large Language Models (2025)
- Chain-of-Code Collapse: Reasoning Failures in LLMs via Adversarial Prompting in Code Generation (2025)
- CMLFormer: A Dual Decoder Transformer with Switching Point Learning for Code-Mixed Language Modeling (2025)