The Pragmatic Mind of Machines: Tracing the Emergence of Pragmatic Competence in Large Language Models
Abstract
The ALTPRAG dataset evaluates pragmatic competence in LLMs across training stages, showing that it improves with increased model and data scale and with post-training methods such as SFT and preference optimization.
Current large language models (LLMs) have demonstrated emerging capabilities in social intelligence tasks, including implicature resolution (Sravanthi et al., 2024) and theory-of-mind reasoning (Shapira et al., 2024), both of which require substantial pragmatic understanding. However, how LLMs acquire this competence throughout the training process remains poorly understood. In this work, we introduce ALTPRAG, a dataset grounded in the pragmatic concept of alternatives, designed to evaluate whether LLMs at different training stages can accurately infer nuanced speaker intentions. Each instance pairs two contextually appropriate but pragmatically distinct continuations, enabling fine-grained assessment of both pragmatic interpretation and contrastive reasoning. We systematically evaluate 22 LLMs across key training stages (pre-training, supervised fine-tuning (SFT), and preference optimization) to examine the development of pragmatic competence. Our results show that even base models exhibit notable sensitivity to pragmatic cues, and that this sensitivity improves consistently with increases in model and data scale. Additionally, SFT and RLHF contribute further gains, particularly in cognitive-pragmatic reasoning. These findings highlight pragmatic competence as an emergent and compositional property of LLM training and offer new insights for aligning models with human communicative norms.
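To make the "paired alternatives" setup concrete, here is a minimal sketch of what an ALTPRAG-style instance and a contrastive evaluation query might look like. The field names, example text, and prompt wording are illustrative assumptions, not the released dataset schema.

```python
# Minimal sketch of an ALTPRAG-style instance: one context paired with two
# contextually appropriate but pragmatically distinct continuations.
# All field names and contents are hypothetical, not the actual dataset format.
from dataclasses import dataclass


@dataclass
class AltPragInstance:
    context: str          # shared dialogue context preceding the continuations
    continuation_a: str   # contextually appropriate continuation A
    continuation_b: str   # contextually appropriate but pragmatically distinct continuation B
    intention_a: str      # speaker intention conveyed by continuation A
    intention_b: str      # speaker intention conveyed by continuation B


def contrastive_prompt(inst: AltPragInstance) -> str:
    """Format a contrastive query asking a model to explain how the two
    continuations differ in the speaker intention they convey."""
    return (
        f"Context: {inst.context}\n"
        f"Continuation A: {inst.continuation_a}\n"
        f"Continuation B: {inst.continuation_b}\n"
        "Question: Both continuations fit the context. What different speaker "
        "intention does each convey, and why might a speaker choose one over the other?"
    )


if __name__ == "__main__":
    example = AltPragInstance(
        context="A: Are you coming to the team dinner tonight?",
        continuation_a="B: I have an early flight tomorrow.",
        continuation_b="B: I wouldn't miss it for the world.",
        intention_a="Politely declining by giving a reason instead of a direct 'no'.",
        intention_b="Enthusiastic acceptance, emphasizing commitment.",
    )
    print(contrastive_prompt(example))
```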
Community
We use the idea of "alternatives" to trace the emergence of pragmatic competence, highlighting the importance of both pre-training and post-training.
The following similar papers were recommended automatically (Librarian Bot, via the Semantic Scholar API):
- Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models? (2025)
- SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models (2025)
- Thinking Out Loud: Do Reasoning Models Know When They're Right? (2025)
- Kongzi: A Historical Large Language Model with Fact Enhancement (2025)
- MR. Judge: Multimodal Reasoner as a Judge (2025)
- Reasoning Towards Fairness: Mitigating Bias in Language Models through Reasoning-Guided Fine-Tuning (2025)
- JudgeLRM: Large Reasoning Models as a Judge (2025)