Deconstructing Attention: Investigating Design Principles for Effective Language Modeling Paper • 2510.11602 • Published 1 day ago • 10
Fine-Tuning on Noisy Instructions: Effects on Generalization and Performance Paper • 2510.03528 • Published 11 days ago • 15
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12, 2024 • 69
IntrEx: A Dataset for Modeling Engagement in Educational Conversations Paper • 2509.06652 • Published Sep 8 • 24
Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation? Paper • 2508.19827 • Published Aug 27 • 33