Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning Paper • 2506.07044 • Published 20 days ago • 105
MedBrowseComp: Benchmarking Medical Deep Research and Computer Use Paper • 2505.14963 • Published May 20 • 1
Measuring the Faithfulness of Thinking Drafts in Large Reasoning Models Paper • 2505.13774 • Published May 19 • 1
When Models Reason in Your Language: Controlling Thinking Trace Language Comes at the Cost of Accuracy Paper • 2505.22888 • Published about 1 month ago • 6
XReasoning - models Collection • https://arxiv.org/abs/2505.22888 • "ds" means continued post-training on DeepSeek-distilled Qwen Math 7B; model names follow the pattern limo-{language}-{amount of data} • 19 items • Updated 24 days ago • 1