arxiv:2601.21590
xiaotong
xtongji
AI & ML interests
None yet
Recent Activity
upvoted a paper about 1 month ago
Multi-Task GRPO: Reliable LLM Reasoning Across Tasks
authored a paper about 2 months ago
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving
Organizations
None yet