Resources for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
Xin Lai
xinlai
AI & ML interests
Multimodal LLM, LLM Reasoning, Point Cloud Segmentation, Image Segmentation
Recent Activity
Upvoted a paper about 17 hours ago: MMSearch-R1: Incentivizing LMMs to Search
Upvoted a paper 6 months ago: Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition