Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers Paper • 2506.07986 • Published 4 days ago • 17
SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning Paper • 2505.19099 • Published 20 days ago • 8
Sentient Agent as a Judge: Evaluating Higher-Order Social Cognition in Large Language Models Paper • 2505.02847 • Published May 1 • 27
SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning Paper • 2504.19162 • Published Apr 27 • 17
Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting Paper • 2012.04529 • Published Dec 8, 2020
Efficient Crowd Counting via Structured Knowledge Transfer Paper • 2003.10120 • Published Mar 23, 2020
Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation Paper • 2407.05890 • Published Jul 8, 2024
S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning Paper • 2502.12853 • Published Feb 18 • 29
UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression Paper • 2212.02746 • Published Dec 6, 2022
GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning Paper • 2105.14517 • Published May 30, 2021
LogicSolver: Towards Interpretable Math Word Problem Solving with Logical Prompt-enhanced Learning Paper • 2205.08232 • Published May 17, 2022
UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-body Decoupling 3D Model Paper • 2110.15267 • Published Oct 28, 2021