132

Tang

tommysally

AI & ML interests

None yet

Organizations

None yet

tommysally's activity

upvoted 10 papers 1 day ago

Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation

Paper • 2502.14846 • Published 1 day ago • 9

MLGym: A New Framework and Benchmark for Advancing AI Research Agents

Paper • 2502.14499 • Published 2 days ago • 129

AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO

Paper • 2502.14669 • Published 1 day ago • 5

Dynamic Concepts Personalization from Single Videos

Paper • 2502.14844 • Published 1 day ago • 12

S*: Test Time Scaling for Code Generation

Paper • 2502.14382 • Published 2 days ago • 40

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published 1 day ago • 28

PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC

Paper • 2502.14282 • Published 2 days ago • 12

LongWriter-V: Enabling Ultra-Long and High-Fidelity Generation in Vision-Language Models

Paper • 2502.14834 • Published 1 day ago • 20

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published 1 day ago • 84

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published 1 day ago • 84

upvoted 7 papers 5 days ago

Diverse Inference and Verification for Advanced Reasoning

Paper • 2502.09955 • Published 8 days ago • 16

Large Language Diffusion Models

Paper • 2502.09992 • Published 8 days ago • 74

STMA: A Spatio-Temporal Memory Agent for Long-Horizon Embodied Task Planning

Paper • 2502.10177 • Published 8 days ago • 5

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published 8 days ago • 49

Region-Adaptive Sampling for Diffusion Transformers

Paper • 2502.10389 • Published 8 days ago • 52

ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models

Paper • 2502.09696 • Published 9 days ago • 36

MM-RLHF: The Next Step Forward in Multimodal LLM Alignment

Paper • 2502.10391 • Published 8 days ago • 29

upvoted a paper 6 days ago

Retrieval-augmented Large Language Models for Financial Time Series Forecasting

Paper • 2502.05878 • Published 13 days ago • 38

upvoted 2 papers 7 days ago

A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods

Paper • 2502.01618 • Published 19 days ago • 9

Large Language Model Guided Self-Debugging Code Generation

Paper • 2502.02928 • Published 17 days ago • 11