42 6

Tom Lu

eigentom

https://eigentom.github.io

EigenTom

AI & ML interests

MLLM, Generative AI, Agentic RL

Recent Activity

updated a dataset about 1 month ago

eigentom/reformatted_deepreview

upvoted a paper 20 days ago

Rethinking Chain-of-Thought Reasoning for Videos

Paper • 2512.09616 • Published 22 days ago • 17

upvoted a paper 25 days ago

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

Paper • 2512.02014 • Published about 1 month ago • 69

upvoted a paper about 1 month ago

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published Nov 20, 2025 • 92

upvoted a paper about 2 months ago

Visual Spatial Tuning

Paper • 2511.05491 • Published Nov 7, 2025 • 51

upvoted 2 papers 2 months ago

Emu3.5: Native Multimodal Models are World Learners

Paper • 2510.26583 • Published Oct 30, 2025 • 108

VisCoder2: Building Multi-Language Visualization Coding Agents

Paper • 2510.23642 • Published Oct 24, 2025 • 21

upvoted 13 papers 3 months ago

WithAnyone: Towards Controllable and ID Consistent Image Generation

Paper • 2510.14975 • Published Oct 16, 2025 • 84

BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

Paper • 2510.10666 • Published Oct 12, 2025 • 27

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Paper • 2509.25541 • Published Sep 29, 2025 • 140

UniVideo: Unified Understanding, Generation, and Editing for Videos

Paper • 2510.08377 • Published Oct 9, 2025 • 71

In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

Paper • 2510.05592 • Published Oct 7, 2025 • 106

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published Oct 6, 2025 • 118

SparseD: Sparse Attention for Diffusion Language Models

Paper • 2509.24014 • Published Sep 28, 2025 • 30

Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia

Paper • 2503.01714 • Published Mar 3, 2025 • 5

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Paper • 2510.02190 • Published Oct 2, 2025 • 18

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published Oct 1, 2025 • 89

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published Sep 26, 2025 • 70

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28, 2025 • 174

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Paper • 2509.26346 • Published Sep 30, 2025 • 18

upvoted a collection 3 months ago

Critique-Coder

Critique-Coder • 5 items • Updated Sep 30, 2025 • 3