Wujian Peng(SII)

wjpoom

https://scholar.google.com/citations?user=GTuWk9YAAAAJ&hl=zh-CN

AI & ML interests

None yet

Recent Activity

upvoted a paper 17 days ago

LongCat-Next: Lexicalizing Modalities as Discrete Tokens

Paper • 2603.27538 • Published 19 days ago • 143

upvoted a paper about 1 month ago

CaTok: Taming Mean Flows for One-Dimensional Causal Image Tokenization

Paper • 2603.06449 • Published Mar 6 • 6

upvoted 2 papers 6 months ago

RoboOmni: Proactive Robot Manipulation in Omni-modal Context

Paper • 2510.23763 • Published Oct 27, 2025 • 62

LIBERO-Plus: In-depth Robustness Analysis of Vision-Language-Action Models

Paper • 2510.13626 • Published Oct 15, 2025 • 47

upvoted a paper 8 months ago

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28, 2025 • 90

updated a model 10 months ago

wjpoom/SPEC-CLIP-ViT-B-32

Updated Jun 16, 2025 • 1

published a model 10 months ago

wjpoom/SPEC-CLIP-ViT-B-32

Updated Jun 16, 2025 • 1

upvoted 2 papers 11 months ago

Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Paper • 2505.18600 • Published May 24, 2025 • 49

CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models

Paper • 2505.12504 • Published May 18, 2025 • 24

upvoted a paper 12 months ago

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

Paper • 2505.03318 • Published May 6, 2025 • 94

authored a paper about 1 year ago

CoMP: Continual Multimodal Pre-training for Vision Foundation Models

Paper • 2503.18931 • Published Mar 24, 2025 • 30

upvoted 3 papers about 1 year ago

CoMP: Continual Multimodal Pre-training for Vision Foundation Models

Paper • 2503.18931 • Published Mar 24, 2025 • 30

World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning

Paper • 2503.10480 • Published Mar 13, 2025 • 57

Unified Reward Model for Multimodal Understanding and Generation

Paper • 2503.05236 • Published Mar 7, 2025 • 124

updated 2 datasets about 1 year ago

Inst-IT/Inst-It-Bench

Viewer • Updated Mar 3, 2025 • 4.07k • 40 • 1

Inst-IT/Inst-It-Dataset

Viewer • Updated Mar 1, 2025 • 72.5k • 44 • 10

updated a Space about 1 year ago

README

🐨

Boosting Multimodal Understanding at Instance-Level

published a Space about 1 year ago

README

🐨

Boosting Multimodal Understanding at Instance-Level

updated a collection about 1 year ago

Inst-IT Models

Collection

A series of LMMs finetuned with the Inst-IT Dataset, skilled in fine-grained image/video understanding at the instance-level. • 2 items • Updated Mar 17, 2025
