Yifei Li

JoeLeelyf

https://joeleelyf.github.io/

AI & ML interests

MLLMs, Deepfake Detection, Computer Vision

Organizations

None yet

Recent Activity

upvoted a paper 9 days ago

CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning

Paper • 2509.22647 • Published 11 days ago • 31

upvoted a paper 13 days ago

SIM-CoT: Supervised Implicit Chain-of-Thought

Paper • 2509.20317 • Published 13 days ago • 38

upvoted 2 papers 2 months ago

SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience

Paper • 2508.04700 • Published Aug 6 • 52

Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

Paper • 2508.00819 • Published Aug 1 • 62

upvoted 2 papers 3 months ago

SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction

Paper • 2507.15852 • Published Jul 21 • 38

ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing

Paper • 2506.19848 • Published Jun 24 • 26

updated a dataset 5 months ago

JoeLeelyf/NeXT-IMDL

Preview • Updated May 15 • 6

published a dataset 5 months ago

JoeLeelyf/NeXT-IMDL

Preview • Updated May 15 • 6

upvoted a paper 6 months ago

MM-IFEngine: Towards Multimodal Instruction Following

Paper • 2504.07957 • Published Apr 10 • 35

updated a dataset 7 months ago

JoeLeelyf/OVO-Bench

Viewer • Updated Mar 23 • 2.06k • 1.96k • 8

New activity in JoeLeelyf/OVO-Bench 7 months ago

Maybe an error in "realtime"

#3 opened 8 months ago by gogorunrun

upvoted a paper 7 months ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 74

liked a dataset 7 months ago

THUdyh/Ola-Data

Viewer • Updated Feb 24 • 363k • 312 • 8

upvoted a paper 8 months ago

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Paper • 2502.13128 • Published Feb 18 • 41

liked a dataset 8 months ago

HuggingFaceFV/finevideo

Viewer • Updated Dec 16, 2024 • 39.5k • 12.6k • 328

liked a model 8 months ago

parler-tts/parler-tts-large-v1

Text-to-Speech • 2B • Updated Nov 22, 2024 • 221k • 261

upvoted 2 papers 8 months ago

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Paper • 2502.05173 • Published Feb 7 • 65

Redundancy Principles for MLLMs Benchmarks

Paper • 2501.13953 • Published Jan 20 • 29

New activity in JoeLeelyf/OVO-Bench 9 months ago

Add task category, paper, code and project page link

#2 opened 9 months ago by nielsr

authored a paper 9 months ago

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Paper • 2501.05510 • Published Jan 9 • 43