Ye Fang PRO

aleafy

AI & ML interests

None yet

Organizations

None yet

aleafy's activity

upvoted a paper 5 days ago

MM-IFEngine: Towards Multimodal Instruction Following

Paper • 2504.07957 • Published 6 days ago • 30

upvoted a paper 6 days ago

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography

Paper • 2504.07083 • Published 7 days ago • 21

updated a Space 9 days ago

RelightVid

🎥

Generate relit videos from foreground and background inputs

liked a Space 9 days ago

3DGen Arena

🐠

Generate 3D models from text or images

updated a model 14 days ago

aleafy/relightvid_models

Updated 14 days ago • 24

authored 3 papers 14 days ago

Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases

Paper • 2312.15011 • Published Dec 22, 2023 • 18

GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models

Paper • 2501.01428 • Published Jan 2

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting

Paper • 2501.16330 • Published Jan 27 • 2

upvoted a paper 14 days ago

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting

Paper • 2501.16330 • Published Jan 27 • 2

published a model 14 days ago

aleafy/relightvid_models

Updated 14 days ago • 24

liked a Space about 1 month ago

RelightVid

🎥

Generate relit videos from foreground and background inputs

published a Space about 1 month ago

RelightVid

🎥

Generate relit videos from foreground and background inputs

updated a model about 1 month ago

aleafy/RelightVid

Updated Mar 3

published a model about 1 month ago

aleafy/RelightVid

Updated Mar 3

upvoted a paper about 2 months ago

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Paper • 2502.18411 • Published Feb 25 • 73

upvoted a paper 3 months ago

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

Paper • 2501.03226 • Published Jan 6 • 45

upvoted 2 papers 4 months ago

FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models

Paper • 2412.07674 • Published Dec 10, 2024 • 20

X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models

Paper • 2412.01824 • Published Dec 2, 2024 • 66

upvoted 2 papers 6 months ago

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Paper • 2410.17637 • Published Oct 23, 2024 • 37

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction

Paper • 2410.17247 • Published Oct 22, 2024 • 48