TaskCraft: Automated Generation of Agentic Tasks Paper • 2506.10055 • Published 8 days ago • 25
Financial Analyst AI: Analyze financial texts with speech recognition, summarization, and entity extraction • Running • 85
EasyText: Controllable Diffusion Transformer for Multilingual Text Rendering Paper • 2505.24417 • Published 20 days ago • 13
Alchemist: Turning Public Text-to-Image Data into Generative Gold Paper • 2505.19297 • Published 24 days ago • 75
TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action Paper • 2505.01583 • Published May 2 • 9
YoChameleon: Personalized Vision and Language Generation Paper • 2504.20998 • Published Apr 29 • 11
Yolo Logo Detection: Logo detection using YOLOv7 with LogoDet-3K and Flickr Logos • No application file
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model Paper • 2504.07615 • Published Apr 10 • 32
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 128
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation Paper • 2504.02160 • Published Apr 2 • 37
SkyReels-A2: Compose Anything in Video Diffusion Transformers Paper • 2504.02436 • Published Apr 3 • 37
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step Paper • 2504.01956 • Published Apr 2 • 40
BizGen: Advancing Article-level Visual Text Rendering for Infographics Generation Paper • 2503.20672 • Published Mar 26 • 14