FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models Paper • 2501.01986 • Published Dec 30, 2024 • 1
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better Paper • 2404.02241 • Published Apr 2, 2024 • 2
FilMaster: Bridging Cinematic Principles and Generative AI for Automated Film Generation Paper • 2506.18899 • Published Jun 23 • 5
MBQ: Modality-Balanced Quantization for Large Vision-Language Models Paper • 2412.19509 • Published Dec 27, 2024
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification Paper • 2509.15591 • Published 9 days ago • 45
Struct-Bench: A Benchmark for Differentially Private Structured Text Generation Paper • 2509.10696 • Published 15 days ago
Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles Paper • 2505.23590 • Published May 29 • 25
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing Paper • 2505.21600 • Published May 27 • 70
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models Paper • 2503.14827 • Published Mar 19