Seeing Voices: Generating A-Roll Video from Audio with Mirage Paper • 2506.08279 • Published Jun 9 • 28
meta-llama/Llama-4-Maverick-17B-128E-Instruct Image-Text-to-Text • 402B • Updated May 22 • 31.7k • 377
meta-llama/Llama-4-Scout-17B-16E-Instruct Image-Text-to-Text • 109B • Updated May 22 • 552k • 1k
Efficient LLaMA-3.2-Vision by Trimming Cross-attended Visual Features Paper • 2504.00557 • Published Apr 1 • 15