liuzuyan

Zuyan

AI & ML interests

None yet

Recent Activity

authored a paper 5 days ago

Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment

upvoted a paper 5 days ago

Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment

commented on a paper 5 days ago

Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment

Organizations

None yet

Zuyan's activity

upvoted a paper 5 days ago

Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment

Paper • 2502.04328 • Published 5 days ago • 19

upvoted 2 collections 12 days ago

Oryx

Oryx: One Multi-Modal LLM for On-Demand Spatial-Temporal Understanding • 6 items • Updated Dec 11, 2024 • 16

Oryx-1.5

Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution • 4 items • Updated 27 days ago • 5

upvoted a paper about 2 months ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published Dec 13, 2024 • 139

upvoted a paper 2 months ago

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Paper • 2412.09596 • Published Dec 12, 2024 • 94

upvoted 2 papers 3 months ago

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Paper • 2411.14432 • Published Nov 21, 2024 • 23

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Paper • 2411.02265 • Published Nov 4, 2024 • 24

upvoted 3 papers 5 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 106

DC-Solver: Improving Predictor-Corrector Diffusion Sampler via Dynamic Compensation

Paper • 2409.03755 • Published Sep 5, 2024 • 3

Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution

Paper • 2409.12961 • Published Sep 19, 2024 • 25

upvoted a paper 6 months ago

Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model

Paper • 2408.00754 • Published Aug 1, 2024 • 22

upvoted a paper 7 months ago

Efficient Inference of Vision Instruction-Following Models with Elastic Cache

Paper • 2407.18121 • Published Jul 25, 2024 • 17