
ZhenyuLiu

foggyforest

AI & ML interests

None yet

Recent Activity

authored a paper about 16 hours ago
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models
upvoted a paper about 17 hours ago
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models
commented on a paper about 17 hours ago
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

Organizations

None yet

Collections 2

vllm data
  • PiTe: Pixel-Temporal Alignment for Large Video-Language Model

    Paper • 2409.07239 • Published Sep 11, 2024 • 14
speech
  • LLaMA-Omni: Seamless Speech Interaction with Large Language Models

    Paper • 2409.06666 • Published Sep 10, 2024 • 58

Papers 8

arxiv:2505.04921
arxiv:2502.19917
arxiv:2501.01028
arxiv:2410.10293

models 2

foggyforest/Qwen2-VL-2B-ViSA-80K

Image-Text-to-Text • Updated 29 days ago • 3

foggyforest/Qwen2-VL-2B-Instruction-ViSA-700K

Image-Text-to-Text • Updated Apr 1 • 4

datasets 2

foggyforest/ViSA_LlavaOV_80K

Viewer • Updated Apr 7 • 86.4k • 68

foggyforest/ViSA_LlavaOV_700K

Viewer • Updated Apr 7 • 694k • 1.27k