Zhongpai Gao

gaozhongpai

AI & ML interests

3D computer vision

Recent Activity

upvoted a paper 3 days ago

Durian: Dual Reference-guided Portrait Animation with Attribute Transfer

Paper • 2509.04434 • Published 4 days ago • 7

upvoted a paper 5 days ago

FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Paper • 2412.01064 • Published Dec 2, 2024 • 46

liked a Space 6 days ago

190

IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

🎙

Generate audio from text using a reference audio sample

upvoted a paper 7 days ago

TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis

Paper • 2508.13618 • Published 20 days ago • 17

upvoted a paper 10 days ago

Multi-View 3D Point Tracking

Paper • 2508.21060 • Published 11 days ago • 20

upvoted 2 papers 11 days ago

Gaze into the Heart: A Multi-View Video Dataset for rPPG and Health Biomarkers Estimation

Paper • 2508.17924 • Published 14 days ago • 14

MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation

Paper • 2508.19320 • Published 13 days ago • 27

liked a model 19 days ago

hustvl/vavae-imagenet256-f16d32-dinov2

Text-to-Image • Updated Feb 17 • 5

upvoted a paper 20 days ago

4DNeX: Feed-Forward 4D Generative Modeling Made Easy

Paper • 2508.13154 • Published 21 days ago • 58

upvoted a paper 21 days ago

FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation

Paper • 2508.11255 • Published 24 days ago • 10

upvoted 4 papers 25 days ago

DisTime: Distribution-based Time Representation for Video Large Language Models

Paper • 2505.24329 • Published May 30 • 1

DeepVideo-R1: Video Reinforcement Fine-Tuning via Difficulty-aware Regressive GRPO

Paper • 2506.07464 • Published Jun 9 • 14

VideoITG: Multimodal Video Understanding with Instructed Temporal Grounding

Paper • 2507.13353 • Published Jul 17 • 1

LLaVA-Scissor: Token Compression with Semantic Connected Components for Video LLMs

Paper • 2506.21862 • Published Jun 27 • 36

upvoted 2 papers about 1 month ago

DreamVVT: Mastering Realistic Video Virtual Try-On in the Wild via a Stage-Wise Diffusion Transformer Framework

Paper • 2508.02807 • Published Aug 4 • 13

Phi-Ground Tech Report: Advancing Perception in GUI Grounding

Paper • 2507.23779 • Published Jul 31 • 44

upvoted 3 papers about 2 months ago

MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second

Paper • 2507.10065 • Published Jul 14 • 24

SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation

Paper • 2507.09862 • Published Jul 14 • 49

CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering

Paper • 2507.08776 • Published Jul 11 • 54

upvoted a paper 2 months ago

4DSloMo: 4D Reconstruction for High Speed Scene with Asynchronous Capture

Paper • 2507.05163 • Published Jul 7 • 41
