claudeli1234

AI & ML interests

None yet

Organizations

None yet

claudeli1234's activity

upvoted 10 papers 1 day ago

Interpretable non-linear dimensionality reduction using gaussian weighted linear transformation

Paper • 2504.17601 • Published 3 days ago • 2

TimeChat-Online: 80% Visual Tokens are Naturally Redundant in Streaming Videos

Paper • 2504.17343 • Published 3 days ago • 5

3DV-TON: Textured 3D-Guided Consistent Video Try-on via Diffusion Models

Paper • 2504.17414 • Published 3 days ago • 5

Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Paper • 2504.16064 • Published 5 days ago • 7

Distilling semantically aware orders for autoregressive image generation

Paper • 2504.17069 • Published 4 days ago • 4

DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs

Paper • 2504.17040 • Published 4 days ago • 8

Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models

Paper • 2504.17789 • Published 3 days ago • 12

RefVNLI: Towards Scalable Evaluation of Subject-driven Text-to-image Generation

Paper • 2504.17502 • Published 3 days ago • 51

Step1X-Edit: A Practical Framework for General Image Editing

Paper • 2504.17761 • Published 3 days ago • 68

DreamO: A Unified Framework for Image Customization

Paper • 2504.16915 • Published 4 days ago • 16

upvoted 2 papers 2 days ago

DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning

Paper • 2504.14509 • Published 7 days ago • 43

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Paper • 2504.16030 • Published 5 days ago • 27

upvoted 5 papers 3 days ago

RealisDance-DiT: Simple yet Strong Baseline towards Controllable Character Animation in the Wild

Paper • 2504.14977 • Published 6 days ago • 9

MR. Video: "MapReduce" is the Principle for Long Video Understanding

Paper • 2504.16082 • Published 5 days ago • 5

Vidi: Large Multimodal Models for Video Understanding and Editing

Paper • 2504.15681 • Published 5 days ago • 14

Personalized Text-to-Image Generation with Auto-Regressive Models

Paper • 2504.13162 • Published 10 days ago • 17

Describe Anything: Detailed Localized Image and Video Captioning

Paper • 2504.16072 • Published 5 days ago • 51

upvoted 3 papers 4 days ago

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Paper • 2410.13276 • Published Oct 17, 2024 • 30

Context-Aware Token Selection and Packing for Enhanced Vision Transformer

Paper • 2410.23608 • Published Oct 31, 2024 • 1

Ltri-LLM: Streaming Long Context Inference for LLMs with Training-Free Dynamic Triangular Attention Pattern

Paper • 2412.04757 • Published Dec 6, 2024 • 1