yongxinzhu

youngsheen

https://youngsheen.github.io/

authored 2 papers 8 months ago

Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective

Paper • 2410.12490 • Published Oct 16, 2024 • 8

Addressing Representation Collapse in Vector Quantized Models with One Linear Layer

Paper • 2411.02038 • Published Nov 4, 2024

authored 7 papers 12 months ago

Difformer: Empowering Diffusion Models on the Embedding Space for Text Generation

Paper • 2212.09412 • Published Dec 19, 2022 • 1

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Paper • 2406.07476 • Published Jun 11, 2024 • 38

Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction

Paper • 2406.12707 • Published Jun 18, 2024

Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation

Paper • 2205.10884 • Published May 22, 2022

Locate Then Generate: Bridging Vision and Language with Bounding Box for Scene-Text VQA

Paper • 2304.01603 • Published Apr 4, 2023

DiffS2UT: A Semantic Preserving Diffusion Model for Textless Direct Speech-to-Speech Translation

Paper • 2310.17570 • Published Oct 26, 2023

Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer

Paper • 2406.00976 • Published Jun 3, 2024