HKUST Audio

non-profit

http://wei-xue.com

wxue_audio

Recent Activity

DangeZy updated a dataset 2 days ago

HKUSTAudio/Audio-FLAN-Dataset

lmxue updated a dataset 3 days ago

HKUSTAudio/Audio-FLAN-Dataset

Zeyue7 updated a model 13 days ago

HKUSTAudio/AudioX

HKUSTAudio's activity

DangeZy

updated a dataset 2 days ago

HKUSTAudio/Audio-FLAN-Dataset

Preview • Updated 2 days ago • 1.09k • 28

lmxue

updated a dataset 3 days ago

HKUSTAudio/Audio-FLAN-Dataset

Preview • Updated 2 days ago • 1.09k • 28

Zeyue7

updated a model 13 days ago

HKUSTAudio/AudioX

Updated 13 days ago • 1.02k • 28

Zeyue7

published a model 13 days ago

HKUSTAudio/AudioX

Updated 13 days ago • 1.02k • 28

Zeyue7

authored 10 papers about 1 month ago

ChatMusician: Understanding and Generating Music Intrinsically with LLM

Paper • 2402.16153 • Published Feb 25, 2024 • 61

Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners

Paper • 2402.17723 • Published Feb 27, 2024 • 16

ComposerX: Multi-Agent Symbolic Music Composition with LLMs

Paper • 2404.18081 • Published Apr 28, 2024 • 2

Mixed Neural Voxels for Fast Multi-view Video Synthesis

Paper • 2212.00190 • Published Dec 1, 2022

MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions

Paper • 2407.20962 • Published Jul 30, 2024

Foundation Models for Music: A Survey

Paper • 2408.14340 • Published Aug 26, 2024 • 45

Audio-FLAN: A Preliminary Release

Paper • 2502.16584 • Published Feb 23 • 36

VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling

Paper • 2406.04321 • Published Jun 6, 2024 • 1

YuE: Scaling Open Foundation Models for Long-Form Music Generation

Paper • 2503.08638 • Published Mar 11 • 62

AudioX: Diffusion Transformer for Anything-to-Audio Generation

Paper • 2503.10522 • Published Mar 13 • 22

lmxue

authored a paper about 1 month ago

Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

Paper • 2503.01710 • Published Mar 3 • 5

Xinsheng-Wang

authored a paper about 1 month ago

Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens

Paper • 2503.01710 • Published Mar 3 • 5

a43992899

authored 4 papers about 1 month ago

Chinese Open Instruction Generalist: A Preliminary Release

Paper • 2304.07987 • Published Apr 17, 2023 • 2

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Paper • 2311.16502 • Published Nov 27, 2023 • 35

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Paper • 2306.17103 • Published Jun 29, 2023 • 1

CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models

Paper • 2402.13109 • Published Feb 20, 2024

Team members 8
