Audio - a CCMat Collection

Audio

updated May 14, 2024

BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data

Paper • 2402.08093 • Published Feb 12, 2024