Text to Speech (TTS) Collection • Text to Speech (TTS) models compatible with txtai's TextToSpeech pipeline • 7 items • Updated 4 days ago • 6
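Since this collection targets txtai's TextToSpeech pipeline, a minimal usage sketch may help. This is an illustration based on txtai's public pipeline API, not part of the collection itself; the default model and the return format vary by txtai version, so the code handles both shapes defensively, and the 22050 Hz fallback rate is an assumption to check against the model card.

    from txtai.pipeline import TextToSpeech
    import soundfile as sf

    # Load the text-to-speech pipeline; with no arguments txtai picks its
    # default TTS model. A model from this collection can be passed by Hub id.
    tts = TextToSpeech()

    # Synthesize speech. Depending on the txtai version, this returns raw
    # audio samples or an (audio, sample rate) tuple.
    result = tts("Text to speech with txtai")
    audio, rate = result if isinstance(result, tuple) else (result, 22050)

    # Write the audio to disk.
    sf.write("speech.wav", audio, rate)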
Emilia: A Large-Scale, Extensive, Multilingual, and Diverse Dataset for Speech Generation Paper • 2501.15907 • Published 3 days ago • 14
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 14 days ago • 66
Sound Datasets Collection • Sound datasets for ASR/ASV and other audio tasks • 12 items • Updated Aug 28, 2024 • 1
Tackling the Generative Learning Trilemma with Denoising Diffusion GANs Paper • 2112.07804 • Published Dec 15, 2021 • 1
ModernBERT Collection • Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 129
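As a quick orientation for the ModernBERT collection, here is a hedged sketch of loading the base checkpoint as a masked language model with Hugging Face transformers. It assumes a transformers release recent enough to include the ModernBERT architecture (late 2024 or newer); answerdotai/ModernBERT-base is the base model from this collection.

    from transformers import pipeline

    # Fill-mask with ModernBERT; requires a transformers version that
    # ships ModernBERT support.
    fill = pipeline("fill-mask", model="answerdotai/ModernBERT-base")

    # ModernBERT keeps the BERT-style [MASK] token.
    for pred in fill("The capital of France is [MASK]."):
        print(pred["token_str"], pred["score"])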
Continuous Autoregressive Models with Noise Augmentation Avoid Error Accumulation Paper • 2411.18447 • Published Nov 27, 2024 • 2
Scaling Transformers for Low-Bitrate High-Quality Speech Coding Paper • 2411.19842 • Published Nov 29, 2024 • 11
Cosmos Tokenizer Collection • A suite of image and video tokenizers • 13 items • Updated 14 days ago • 37
Molmo Collection • Artifacts for open multimodal language models • 5 items • Updated 24 days ago • 293
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis Paper • 2410.23320 • Published Oct 30, 2024 • 8
Jetfire: Efficient and Accurate Transformer Pretraining with INT8 Data Flow and Per-Block Quantization Paper • 2403.12422 • Published Mar 19, 2024 • 1
Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models Paper • 2410.11081 • Published Oct 14, 2024 • 19
BigVGAN: A Universal Neural Vocoder with Large-Scale Training Paper • 2206.04658 • Published Jun 9, 2022 • 3
Moshi v0.1 Release Collection • MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 225
Parallelizing Linear Transformers with the Delta Rule over Sequence Length Paper • 2406.06484 • Published Jun 10, 2024 • 3
Gated Linear Attention Transformers with Hardware-Efficient Training Paper • 2312.06635 • Published Dec 11, 2023 • 6