Part of the Speech Tokenizer collection: multilingual discrete speech tokenizer for LLM.
Adds a convolution layer with stride 2 to the Whisper encoder output, halving the frame rate to 25 tokens per second (TPS). This model is a base for later introducing vector quantization (VQ) on the projection layer.
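The 25 TPS figure follows from the standard 1-D convolution output-length formula. The sketch below is only an arithmetic check, not the model's actual layer definition: it assumes the usual Whisper encoder output of 1500 frames per 30-second window, and the kernel size of 3 and padding of 1 are hypothetical values chosen for illustration. Only the stride of 2 comes from the description above.

```python
def conv1d_output_length(length_in, kernel_size, stride, padding=0, dilation=1):
    """Output length of a 1-D convolution (same formula as used by common DL frameworks)."""
    return (length_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


# Assumption: Whisper encodes a 30-second window into 1500 frames (50 per second).
WHISPER_WINDOW_SECONDS = 30
WHISPER_ENCODER_FRAMES = 1500

# Assumption: kernel_size=3 and padding=1 are illustrative; only stride=2 is from the model card.
frames_out = conv1d_output_length(
    WHISPER_ENCODER_FRAMES, kernel_size=3, stride=2, padding=1
)
tokens_per_second = frames_out / WHISPER_WINDOW_SECONDS

print(frames_out, tokens_per_second)
```

Under these assumptions the stride-2 layer halves the sequence length, so any kernel and padding combination that preserves length at stride 1 gives the same rate at stride 2.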
WandB at https://wandb.ai/huseinzol05/whisperconv?nw=nwuserhuseinzol05
Evaluated on malaysia-ai/common_voice_17_0/test, with some conditions, using the decoder prompt
<|startoftranscript|><|{lang}|><|transcribe|><|notimestamps|>.
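As a rough illustration of how the prompt template above is filled in per language, the helper below substitutes a language code into the `{lang}` slot. The function name `build_decoder_prompt` is hypothetical and is not taken from the linked source code; the language codes shown are examples only.

```python
def build_decoder_prompt(lang):
    """Fill the {lang} slot of the evaluation prompt with a language code (hypothetical helper)."""
    return f"<|startoftranscript|><|{lang}|><|transcribe|><|notimestamps|>"


# Example language codes; the set actually evaluated is not stated here.
for code in ("ms", "en"):
    print(build_decoder_prompt(code))
```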
Source code at https://github.com/mesolitica/malaya-speech/tree/master/session/whisper-conv