Hinglish TTS 3B Model

This is a fine-tuned version of canopylabs/3b-hi-pretrain-research_release specialized for Hinglish (Hindi-English mixed) text-to-speech generation.

Model Details

  • Base Model: canopylabs/3b-hi-pretrain-research_release
  • Fine-tuning Method: LoRA with Unsloth (merged)
  • Languages: Hindi, English, Hinglish
  • Task: Text-to-Speech via audio token generation
  • Model Size: ~3B parameters

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "Aaryan39/hinglish-tts-3b-ft-both"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto"
)

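# Generate audio tokens for a Hinglish prompt
prompt = "Hello doston, main aapka dost hun"
# Move the inputs to the device the model was placed on by device_map="auto"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1200)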

Fine-tuning Details

  • LoRA Rank: 64
  • LoRA Alpha: 64
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Training Framework: Unsloth
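
As a rough illustration of what these settings imply, the sketch below collects them in a plain dictionary and computes the LoRA scaling factor. The key names mirror the `r`, `lora_alpha`, and `target_modules` arguments of PEFT's `LoraConfig`. The dictionary itself and the `lora_alpha / r` scaling (the standard, non-rank-stabilized LoRA formulation) are assumptions for illustration, not the original training script.

```python
# Hyperparameters as listed on this card; the dict is a hypothetical
# container whose keys follow PEFT's LoraConfig argument names.
lora_settings = {
    "r": 64,
    "lora_alpha": 64,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# Standard LoRA scales the low-rank update by lora_alpha / r
# (assumes rank-stabilized LoRA was not enabled).
scaling = lora_settings["lora_alpha"] / lora_settings["r"]
print(f"LoRA scaling factor: {scaling}")
```

With rank and alpha both set to 64, the low-rank update is applied at a scale of 1.0, so the adapter output is added to the frozen weights without extra amplification or damping.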

Audio Generation

This model generates audio tokens that need to be decoded using a SNAC (Scalable Neural Audio Codec) model:

from snac import SNAC

# Load SNAC decoder
snac_model = SNAC.from_pretrained("hubertsiuzdak/snac_24khz")

# Process generated tokens to audio codes and decode
# (See full implementation in the original training code)
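
The token-to-code step is not spelled out on this card, so the following is only a sketch under stated assumptions: that each audio frame consists of 7 consecutive codes, that the code at position p within a frame carries an offset of p * 4096, and that the codes are regrouped into three SNAC codebook layers in a 1/2/4 pattern. The function name `split_into_snac_layers` is hypothetical. Mapping generated token ids to code values depends on the tokenizer's audio-token offset, which is not stated here, so the function expects that offset to have been subtracted already.

```python
# Assumed layout (not confirmed by this card): 7 codes per frame,
# position p in a frame is offset by p * 4096.
CODES_PER_FRAME = 7
CODEBOOK_SIZE = 4096


def split_into_snac_layers(codes):
    """Regroup a flat list of frame codes into three SNAC codebook layers."""
    n_frames = len(codes) // CODES_PER_FRAME  # ignore a trailing partial frame
    layer_1, layer_2, layer_3 = [], [], []
    for f in range(n_frames):
        frame = codes[f * CODES_PER_FRAME:(f + 1) * CODES_PER_FRAME]
        # Remove the per-position offset so each value lies in [0, 4096)
        c = [value - pos * CODEBOOK_SIZE for pos, value in enumerate(frame)]
        layer_1.append(c[0])                      # 1 coarse code per frame
        layer_2.extend([c[1], c[4]])              # 2 mid-level codes per frame
        layer_3.extend([c[2], c[3], c[5], c[6]])  # 4 fine codes per frame
    return layer_1, layer_2, layer_3
```

Each returned list could then be wrapped as a tensor with a leading batch dimension (for example `torch.tensor(layer_1).unsqueeze(0)`), and the three tensors passed together as a list to `snac_model.decode(...)` to obtain a 24 kHz waveform.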

Limitations

  • Requires SNAC model for audio generation
  • Optimized for Hinglish content
  • Output quality may be lower on pure English or pure Hindi input

Citation

If you use this model, please cite the original base model:

@misc{canopylabs-3b-hi,
  title={3B Hindi Pretrained Model},
  author={Canopy Labs},
  year={2024},
  url={https://huggingface.co/canopylabs/3b-hi-pretrain-research_release}
}