NanoT5 Base Malaysian Translation V2.1

Fine-tuned from https://huggingface.co/mesolitica/nanot5-base-malaysian-cased with a 2048-token context length on 9B tokens of translation data.

  • This model can translate localized text into standard text.
  • This model can also translate in reverse, from standard text to localized text, which makes it suitable for text augmentation.
  • This model can translate code.
  • This model handles code-switching natively.
  • This model should preserve \n, \t, and \r as they are.
  • Better translation of science and math content compared to V2.
  • Better Manglish translation compared to V2.
  • Better Cantonese translation compared to V2.
  • Better Tamil and Tanglish translation compared to V2.
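
The capabilities above can be exercised with a short script. What follows is a minimal sketch, not the authors' reference usage: it assumes the checkpoint loads with the standard `transformers` T5 classes, and the task prefix `terjemah ke Melayu: ` is an assumption that this card does not confirm. The helper names (`build_prompt`, `control_counts`, `preserves_whitespace`, `translate`) are hypothetical and exist only for illustration.

```python
# Sketch only: the task prefix below is an assumption, not confirmed by this card.
MODEL_ID = "mesolitica/nanot5-base-malaysian-translation-v2.1"
ASSUMED_PREFIX = "terjemah ke Melayu: "  # hypothetical task prefix
MAX_TOKENS = 2048  # context length used for fine-tuning, per this card


def build_prompt(text, prefix=ASSUMED_PREFIX):
    """Prepend the (assumed) task prefix without modifying the text itself."""
    return prefix + text


def control_counts(text):
    """Count the control characters the card says should be preserved."""
    return {ch: text.count(ch) for ch in ("\n", "\t", "\r")}


def preserves_whitespace(source, translated):
    """True when both strings hold the same number of newlines, tabs and carriage returns."""
    return control_counts(source) == control_counts(translated)


def translate(text, max_new_tokens=256):
    """Load the checkpoint and translate one string (downloads the model)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = T5ForConditionalGeneration.from_pretrained(MODEL_ID)
    inputs = tokenizer(
        build_prompt(text),
        return_tensors="pt",
        truncation=True,       # keep the input within the fine-tuning context
        max_length=MAX_TOKENS,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `translate("...")` and then `preserves_whitespace(source, result)` gives a quick check of the \n, \t, \r behaviour described above. Whether decoding with `skip_special_tokens=True` keeps those characters depends on how the tokenizer represents them, so treat a failed check as a prompt to inspect the raw output rather than as proof of a model error.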

Training logs are on Weights & Biases at https://wandb.ai/huseinzol05/nanot5-base-malaysian-cased-translation-v6-multipack-post

Model size: 248M params (Safetensors, F32 tensors)