---
license: gemma
license_name: license
license_link: LICENSE
metrics:
- bleu
- comet
base_model:
- ModelSpace/GemmaX2-28-2B-Pretrain
pipeline_tag: translation
library_name: transformers
language:
- ar
- bn
- cs
- de
- en
- es
- fa
- fr
- he
- hi
- id
- it
- ja
- km
- ko
- lo
- ms
- my
- nl
- pl
- pt
- ru
- th
- tl
- tr
- ur
- vi
- zh
---
# Model Card for GemmaX2-28
## Model Details
### Model Description
GemmaX2-28-2B-Pretrain is a language model obtained by continual pretraining of Gemma2-2B on a mix of 56 billion tokens of monolingual and parallel data in 28 languages: Arabic, Bengali, Czech, German, English, Spanish, Persian, French, Hebrew, Hindi, Indonesian, Italian, Japanese, Khmer, Korean, Lao, Malay, Burmese, Dutch, Polish, Portuguese, Russian, Thai, Tagalog, Turkish, Urdu, Vietnamese, and Chinese.
GemmaX2-28-2B-v0.1 is the version of GemmaX2-28-2B-Pretrain obtained after supervised fine-tuning (SFT).
- **Developed by:** Xiaomi
- **Model type:** A 2B-parameter model based on Gemma2. GemmaX2-28-2B-Pretrain was obtained by continued pretraining on a large amount of monolingual and parallel data. GemmaX2-28-2B-v0.1 was then derived from it through supervised fine-tuning on a small set of high-quality instruction data.
- **Language(s):** Arabic, Bengali, Czech, German, English, Spanish, Persian, French, Hebrew, Hindi, Indonesian, Italian, Japanese, Khmer, Korean, Lao, Malay, Burmese, Dutch, Polish, Portuguese, Russian, Thai, Tagalog, Turkish, Urdu, Vietnamese, Chinese.
- **License:** gemma
### Model Source
- **Paper:** [Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study](https://arxiv.org/pdf/2502.02481)
### Model Performance
*Figure: Experimental Result*
## Run the model
```python
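from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the fine-tuned translation model from the Hub.
model_id = "ModelSpace/GemmaX2-28-2B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The prompt names the source and target languages, then gives the source text.
text = "Translate this from Chinese to English:\nChinese: 我爱机器翻译\nEnglish:"
inputs = tokenizer(text, return_tensors="pt")

# Generate up to 50 new tokens; the decoded string contains the prompt followed by the translation.
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))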
```
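The prompt in the example above follows a fixed pattern, so it can be built for other language pairs with a small helper. The following is a minimal sketch, not part of the released code: `build_prompt` and `SUPPORTED_LANGUAGES` are hypothetical names, and applying the Chinese-to-English template unchanged to the other supported languages is an assumption based on that single example.

```python
# Hypothetical helper: builds a prompt in the same shape as the
# Chinese-to-English example. Reusing this template for other language
# pairs is an assumption, not something the model card confirms.

# The 28 languages listed in the model description.
SUPPORTED_LANGUAGES = {
    "Arabic", "Bengali", "Czech", "German", "English", "Spanish", "Persian",
    "French", "Hebrew", "Hindi", "Indonesian", "Italian", "Japanese", "Khmer",
    "Korean", "Lao", "Malay", "Burmese", "Dutch", "Polish", "Portuguese",
    "Russian", "Thai", "Tagalog", "Turkish", "Urdu", "Vietnamese", "Chinese",
}


def build_prompt(source_lang: str, target_lang: str, text: str) -> str:
    """Return a translation prompt, rejecting languages outside the 28 listed."""
    for lang in (source_lang, target_lang):
        if lang not in SUPPORTED_LANGUAGES:
            raise ValueError(f"Unsupported language: {lang}")
    return (
        f"Translate this from {source_lang} to {target_lang}:\n"
        f"{source_lang}: {text}\n"
        f"{target_lang}:"
    )


prompt = build_prompt("Chinese", "English", "我爱机器翻译")
print(prompt)
```

The resulting string can be passed to `tokenizer(...)` in place of the hard-coded `text` in the example above. Checking the language names up front avoids silently sending the model a pair it was not trained on.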
## Citation
```bibtex
@misc{cui2025multilingualmachinetranslationopen,
title={Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study},
author={Menglong Cui and Pengzhi Gao and Wei Liu and Jian Luan and Bin Wang},
year={2025},
eprint={2502.02481},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2502.02481},
}
```
## Limitations
GemmaX2-28-2B-v0.1 supports only the 28 most commonly used languages and does not guarantee strong translation performance for other languages. We will continue to improve GemmaX2-28-2B's translation performance, and future models will be released in due course.