Update README.md
README.md CHANGED
@@ -8,12 +8,12 @@ datasets:
 
 # Watari 32B (V2)
 
-
+### [EN]
 A Qwen2.5-based model adapted for Russian text generation tasks.
 - The model has an extended tokenizer and a properly adapted chat template.
 - The model was trained using LoRA adapters.
 - The model was trained in **2 stages**.
-
+### [RU]
 A fine-tuned version of Qwen2.5, adapted for Russian text generation.
 - The model has an extended tokenizer and a correctly adapted chat template (earlier errors have been fixed).
 - The model was trained using low-rank LoRA adapters.
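Because the adapted chat template ships with the tokenizer, the standard `transformers` pattern should apply. A minimal usage sketch, assuming the model is published on the Hugging Face Hub; `ORG/Watari-32B-v2` is a placeholder repo id, not the real path:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ORG/Watari-32B-v2"  # placeholder: substitute the actual repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain briefly what LoRA is."},
]
# The adapted chat template is stored with the tokenizer, so this renders
# the exact prompt format the model was fine-tuned on.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```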
@@ -24,7 +24,7 @@ A fine-tuned version of Qwen2.5, adapted for Russian text generation.
 - Watari-32b-v0
 
 ## Model Details / Детализация модели
-
+### [EN]
 LoRA supervised fine-tuning was performed on **2x NVIDIA A100** GPUs for **~8 days**.
 **Datasets used:**
 - GrandMaster [Vikhrmodels/GrandMaster-PRO-MAX] (0.6 epochs)
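The card does not list the LoRA hyperparameters or the exact starting checkpoint, so the values below are illustrative only. A sketch of the kind of `peft` setup this SFT stage implies, assuming a stock Qwen2.5 32B base:

```python
# Illustrative LoRA SFT setup; the rank, alpha, dropout, target modules,
# and base checkpoint are assumptions, not the values used for Watari.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-32B-Instruct")

lora_cfg = LoraConfig(
    r=16,                  # assumed adapter rank
    lora_alpha=32,         # assumed scaling factor
    lora_dropout=0.05,     # assumed dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed targets
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Training only the low-rank adapters is what makes an 8-day run on two A100s feasible for a 32B model.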
@@ -34,7 +34,7 @@ LoRA supervised fine-tuning was performed on **2x NVIDIA A100** GPUs for **~8 days**.
 The model has an extended tokenizer based on the arXiv paper and the work of RefalMachine (RuAdapt / Moscow State University).
 **Huge thanks to Mikhail Tikhomirov for the rigorous scientific work and the tokenizer extension methods he developed.**
 Generation in Russian is 60% cheaper and faster thanks to the extended tokenizer (see the research at the end).
-
+### [RU]
 SFT LoRA training was run on **two NVIDIA A100** GPUs and took about **8 days**.
 **Datasets used:**
 - GrandMaster [Vikhrmodels/GrandMaster-PRO-MAX] (0.6 epochs)
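The 60% cost claim comes down to token counts: a sketch of how to sanity-check it against the base Qwen2.5 tokenizer, with `ORG/Watari-32B-v2` again a placeholder repo id:

```python
# Rough check of the token-savings claim: count tokens for the same
# Russian text under the base and extended tokenizers.
from transformers import AutoTokenizer

base_tok = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")
ext_tok = AutoTokenizer.from_pretrained("ORG/Watari-32B-v2")  # placeholder id

text = "Расширенный токенайзер кодирует русский текст заметно короче."
n_base = len(base_tok(text)["input_ids"])
n_ext = len(ext_tok(text)["input_ids"])
print(f"base: {n_base} tokens, extended: {n_ext} tokens")
```

Fewer tokens for the same text translates directly into proportionally lower latency and per-token cost at generation time.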