Tags: Text Generation · Transformers · Safetensors · Russian · qwen2 · conversational · text-generation-inference
attn-signs committed · verified · Commit 64c4735 · 1 parent: 0bd4afc

Update README.md

Files changed (1): README.md (+4, -4)
README.md CHANGED
@@ -8,12 +8,12 @@ datasets:
 
 # Watari 32B (V2)
 
-- [EN]
+### [EN]
 A Qwen2.5-based model adapted for Russian text generation tasks.
 - The model has an extended tokenizer and a properly adapted chat template.
 - The model was trained using LoRA adapters.
 - The model was trained in **2 stages**.
-- [RU]
+### [RU]
 A fine-tuned version of Qwen2.5, adapted for Russian text generation.
 - The model has an extended tokenizer and a properly adapted chat template (the issues of the previous version have been fixed).
 - The model was trained using low-rank LoRA adapters.
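The "properly adapted chat template" changed above is the main usage-facing detail: prompts should be built with the tokenizer's chat template rather than formatted by hand. A minimal inference sketch, assuming the standard `transformers` chat-template API and a hypothetical repo id `attn-signs/watari-32b-v2` (the exact id is not stated in this diff):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "attn-signs/watari-32b-v2"  # hypothetical id, not confirmed by this diff

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

# The extended tokenizer ships its own chat template, so we let
# apply_chat_template build the prompt instead of formatting it by hand.
messages = [{"role": "user", "content": "Расскажи коротко о Москве."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```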
@@ -24,7 +24,7 @@ Finetune версия Qwen2.5, адаптированная для генера
 - Watari-32b-v0
 
 ## Model Details
-- [EN]
+### [EN]
 The LoRA supervised fine-tuning was performed on **2x NVIDIA A100** GPUs for **~8 days**.
 **Datasets used:**
 - GrandMaster [Vikhrmodels/GrandMaster-PRO-MAX] (0.6 epochs)
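The hunk above says the model was LoRA fine-tuned, but the adapter configuration is not part of this commit. A sketch of what such an SFT setup typically looks like with `peft`; the rank, alpha, dropout, and target modules below are illustrative assumptions, not the values used for Watari:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Base model is assumed; the README only says "Qwen2.5-based".
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-32B-Instruct", device_map="auto"
)

lora_config = LoraConfig(
    r=32,               # adapter rank -- assumed, not the authors' value
    lora_alpha=64,      # scaling factor -- assumed
    lora_dropout=0.05,  # assumed
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # typical Qwen2 attention projections
    task_type="CAUSAL_LM",
)

# Wrap the frozen base model; only the low-rank adapter weights train.
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```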
@@ -34,7 +34,7 @@ LoRA supervised finetuning version was performed on **2xA100 NVIDIA** GPUs for *
 The model has an extended tokenizer based on an arXiv paper and the work of RefalMachine (RuAdapt / Moscow State University).
 **Huge thanks to Mikhail Tikhomirov for the hard scientific work and the tokenizer extension methods he developed.**
 Generation in Russian is about 60% cheaper and faster thanks to the extended tokenizer (see the research at the end).
-- [RU]
+### [RU]
 The SFT LoRA training was performed on **two NVIDIA A100** GPUs and took about **8 days**.
 **Datasets used:**
 - GrandMaster [Vikhrmodels/GrandMaster-PRO-MAX] (0.6 epochs)
 
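The ~60% cost and speed claim in the last hunk follows from the extended tokenizer producing fewer tokens per Russian word, so the same sentence costs fewer decode steps. One way to sanity-check it is to tokenize identical Russian text with the base Qwen2.5 tokenizer and the extended one; the Watari repo id is again an assumption:

```python
from transformers import AutoTokenizer

base = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")
extended = AutoTokenizer.from_pretrained("attn-signs/watari-32b-v2")  # hypothetical id

text = "Расширенный токенайзер кодирует русский текст заметно короче."
n_base = len(base.encode(text))
n_ext = len(extended.encode(text))

print(f"base: {n_base} tokens, extended: {n_ext} tokens")
print(f"savings: {100 * (1 - n_ext / n_base):.0f}%")
```

Savings vary with the input text; the quoted ~60% presumably refers to the benchmark the README points to at the end.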