Update README.md
README.md CHANGED
---
license: apache-2.0
datasets:
- Vikhrmodels/GrandMaster-PRO-MAX
- SubMaroon/DTF_Comments_Responses_Counts
language:
- ru
base_model:
- chameleon-lizard/Qwen-2.5-7B-DTF
pipeline_tag: text-generation
---

An SFT version of the chameleon-lizard/Qwen-2.5-7B-DTF model, fine-tuned with Unsloth's low-rank adaptation (LoRA). Training was carried out on [Vikhrmodels/GrandMaster-PRO-MAX](https://huggingface.co/datasets/Vikhrmodels/GrandMaster-PRO-MAX) and on a subset of [SubMaroon/DTF_Comments_Responses_Counts](https://huggingface.co/datasets/SubMaroon/DTF_Comments_Responses_Counts). The LoRA adapter is already merged into the model.
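Since the adapter is merged, the checkpoint can be loaded like any standard `transformers` causal LM. A minimal usage sketch (the repository id and the prompt are placeholders, not taken from this card):

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder: replace with this model's actual Hub repository id.
model_id = "chameleon-lizard/Qwen2.5-7B-DTF-SFT"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example Russian prompt, since the model is tuned on Russian data.
messages = [{"role": "user", "content": "Привет! Расскажи, что такое DTF."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```
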
For fine-tuning, we added DTF posts, response comments, and child comments amounting to roughly 20% of the size of Vikhrmodels/GrandMaster-PRO-MAX; a rough estimate of the combined dataset is 125M tokens.

LoRA hyperparameters:

```
r=32
target_modules=[
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "down_proj",
]
lora_alpha=16
lora_dropout=0
bias="none"
use_gradient_checkpointing='unsloth'
use_rslora=True
random_state=42
```
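For reference, a sketch of how these values would be passed to Unsloth's `FastLanguageModel.get_peft_model`; the loading call, `max_seq_length`, and `dtype` below are assumptions, not values stated in this card:

```
from unsloth import FastLanguageModel

# Assumed loading call; max_seq_length and dtype are not stated in this card.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="chameleon-lizard/Qwen-2.5-7B-DTF",
    max_seq_length=4096,
    dtype=None,            # let Unsloth pick (bf16 on supported GPUs)
    load_in_4bit=False,
)

# LoRA wrapper using the hyperparameters listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    use_rslora=True,
    random_state=42,
)
```
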
Training hyperparameters:

```
num_train_epochs=2
train_batch_size=1
gradient_accumulation_steps=128
gradient_checkpointing=False
optim="adamw_8bit"
weight_decay=4e-2
bf16=True
learning_rate=5e-5
lr_scheduler_type="cosine"
packing=True
seed=42
```
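A rough sketch of how the settings above could map onto a TRL `SFTTrainer` run; the dataset mixing, chat formatting, and output path are placeholders, and the exact trainer setup used for this model is not stated in the card:

```
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumption: only GrandMaster-PRO-MAX is loaded here; mixing in the DTF
# comments subset and formatting conversations into the chat template
# are omitted for brevity.
train_dataset = load_dataset("Vikhrmodels/GrandMaster-PRO-MAX", split="train")

args = SFTConfig(
    output_dir="qwen2.5-7b-dtf-sft",     # placeholder path
    num_train_epochs=2,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=128,     # effective batch size of 128
    gradient_checkpointing=False,
    optim="adamw_8bit",
    weight_decay=4e-2,
    bf16=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    packing=True,
    seed=42,
)

trainer = SFTTrainer(
    model=model,                 # the LoRA-wrapped model from the sketch above
    train_dataset=train_dataset,
    args=args,
)
trainer.train()
```
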
Training time:

- NVIDIA RTX 3090 Ti: ~52 hours

[Wandb](https://wandb.ai/a_okshus/DTF_comments/runs/zni3o3li)

[GitHub: TODO]()
|