Qwarkstar 4B Instruct (Preview)
Training complete!
This model was trained with Supervised Fine-Tuning (SFT) on 100k samples from the HuggingFaceTB/smoltalk dataset.
It follows the ChatML input-output formatting template.
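Below is a minimal inference sketch using the `transformers` chat-template API, which renders the ChatML tags described above. The repo id, generation parameters, and device setup are assumptions for illustration, not confirmed by this card:

```python
# Minimal inference sketch. Assumptions: transformers with chat-template
# support, accelerate installed for device_map="auto", and that the
# tokenizer ships the ChatML chat template described above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qingy2024/Qwarkstar-4B-Instruct-Preview"  # assumed repo id for this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what ChatML formatting looks like."},
]

# apply_chat_template wraps each turn in ChatML markers
# (<|im_start|>role ... <|im_end|>) and appends the assistant prompt.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```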
Training Details:
- Base Model: qingy2024/Qwarkstar-4B
- Batch Size: 32 (2 H100s x 8 per GPU; see the configuration sketch after this list)
- Max Gradient Norm: 1.0
- Final Loss: ~0.59
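A hedged sketch of this training setup using TRL's `SFTTrainer` is shown below. The dataset config name, the 100k slice, and the gradient-accumulation value (2, which reconciles the stated global batch of 32 with 8 samples per GPU across 2 GPUs) are assumptions, not the author's exact script:

```python
# Training sketch reconstructing the card's stated setup with TRL.
# Assumptions: trl and datasets installed, smoltalk's "all" config,
# and gradient accumulation of 2 to reach the global batch of 32.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Assumed: the 100k samples are the first 100k rows of the train split.
dataset = load_dataset("HuggingFaceTB/smoltalk", "all", split="train[:100000]")

config = SFTConfig(
    output_dir="qwarkstar-4b-instruct",
    per_device_train_batch_size=8,   # 8 per GPU, as stated on the card
    gradient_accumulation_steps=2,   # assumption: 8 x 2 GPUs x 2 steps = 32
    max_grad_norm=1.0,               # as stated on the card
)

trainer = SFTTrainer(
    model="qingy2024/Qwarkstar-4B",  # base model from the card
    args=config,
    train_dataset=dataset,           # "messages" field formatted via ChatML template
)
trainer.train()
```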