Austral 24B Winton

Trained by Delta-Vector

Overview

Austral 24B - Winton

A Codex finetune: Mistral-based, KTO-enhanced Adventure/Roleplay generalist model, 24B in size

More than 1.5 metres tall, about six metres long and up to 1000 kilograms heavy, Australovenator wintonensis was a fast and agile hunter, and the largest known Australian theropod.

This is a finetune of Codex 24B to be a generalist Roleplay/Adventure model. I've removed some of the "slops" that I noticed in an otherwise great model, as well as improving the general writing of the model. This was a multi-stage finetune, and all previous checkpoints are released as well. In testing it has shown to be a great model for Adventure cards & Roleplay, often pushing the plot forward better than other models while avoiding some of the slops you'd find in models from Drummer and Co.

Support my finetunes / Me on Kofi: https://Ko-fi.com/deltavector | Thank you to Auri for helping/Testing ♥

Quants

Quant Formats

  • GGUF: For use with llama.cpp & forks (coming soon!)
  • EXL3: For use with TabbyAPI (coming soon!)

Chat Format

This model utilizes ChatML.

<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
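
If your frontend doesn't build the prompt for you, here is a minimal sketch of how the ChatML layout above can be assembled by hand. The `format_chatml` helper and its `add_generation_prompt` parameter are hypothetical names for illustration, not something shipped with this model; if the tokenizer bundles a chat template, using that instead is likely the safer route.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def format_chatml(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render a list of {"role", "content"} dicts as a ChatML prompt string."""
    parts = []
    for message in messages:
        # Each turn: start token + role, newline, content, end token, newline.
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        # Leave an open assistant turn so the model writes the next reply.
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

prompt = format_chatml(messages)
print(prompt)
```

The printed prompt matches the example above, ending on an open assistant turn. Setting `<|im_end|>` as a stop string in your sampler settings is usually a good idea with ChatML models.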

Training

As is the Austral/Francois tradition, I built on top of another great finetune, Codex-24B. I trained 4 epochs on top of it as an R128 LoRA with roughly the same datamix as Francois-Huali/Austral 70B, then ran KTO alignment with a mix of Instruct and small writing datasets, and finally did another 4-epoch SFT with Rep_remover (thanks, Pocket!).

Config (Post-KTO SFT)
https://wandb.ai/new-eden/austral/runs/i85da0c6?nw=nwuserdeltavector

This model was trained for 4 epochs on 8 x A100s (ty to my work, Cognitive Computations) for the base SFT. I then used KTO for 1 epoch to clean up some coherency issues, and finally trained for another 4 epochs on Rep_Remover to delete slops. Total training time was roughly 80 hours.
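
For a quick overview, the stages described above can be written down as a small data structure. This is only a summary sketch: the `Stage` class and the stage names are hypothetical labels for illustration, and it is not the actual training config (see the wandb link above for that).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str    # hypothetical label, not an official stage name
    method: str
    epochs: int
    note: str


# Stages in the order described in this section.
PIPELINE = [
    Stage("base_sft", "SFT (R128 LoRA)", 4, "on top of Codex-24B, 8 x A100s"),
    Stage("kto", "KTO", 1, "Instruct/small writing datasets, coherency cleanup"),
    Stage("rep_remover_sft", "SFT", 4, "Rep_remover data to delete slops"),
]

total_epochs = sum(stage.epochs for stage in PIPELINE)

for stage in PIPELINE:
    print(f"{stage.name}: {stage.method}, {stage.epochs} epoch(s), {stage.note}")
print(f"total epochs: {total_epochs}")
```

Summed across the three stages, that comes to 9 epochs in the roughly 80 hours mentioned above.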

Credits

TYSM to my friends: Auri, Lucy, Trappu, Alicat, Kubernetes Bad, Intervitens, NyxKrage & Kalomaze

Safetensors · Model size: 23.6B params · Tensor type: BF16