---
license: apache-2.0
base_model:
  - Delta-Vector/Austral-70B-Preview
language:
  - en
library_name: transformers
tags:
  - roleplay
  - finetune
  - axolotl
  - adventure
  - creative-writing
  - llama
  - 70B
  - KTO
  - RL
---

Austral 70B Winton

Model banner
Trained by Delta-Vector

Overview

Austral 70B - Winton

A Llama-based, KTO-enhanced Adventure/Roleplay generalist at 70B, built on the Vulpecula finetune.

More than 1.5 metres tall, about six metres long, and weighing up to 1000 kilograms, Australovenator wintonensis was a fast and agile hunter, and the largest known Australian theropod.

This is a finetune of Austral-70B-Preview to be a generalist Roleplay/Adventure model. It is simply a KTO RL train on top of Austral-Preview: I've improved coherency and intelligence while keeping the creative side of the model and reducing some of the 'slop' you'd encounter in a Drummer model ;)

Support my finetunes / me on Ko-fi: https://Ko-fi.com/deltavector | Thank you to Auri for helping/testing ♥

FYI - While I can't stop people from merging this model and keeping it a secret, I request that any merges built on this model publish accessible mergekit configs.

Quants

Quant Formats

  • GGUF - For use with llama.cpp & forks (coming soon! See the loading sketch below.)
  • EXL3 - For use with TabbyAPI (coming soon!)
  • EXL2 - For use with TabbyAPI, faster on Ampere (coming soon!)
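
Once GGUF files are up, a minimal loading sketch with llama-cpp-python (the quant filename below is hypothetical; point it at whichever file gets published):

```python
from llama_cpp import Llama

# Hypothetical quant filename; substitute the actual published GGUF.
llm = Llama(
    model_path="Austral-70B-Winton-Q4_K_M.gguf",
    n_ctx=8192,       # context window to allocate
    n_gpu_layers=-1,  # offload every layer to GPU if VRAM allows
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hi there!"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```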

Chat Format

This model utilizes ChatML.

<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
<|im_start|>assistant
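
If you're building prompts programmatically, here's a minimal sketch using the tokenizer's chat template via transformers (assuming the tokenizer ships the ChatML template; the repo id is taken from this card's title):

```python
from transformers import AutoTokenizer

# Repo id assumed from the card title; adjust if the final path differs.
tokenizer = AutoTokenizer.from_pretrained("Delta-Vector/Austral-70B-Winton")

messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

# Renders the ChatML prompt shown above, ending on an open assistant turn.
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)
```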

Training

As goes the Austral tradition, I trained on top of another great finetune by Sao, Vulpecula. I trained it as a 16-bit R128 LoRA for 2 epochs, which left a very underfit but promising model. For Winton, I KTO'd the model to help with coherency, using a mix of instruct/writing datasets.

Config
https://wandb.ai/new-eden/austral/artifacts/axolotl-config/config-3dlacmq5/v0/files/axolotl_config_j6uj7id6.yml

For the base SFT, the model was trained over 2 epochs on 8 x A100s; then I ran KTO for 1 epoch to clean up some coherency issues. The whole run took roughly 48 hours in total.
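
For reference, here's a minimal sketch of what a KTO stage looks like with TRL's KTOTrainer, assuming a recent TRL release. The actual run used axolotl (config linked above), so the checkpoint path, dataset rows, and hyperparameters here are purely illustrative:

```python
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import KTOConfig, KTOTrainer

# Illustrative starting checkpoint; the real run started from the SFT'd model.
model_name = "Delta-Vector/Austral-70B-Preview"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# KTO uses unpaired feedback: each row is a prompt, a completion, and a boolean
# label marking the completion as desirable (True) or undesirable (False).
train_dataset = Dataset.from_list([
    {"prompt": "Continue the adventure.", "completion": "The torchlight flickered...", "label": True},
    {"prompt": "Continue the adventure.", "completion": "A shiver ran down her spine...", "label": False},
])

args = KTOConfig(
    output_dir="austral-70b-winton-kto",
    num_train_epochs=1,   # the card reports a single KTO epoch
    beta=0.1,             # illustrative KL penalty strength
)

trainer = KTOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```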

Credits

TYSM to my friends: Auri, Zerofata, Lucy, Trappu, Alicat, Kubernetes Bad, Intervitens, NyxKrage & Kalomaze