Text Generation
Transformers
Safetensors
mistral
conversational
text-generation-inference

PAINTED FANTASY VISAGE

Mistral Small 3.2 Upscaled 33B


Overview

Another experimental release: Mistral Small 3.2 24B, upscaled by 18 layers to create a 33.6B-parameter model, which then went through pretraining, SFT and DPO.

Can't guarantee the Mistral 3.2 repetition issues are fixed, but this model seems to be less repetitive than my previous attempt.

This is an uncensored creative model intended to excel at character-driven RP / ERP, where characters are portrayed creatively and proactively.

SillyTavern Settings

Recommended Roleplay Format

> Actions: In plaintext
> Dialogue: "In quotes"
> Thoughts: *In asterisks*

Recommended Samplers

> Temp: 0.6
> MinP: 0.03 - 0.05
> TopP: 0.95 - 1.0
> Dry: 0.8, 1.75, 4
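If you drive the model through an OpenAI-compatible backend rather than the SillyTavern UI, the recommended samplers map onto a request body roughly like the sketch below. The field names (especially the `dry_*` keys) follow llama.cpp / TabbyAPI conventions and are an assumption; check your backend's documentation for the exact parameter names it accepts.

```python
# Sketch: building a completion request body with the model card's
# recommended samplers. Field names are assumptions based on common
# llama.cpp-style backends; SillyTavern sets the same values via its UI.

def sampler_payload(prompt: str) -> dict:
    """Return a request body using the recommended sampler values."""
    return {
        "prompt": prompt,
        "temperature": 0.6,      # Temp: 0.6
        "min_p": 0.03,           # MinP: 0.03 - 0.05
        "top_p": 0.95,           # TopP: 0.95 - 1.0
        "dry_multiplier": 0.8,   # Dry: 0.8, 1.75, 4
        "dry_base": 1.75,
        "dry_allowed_length": 4,
    }

payload = sampler_payload("Hello")
```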

Instruct

Mistral v7 Tekken
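For reference, the v7-Tekken layout wraps system and user turns roughly as sketched below. This is an illustrative approximation, not the authoritative template; in real use, rely on the chat template bundled with the model's tokenizer (e.g. `tokenizer.apply_chat_template`), which is the source of truth for the exact tag placement.

```python
# Illustrative approximation of the Mistral v7 Tekken prompt layout.
# The authoritative template ships with the model's tokenizer; prefer
# tokenizer.apply_chat_template in real use.

def format_v7_tekken(system: str, user: str) -> str:
    """Roughly format one system + user turn in v7-Tekken style."""
    return f"<s>[SYSTEM_PROMPT]{system}[/SYSTEM_PROMPT][INST]{user}[/INST]"

prompt = format_v7_tekken("You are a creative roleplay partner.", "Hello!")
```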

Creation Process

Upscale > Pretrain > SFT > DPO

All training stages used QLoRA (including the pretraining stage).

Pretrained on 177MB of data. The dataset consisted mostly of light novels, NSFW stories and SFW stories, filled out with general corpus text from the HuggingFace FineWeb-2 dataset.

The model then went through SFT using a dataset of approximately 3.6 million tokens: 700 RP conversations, 1,000 creative writing / instruct samples and about 100 summaries. The bulk of this data has been made public.

Finally, DPO was used to make the model more consistent.
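The upscale step (depth upscaling) can be pictured as duplicating a contiguous block of decoder layers and splicing the copy back in. A toy sketch, assuming the base 24B model's 40 decoder layers; the duplication indices here are purely illustrative, since the exact layers used are not published in this card:

```python
# Toy illustration of depth upscaling: duplicate a block of decoder
# layers and splice the copy back in. Indices are illustrative only.

def upscale(layers: list, repeat_range: tuple) -> list:
    """Duplicate the layers in repeat_range and append the copy in place."""
    start, end = repeat_range
    return layers[:end] + layers[start:end] + layers[end:]

base = list(range(40))           # base model: 40 decoder layers (assumed)
grown = upscale(base, (22, 40))  # duplicating 18 layers gives 58 total
```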

Safetensors · Model size: 33.6B params · Tensor type: BF16