---
base_model: SicariusSicariiStuff/Impish_Nemo_12B
library_name: peft
pipeline_tag: text-generation
tags:
- axolotl
- dpo
- transformers
datasets:
- jondurbin/gutenberg-dpo-v0.1
- nbeerbower/gutenberg2-dpo
- nbeerbower/gutenberg-moderne-dpo
- sam-paech/gutenberg3-generalfiction-scifi-fantasy-romance-adventure-dpo
license: apache-2.0
language:
- en
---

# Gutenberg DPO QLoRA for Impish_Nemo_12B

A DPO QLoRA finetune of SicariusSicariiStuff/Impish_Nemo_12B (a Mistral Nemo 12B derivative) on four Gutenberg preference datasets, roughly 6.3k preference pairs in total.
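
Since the result is a QLoRA adapter rather than full merged weights, one way to run it is to load the base model in 4-bit and attach the adapter with PEFT. A minimal sketch, assuming the adapter has been pushed to a Hub repo (the `adapter_id` below is a placeholder, not this model's actual id):

```python
# Minimal inference sketch: load the base model in 4-bit and attach the LoRA adapter.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "SicariusSicariiStuff/Impish_Nemo_12B"
adapter_id = "your-username/your-gutenberg-dpo-adapter"  # placeholder, substitute the real repo

bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "Write the opening paragraph of a gothic short story."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```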


## Model Details

### Model Description

Finetuned for one epoch on a single A100 rented through Vast.ai. With ~6.3k pairs, a 0.1 validation split, and an effective batch size of 16 (see the config below), that works out to roughly 350 optimizer steps.

## Credits

Thank you to Axolotl for making finetuning easier. Thank you to Docker for... existing, I guess.

## YML Configuration

```yaml
base_model: SicariusSicariiStuff/Impish_Nemo_12B

load_in_8bit: false
load_in_4bit: true
adapter: qlora

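# effective batch size = micro_batch_size (2) x gradient_accumulation_steps (8) = 16 per device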
gradient_accumulation_steps: 8
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.00001

sequence_len: 4096

lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true

bf16: true
tf32: false

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
logging_steps: 1
flash_attention: true

loss_watchdog_threshold: 5.0
loss_watchdog_patience: 3

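# DPO preference training; each dataset below supplies prompt/chosen/rejected columns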
rl: dpo
datasets:
  - path: sam-paech/gutenberg3-generalfiction-scifi-fantasy-romance-adventure-dpo
    split: train
    type: chatml.prompt_pairs
  - path: nbeerbower/gutenberg-moderne-dpo
    split: train
    type: chatml.prompt_pairs
  - path: nbeerbower/gutenberg2-dpo
    split: train
    type: chatml.prompt_pairs
  - path: jondurbin/gutenberg-dpo-v0.1
    split: train
    type: chatml.prompt_pairs
dataset_prepared_path: last_run_prepared
val_set_size: 0.1
output_dir: ./outputs/lora-out
```
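
To reproduce the run, save the config (the filename `config.yml` here is just an assumption) and launch it through Axolotl's standard entry point: `accelerate launch -m axolotl.cli.train config.yml`.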