---
base_model: SicariusSicariiStuff/Impish_Nemo_12B
library_name: peft
pipeline_tag: text-generation
tags:
- axolotl
- dpo
- transformers
datasets:
- jondurbin/gutenberg-dpo-v0.1
- nbeerbower/gutenberg2-dpo
- nbeerbower/gutenberg-moderne-dpo
- sam-paech/gutenberg3-generalfiction-scifi-fantasy-romance-adventure-dpo
license: apache-2.0
language:
- en
---

# Model Card for Model ID

A DPO QLoRA finetune of SicariusSicariiStuff/Impish_Nemo_12B (a Mistral Nemo 12B derivative) on four Gutenberg DPO datasets, roughly 6.3k preference pairs in total.

## Model Details

### Model Description

Finetuned for one epoch on a single A100 rented through Vast.ai.

## Credits

Thank you to Axolotl for making finetuning easier. Thank you to Docker for... existing, I guess.

## YML Configuration

```yaml
base_model: SicariusSicariiStuff/Impish_Nemo_12B

load_in_8bit: false
load_in_4bit: true

adapter: qlora

gradient_accumulation_steps: 8
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.00001
sequence_len: 4096

lora_r: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target_linear: true

bf16: true
tf32: false

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false

logging_steps: 1
flash_attention: true

loss_watchdog_threshold: 5.0
loss_watchdog_patience: 3

rl: dpo
datasets:
  - path: sam-paech/gutenberg3-generalfiction-scifi-fantasy-romance-adventure-dpo
    split: train
    type: chatml.prompt_pairs
  - path: nbeerbower/gutenberg-moderne-dpo
    split: train
    type: chatml.prompt_pairs
  - path: nbeerbower/gutenberg2-dpo
    split: train
    type: chatml.prompt_pairs
  - path: jondurbin/gutenberg-dpo-v0.1
    split: train
    type: chatml.prompt_pairs
dataset_prepared_path: last_run_prepared
val_set_size: 0.1
output_dir: ./outputs/lora-out
```
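
## Usage

The card does not ship inference code, so here is a minimal sketch of how a QLoRA adapter produced by this config could be loaded for generation. The base model id and the 4-bit/bf16 settings come from the config above; the adapter id `your-username/your-gutenberg-dpo-adapter` is a hypothetical placeholder for this repo's actual Hub id.

```python
# Minimal inference sketch (not part of the original card).
# Loads the base model in 4-bit and attaches the QLoRA adapter via peft.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "SicariusSicariiStuff/Impish_Nemo_12B"
adapter_id = "your-username/your-gutenberg-dpo-adapter"  # placeholder, not the real repo id

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # mirrors load_in_4bit: true in the training config
    bnb_4bit_compute_dtype=torch.bfloat16,  # mirrors bf16: true
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "Write the opening paragraph of a gothic short story."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If you prefer a standalone checkpoint, the adapter can also be merged into the base weights with peft's `merge_and_unload()` after loading in full precision.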