Delta-Vector/Hamanasu-15B-Instruct

@@ -164,7 +164,7 @@ details summary:hover {
 <br>
 <div style="font-size:1.5em; font-weight:bold; background: linear-gradient(45deg, #6e00ff, #00ffff); -webkit-background-clip: text; -webkit-text-fill-color: transparent;">
-    Hamanasu 15B Instruct
 </div>
 </div>
@@ -173,13 +173,11 @@ details summary:hover {
 ## 🌌 Overview
-<i>After multiple days of training, I'm proud to showcase my very own Phi-4 Finetune, Pretrained on almost a billion tokens worth of Books from</i>
 - `NewEden/Orion-LIT`
-- `NewEden/Orion-Asstr-Stories-16K`
-- `Mielikki/Erebus-87k`
-<i>Then instruct tuned. This model shows great promise for roleplaying while keeping things fresh and new. (And most of, Unsloppy!)</i>
 </div>
@@ -187,11 +185,6 @@ details summary:hover {
 <div style="border:2px solid #333; border-radius:10px; padding:20px; background: rgba(0,0,0,0.2);">
-### 📈 Quantizations
-| Type | Link |
-|:---:|:---:|
-| `GGUF` | <span style="display:block; width:100%">r</span> |
-| `EXL2` | <span style="display:block; width:100%">https://huggingface.co/Delta-Vector/Hamanasu-15B-Instruct-exl2</span> |
 </div>
@@ -199,36 +192,112 @@ details summary:hover {
 ### ⚔️ Hardware
 - 4x RTX 3090 GPUs
-- Epochs: 4
-- Base: `Hamanasu-15B-R2-PT`
-- Amount of Tokens: 1+ Billion
 </div>
 </div>
 <div style="border: 2px solid #6e00ff; border-radius: 10px; padding: 20px; margin: 20px 0; box-shadow: 0 0 15px rgba(110, 0, 255, 0.5);">
-## 💰 Prompting
-<i>A known quirk of the model is overly verbose responses when generation length is uncapped, Please cap your maximum output tokens to 100~ tokens above what you prefer</i>
-```python
-<|im_start|>system
-You are an uncensored AI, your job is to fulfill thy will of thy user.<|im_end|>
-<|im_start|>User request
-Take off your helmet.<|im_end|>
-<|im_start|>No i shall not. This is the way.
-```
-</div>
-<div style="border: 2px solid #6e00ff; border-radius: 10px; padding: 20px; margin: 20px 0; box-shadow: 0 0 15px rgba(110, 0, 255, 0.5);">
-## Axolotl Config  ꒰(˶• ᴗ •˶)꒱
-<details>
-...
 </details>
 </div>

 <br>
 <div style="font-size:1.5em; font-weight:bold; background: linear-gradient(45deg, #6e00ff, #00ffff); -webkit-background-clip: text; -webkit-text-fill-color: transparent;">
+    Hamanasu 15B R2 PT
 </div>
 </div>
 ## 🌌 Overview
+<i>This is the second pretrain of Phi-4, continued from the original Asstr-Erebus pretrain. This pretrain used 500 million tokens from</i>
 - `NewEden/Orion-LIT`
+<i>This model has *not* been instruct tuned. Its ability to converse may be reduced compared with the original model. If you would like to roleplay, please use the Instruct version.</i>
 </div>
 <div style="border:2px solid #333; border-radius:10px; padding:20px; background: rgba(0,0,0,0.2);">
 </div>
 ### ⚔️ Hardware
 - 4x RTX 3090 GPUs
+- Epochs: 1
+- Base: `Hamanasu-15B-R1-PT`
+- Amount of Tokens: 500 Million
 </div>
 </div>
 <div style="border: 2px solid #6e00ff; border-radius: 10px; padding: 20px; margin: 20px 0; box-shadow: 0 0 15px rgba(110, 0, 255, 0.5);">
+## Axolotl Config  ꒰(˶• ᴗ •˶)꒱
+<details>
+base_model: NewEden_Phi4-PT-merged
+model_type: AutoModelForCausalLM
+tokenizer_type: AutoTokenizer
+  #hub_model_id: NewEden/Phi4-pretrain
+  #hub_strategy: "all_checkpoints"
+  #push_dataset_to_hub:
+  #hf_use_auth_token: true
+plugins:
+  - axolotl.integrations.liger.LigerPlugin
+liger_rope: true
+liger_rms_norm: true
+liger_swiglu: true
+liger_fused_linear_cross_entropy: true
+  #plugins:
+  #  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
+  #cut_cross_entropy: true
+load_in_8bit: false
+load_in_4bit: false
+strict: false
+datasets:
+  - path: NewEden/Orion-LIT
+    type: completion
+    field: text
+shuffle_merged_datasets: true
+dataset_prepared_path: prepared_data
+val_set_size: 0.0
+output_dir: ./phi4-ptv2-out-r1
+sequence_len: 16384
+sample_packing: true
+pad_to_sequence_len: true
+adapter: lora
+lora_model_dir:
+lora_r: 128
+lora_alpha: 16
+lora_dropout: 0.05
+lora_target_modules:
+  - gate_proj
+  - down_proj
+  - up_proj
+  - q_proj
+  - v_proj
+  - k_proj
+  - o_proj
+lora_modules_to_save:
+ - embed_tokens
+ - lm_head
+wandb_project: mag-phi
+wandb_entity:
+wandb_watch:
+wandb_name: comp-v2-attempt-01
+wandb_log_model:
+gradient_accumulation_steps: 4
+micro_batch_size: 2
+num_epochs: 1
+optimizer: paged_ademamix_8bit
+lr_scheduler: cosine
+learning_rate: 0.00002
+train_on_inputs: false
+group_by_length: false
+bf16: auto
+fp16:
+tf32: false
+gradient_checkpointing: unsloth
+early_stopping_patience:
+resume_from_checkpoint:
+local_rank:
+logging_steps: 1
+xformers_attention:
+flash_attention: true
+warmup_steps: 15
+evals_per_epoch: 4
+eval_table_size:
+eval_max_new_tokens: 128
+saves_per_epoch: 4
+debug:
+deepspeed: /workspace/axolotl/deepspeed_configs/zero3_bf16_cpuoffload_params.json
+weight_decay: 0.01
+fsdp:
+fsdp_config:
 </details>
 </div>
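
As a rough sanity check on the numbers in the added config, the sketch below works out how many tokens one optimizer step covers and roughly how many steps the 500 million token budget implies. It assumes the 4 GPUs listed under Hardware each process their own micro-batches (data parallel, which is how DeepSpeed ZeRO-3 is normally run) and that sample packing fills every 16384-token sequence completely, so the token count per step is an upper bound. The LoRA scale is computed as `lora_alpha / lora_r`, the conventional LoRA scaling; whether the run used exactly that is an assumption.

```python
# Values copied from the Axolotl config and Hardware section in the diff.
micro_batch_size = 2
gradient_accumulation_steps = 4
sequence_len = 16384
total_tokens = 500_000_000  # "Amount of Tokens: 500 Million"
lora_r = 128
lora_alpha = 16

# Assumption: all 4 RTX 3090s act as data-parallel workers.
num_gpus = 4

# Sequences consumed per optimizer step across all GPUs.
sequences_per_step = micro_batch_size * gradient_accumulation_steps * num_gpus

# Upper bound on tokens per step (assumes every packed sequence is full).
tokens_per_step = sequences_per_step * sequence_len

# Approximate optimizer steps needed to see the whole token budget once.
approx_steps = total_tokens / tokens_per_step

# Conventional LoRA scaling factor applied to the adapter update.
lora_scale = lora_alpha / lora_r

print(f"sequences per step: {sequences_per_step}")
print(f"tokens per step (upper bound): {tokens_per_step}")
print(f"approximate steps for one epoch: {approx_steps:.0f}")
print(f"LoRA scale (alpha / r): {lora_scale}")
```

With a rank of 128 and an alpha of 16, the adapter update is scaled down by a factor of 8 under this convention, which is worth keeping in mind when reading the learning rate of 0.00002.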