Model Details

Model Developers: SeungJin Lee (knlpscience)

Base Model: upstage/SOLAR-10.7B-v1.0

Notice

Hyperparameters I

-batch_size : 16

-num_epochs : 1

-micro_batch : 1

-gradient_accumulation_steps : batch_size // micro_batch (= 16)
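
The relationship between the batch settings above can be sketched as follows. This is a minimal illustration, assuming single-device training (the card does not state the device count); the name `effective_batch_size` is introduced here for clarity and does not come from the original training script.

```python
# Values taken from the "hyper params I" list above.
batch_size = 16   # target number of samples per optimizer step
micro_batch = 1   # samples processed per forward/backward pass

# Gradients are accumulated over this many micro-batches before each update.
gradient_accumulation_steps = batch_size // micro_batch

# Assumption: one device. With several devices, the effective batch size
# would also be multiplied by the device count.
effective_batch_size = micro_batch * gradient_accumulation_steps

print(gradient_accumulation_steps, effective_batch_size)
```

With a micro-batch of 1, every optimizer step accumulates gradients from 16 forward/backward passes, which keeps peak memory low at the cost of more passes per update.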

Hyperparameters II

-cutoff_len : 4096

-lr_scheduler : 'cosine'

-warmup_ratio : 0.06

-learning_rate : 4e-4

-optimizer : 'adamw_torch'

-weight_decay : 0.01

-max_grad_norm : 1.0
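
To make the scheduler settings above concrete, here is a rough sketch of a cosine schedule with linear warmup, using the listed `learning_rate` and `warmup_ratio`. It assumes the common formulation (linear ramp to the peak, then a half-cosine decay to zero); the original training script may differ in detail. The function name `lr_at` and the `total_steps` value are hypothetical, since the card does not give the number of training steps.

```python
import math

LEARNING_RATE = 4e-4   # peak learning rate from the list above
WARMUP_RATIO = 0.06    # fraction of total steps used for warmup


def lr_at(step, total_steps, peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Learning rate at `step` under linear warmup plus cosine decay (a sketch)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Hypothetical run length, for illustration only.
total_steps = 1000
for step in (0, 30, 60, 500, 1000):
    print(step, lr_at(step, total_steps))
```

The rate starts at zero, reaches 4e-4 once warmup ends, and decays back to zero by the final step.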

LoRA config

-lora_r : 64

-lora_alpha : 16

-lora_dropout : 0.05

-lora_target_modules : ["gate_proj", "down_proj", "up_proj"]
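
The LoRA settings above can be collected as keyword arguments, as in the sketch below. The key names mirror the parameters of `peft.LoraConfig`; that the model was trained with the `peft` library is an assumption, as the card does not name the tooling. The `scaling` value shows the standard LoRA scaling factor `lora_alpha / r` implied by these settings.

```python
# Values taken from the "LoRA config" list above.
lora_kwargs = {
    "r": 64,                 # rank of the low-rank update matrices
    "lora_alpha": 16,        # numerator of the LoRA scaling factor
    "lora_dropout": 0.05,    # dropout applied on the LoRA branch
    "target_modules": ["gate_proj", "down_proj", "up_proj"],  # MLP projections only
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scaling)

# Assumption: if peft is installed, the same settings could be passed as
#   from peft import LoraConfig
#   config = LoraConfig(task_type="CAUSAL_LM", **lora_kwargs)
```

With `lora_alpha` smaller than `r`, the adapter update is scaled down by a factor of 0.25. Only the feed-forward projections are adapted; the attention projections are not in the target list.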
