---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- Casual-Autopsy/L3-bluuwhale-SAO-MIX-8B-V1_fp32-merge-calc
---

# ZeroXClem/Llama-3-8B-ProLong-SAO-Roleplay-512

ZeroXClem/Llama-3-8B-ProLong-SAO-Roleplay-512 is a [mergekit](https://github.com/cg123/mergekit) merge that applies the DELLA method to combine the following model with the [princeton-nlp/Llama-3-8B-ProLong-512k-Instruct](https://huggingface.co/princeton-nlp/Llama-3-8B-ProLong-512k-Instruct) base:

* [Casual-Autopsy/L3-bluuwhale-SAO-MIX-8B-V1_fp32-merge-calc](https://huggingface.co/Casual-Autopsy/L3-bluuwhale-SAO-MIX-8B-V1_fp32-merge-calc)

## 🧩 Configuration

```yaml
models:
  - model: princeton-nlp/Llama-3-8B-ProLong-512k-Instruct
    # Base model: no additional parameters necessary
  - model: Casual-Autopsy/L3-bluuwhale-SAO-MIX-8B-V1_fp32-merge-calc
    parameters:
      weight: 0.5    # Influence of the roleplay features from L3-bluuwhale
      density: 0.6   # Preserve roughly 60% of the significant delta parameters from the roleplay model
merge_method: della
base_model: princeton-nlp/Llama-3-8B-ProLong-512k-Instruct
parameters:
  epsilon: 0.05      # Granularity of the magnitude-based pruning step
  lambda: 1.0        # Scaling factor to harmonize parameter influence
  normalize: true    # Normalize weights so contributions stay consistent
  int8_mask: true    # Store intermediate merge masks in int8 to reduce memory use
dtype: float32
out_dtype: bfloat16  # Output precision: balances fidelity and size
```
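
## 💻 Usage

A minimal inference sketch with 🤗 Transformers, assuming `transformers`, `accelerate`, and `torch` are installed; the model ID comes from this card, while the prompt and sampling parameters are illustrative:

```python
# Minimal sketch: load the merged model and generate a reply via the chat template.
import torch
import transformers
from transformers import AutoTokenizer

model_id = "ZeroXClem/Llama-3-8B-ProLong-SAO-Roleplay-512"
tokenizer = AutoTokenizer.from_pretrained(model_id)

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's out_dtype
    device_map="auto",
)

messages = [{"role": "user", "content": "Introduce yourself in character."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

outputs = pipeline(
    prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95
)
print(outputs[0]["generated_text"])
```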