|
--- |
|
base_model: |
|
- UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3 |
|
- ZeusLabs/L3-Aethora-15B-V2 |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
- llama |
|
--- |
|
|
|
Semi-healed Llama-3 15B for programming, scientific Q&A, and general instruction following.
|
|
|
--------------------------------------------------------------------- |
|
|
|
# Llama-3-Instruct-15B-SPPO-Iter3-SH-F32 |
|
|
|
A fully functional version of [Llama-3-Instruct-8B-SPPO-Iter3](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3) upscaled to 15B parameters and semi-healed with a projection swap: the `o_proj` and `down_proj` tensors are taken from the fully finetuned [L3-Aethora-15B-V2](https://huggingface.co/ZeusLabs/L3-Aethora-15B-V2).
|
|
|
Paper: [Self-Play Preference Optimization for Language Model Alignment](https://arxiv.org/abs/2405.00675)
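
A minimal Transformers inference sketch (the repo id below is an assumption based on the model name; substitute the actual repository):

```python
# Minimal sketch; the repo id is an assumption based on the model name above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "v000000/Llama-3-Instruct-15B-SPPO-Iter3-SH"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.float32,  # weights are F32; use torch.bfloat16 to halve memory
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain big-O notation in one paragraph."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```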
|
|
|
--------------------------------------------------------------------- |
|
|
|
# Quants |
|
* [GGUF Q5_K_M](https://huggingface.co/v000000/Llama-3-Instruct-15B-SPPO-Iter3-SH-Q5_K_M-GGUF) |
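
A minimal usage sketch for the GGUF quant via `llama-cpp-python` (the `filename` glob is an assumption; check the repo's file listing):

```python
# Minimal sketch, assuming llama-cpp-python with huggingface-hub installed.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="v000000/Llama-3-Instruct-15B-SPPO-Iter3-SH-Q5_K_M-GGUF",
    filename="*q5_k_m.gguf",  # assumed file-name pattern; check the repo listing
    n_ctx=8192,
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```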
|
|
|
## Merge
|
|
|
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit). |
|
|
|
## Merge Details |
|
### Merge Method |
|
|
|
This model was merged in two steps: a passthrough upscale followed by a SLERP merge.
|
|
|
### Models Merged |
|
|
|
The following models were included in the merge: |
|
* [UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3) |
|
* [ZeusLabs/L3-Aethora-15B-V2](https://huggingface.co/ZeusLabs/L3-Aethora-15B-V2) |
|
* [grimjim/Llama-3-Instruct-abliteration-LoRA-8B](https://huggingface.co/grimjim/Llama-3-Instruct-abliteration-LoRA-8B) |
|
|
|
### Configuration |
|
|
|
The following YAML configurations were used to produce this model (step 1: passthrough upscale, step 2: SLERP semi-heal):
|
|
|
```yaml
#1. Passthrough upscale of SPPO 8B (+ abliteration LoRA) to 15B
dtype: float32
merge_method: passthrough
slices:
  - sources:
      - layer_range: [0, 24]
        model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
  - sources:
      - layer_range: [8, 24]
        model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
  - sources:
      - layer_range: [8, 24]
        model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
  - sources:
      - layer_range: [24, 32]
        model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3+grimjim/Llama-3-Instruct-abliteration-LoRA-8B

#2. SLERP of the upscaled model with L3-Aethora-15B-V2 (the semi-heal step)
models:
  - model: ./Llama-3-Instruct-15B-SPPO-Iter3
merge_method: slerp
base_model: ZeusLabs/L3-Aethora-15B-V2
parameters:
  t:
    - filter: o_proj
      value: 0 # take finetuned o_proj from Aethora
    - filter: down_proj
      value: 0 # take finetuned down_proj from Aethora
    - value: 1 # rest of tensors from SPPO
dtype: float32
```
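
The two numbered configs are separate mergekit runs: step 1 writes `./Llama-3-Instruct-15B-SPPO-Iter3`, which step 2 then SLERPs with Aethora. A rough reproduction sketch using mergekit's Python API (file names and options here are assumptions, not the author's exact invocation):

```python
# Rough sketch with mergekit's Python API; config file names are assumptions.
import torch
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Step 1: passthrough upscale (config #1 above, saved as its own file).
with open("passthrough.yml", "r", encoding="utf-8") as fp:
    config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    config,
    out_path="./Llama-3-Instruct-15B-SPPO-Iter3",  # consumed by config #2
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        lora_merge_cache="/tmp",  # the +LoRA syntax triggers a LoRA merge first
        copy_tokenizer=True,
    ),
)

# Step 2: run again with the SLERP config (#2 above) to produce the final model.
```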
|
|
|
Uncensored: no
|
|
|
# Prompt Template (Llama-3-Instruct)
|
```bash |
|
<|begin_of_text|><|start_header_id|>system<|end_header_id|> |
|
|
|
{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|> |
|
|
|
{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|> |
|
|
|
{output}<|eot_id|> |
|
|
|
``` |
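
With Transformers, this template is applied automatically through the tokenizer's chat template; a quick way to render it (repo id assumed as above):

```python
# Minimal sketch: render the Llama-3-Instruct template from the tokenizer.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("v000000/Llama-3-Instruct-15B-SPPO-Iter3-SH")  # assumed repo id

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # matches the template above, ending at the assistant header
```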