---
base_model:
- UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3
- ZeusLabs/L3-Aethora-15B-V2
library_name: transformers
tags:
- mergekit
- merge
- llama
---

Semi-Healed Llama-3 15B. Programming, Scientific Q&A, General Instruct

---------------------------------------------------------------------

# Llama-3-Instruct-15B-SPPO-Iter3-SH-F32

A fully functional upscale of [Llama-3-Instruct-8B-SPPO-Iter3](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3) to 15B parameters, using a projection swap.
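The upscale works by duplicating a window of middle layers: the passthrough configuration later in this card stacks layer ranges [0, 24], [8, 24], [8, 24], and [24, 32] from the 32-layer 8B base, giving a 64-layer model. A quick sketch of the layer and parameter arithmetic (the ranges come from the config below; the shape constants are the standard Llama-3-8B architecture, and the 15B figure is approximate):

```python
# Layer ranges stacked by the passthrough merge (end-exclusive, as used by mergekit).
ranges = [(0, 24), (8, 24), (8, 24), (24, 32)]
total_layers = sum(end - start for start, end in ranges)  # 64, vs. 32 in the 8B base

# Rough parameter count for the Llama-3-8B block shape
# (hidden 4096, GQA with 8 KV heads -> 1024-dim K/V, MLP 14336, vocab 128256):
hidden, kv_dim, mlp, vocab = 4096, 1024, 14336, 128256
attn = 2 * hidden * hidden + 2 * hidden * kv_dim   # Q/O plus K/V projections
ffn = 3 * hidden * mlp                             # gate, up, down projections
per_block = attn + ffn
total_params = total_layers * per_block + 2 * vocab * hidden  # + embeddings and LM head

print(total_layers)                  # 64
print(round(total_params / 1e9, 1))  # ~15.0
```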

Paper: [Self-Play Preference Optimization for Language Model Alignment](https://arxiv.org/abs/2405.00675)

---------------------------------------------------------------------

# Quants
* [GGUF Q5_K_M](https://huggingface.co/v000000/Llama-3-Instruct-15B-SPPO-Iter3-SH-Q5_K_M-GGUF)

## Merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged in two stages, using the Passthrough merge method followed by SLERP.

### Models Merged

The following models were included in the merge:
* [UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3](https://huggingface.co/UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3)
* [ZeusLabs/L3-Aethora-15B-V2](https://huggingface.co/ZeusLabs/L3-Aethora-15B-V2)
* [grimjim/Llama-3-Instruct-abliteration-LoRA-8B](https://huggingface.co/grimjim/Llama-3-Instruct-abliteration-LoRA-8B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
#1.

dtype: float32
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 24]
    model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
- sources:
  - layer_range: [8, 24]
    model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
- sources:
  - layer_range: [8, 24]
    model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3+grimjim/Llama-3-Instruct-abliteration-LoRA-8B
- sources:
  - layer_range: [24, 32]
    model: UCLA-AGI/Llama-3-Instruct-8B-SPPO-Iter3+grimjim/Llama-3-Instruct-abliteration-LoRA-8B

#2.

models:
  - model: ./Llama-3-Instruct-15B-SPPO-Iter3
merge_method: slerp
base_model: ZeusLabs/L3-Aethora-15B-V2
parameters:
  t:
    - filter: o_proj
      value: 0 #take finetuned from Aethora
    - filter: down_proj
      value: 0 #take finetuned from Aethora
    - value: 1 #rest of tensors SPPO
dtype: float32

```
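In stage #2, the `t` filters give a SLERP interpolation weight of 0 (pure Aethora, the `base_model`) for any tensor whose name contains `o_proj` or `down_proj`, and 1 (pure SPPO) for everything else. A toy sketch of how such filtered parameters resolve per tensor name (this mimics, not reuses, mergekit's actual resolution code):

```python
def resolve_t(tensor_name, t_spec):
    """First filter whose substring matches the tensor name wins;
    the unfiltered entry acts as the fallback value."""
    for entry in t_spec:
        if "filter" in entry:
            if entry["filter"] in tensor_name:
                return entry["value"]
        else:
            return entry["value"]
    return None

t_spec = [
    {"filter": "o_proj", "value": 0},     # take fine-tuned weights from Aethora
    {"filter": "down_proj", "value": 0},  # take fine-tuned weights from Aethora
    {"value": 1},                         # rest of the tensors from SPPO
]

print(resolve_t("model.layers.10.self_attn.o_proj.weight", t_spec))  # 0
print(resolve_t("model.layers.10.mlp.down_proj.weight", t_spec))     # 0
print(resolve_t("model.layers.10.self_attn.q_proj.weight", t_spec))  # 1
```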

Uncensored: no

# Prompt Template (Llama-3-Instruct)
```bash
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>

```
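A minimal sketch that fills the template above with plain string formatting (with `transformers`, `tokenizer.apply_chat_template` produces this same layout for Llama-3-Instruct models; the example system/user strings are placeholders):

```python
# Llama-3-Instruct prompt layout, copied from the template above.
LLAMA3_TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
    "{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)

prompt = LLAMA3_TEMPLATE.format(
    system_prompt="You are a helpful assistant.",
    input="Explain SLERP in one sentence.",
)
print(prompt)  # generation continues after the assistant header
```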