maldv committed on
Commit 188b56b · verified · 1 Parent(s): b929507

Update README.md

Files changed (1)
  1. README.md +71 -69
README.md CHANGED
@@ -1,69 +1,71 @@
- ---
- library_name: transformers
- tags:
- - llama-3
- license: cc-by-nc-4.0
- ---
-
- # Spring Chicken 8x8b
-
- I've been really impressed with how well these frankenmoe models quant compared to the base llama 8b, but with far better speed than the 70b. There have been some great 4x8b models released recently, so I tried an 8x8b.
-
- ```
- base_model: ./maldv/spring
- gate_mode: hidden
- dtype: bfloat16
- experts_per_token: 2
- experts:
-   - source_model: ./models/Llama3-ChatQA-1.5-8B
-     positive_prompts:
-       - 'add numbers'
-       - 'solve for x'
-     negative_prompts:
-       - 'I love you'
-       - 'Help me'
-   - source_model: ./models/InfinityRP-v2-8B
-     positive_prompts:
-       - 'they said'
-   - source_model: ./models/Einstein-v6.1-Llama3-8B
-     positive_prompts:
-       - 'the speed of light'
-       - 'chemical reaction'
-   - source_model: ./models/Llama-3-Soliloquy-8B-v2
-     positive_prompts:
-       - 'write a'
-   - source_model: ./models/Llama-3-Lumimaid-8B-v0.1
-     positive_prompts:
-       - 'she looked'
-   - source_model: ./models/L3-TheSpice-8b-v0.8.3
-     positive_prompts:
-       - 'they felt'
-   - source_model: ./models/Llama3-OpenBioLLM-8B
-     positive_prompts:
-       - 'the correct treatment'
-   - source_model: ./models/Llama-3-SauerkrautLM-8b-Instruct
-     positive_prompts:
-       - 'help me'
-       - 'should i'
- ```
-
- ### Spring
-
- Spring is a cascading dare-ties merge of the following models:
-
- ```python
- [
-     'Einstein-v6.1-Llama3-8B',
-     'L3-TheSpice-8b-v0.8.3',
-     'Configurable-Hermes-2-Pro-Llama-3-8B',
-     'Llama3-ChatQA-1.5-8B',
-     'Llama3-OpenBioLLM-8B',
-     'InfinityRP-v2-8B',
-     'Llama-3-Soliloquy-8B-v2',
-     'Tiamat-8b-1.2-Llama-3-DPO',
-     'Llama-3-8B-Instruct-Gradient-1048k',
-     'Llama-3-Lumimaid-8B-v0.1',
-     'Llama-3-SauerkrautLM-8b-Instruct',
-     'Meta-Llama-3-8B-Instruct-DPO',
- ]
- ```
 
 
 
+ ---
+ library_name: transformers
+ tags:
+ - llama-3
+ license: cc-by-nc-4.0
+ ---
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65b19c1b098c85365af5a83e/kQpfZwQ2tmpUhHx7E7jFF.png)
+
+ # Spring Chicken 8x8b
+
+ I've been really impressed with how well these frankenmoe models quant compared to the base llama 8b, but with far better speed than the 70b. There have been some great 4x8b models released recently, so I tried an 8x8b.
+
+ ```
+ base_model: ./maldv/spring
+ gate_mode: hidden
+ dtype: bfloat16
+ experts_per_token: 2
+ experts:
+   - source_model: ./models/Llama3-ChatQA-1.5-8B
+     positive_prompts:
+       - 'add numbers'
+       - 'solve for x'
+     negative_prompts:
+       - 'I love you'
+       - 'Help me'
+   - source_model: ./models/InfinityRP-v2-8B
+     positive_prompts:
+       - 'they said'
+   - source_model: ./models/Einstein-v6.1-Llama3-8B
+     positive_prompts:
+       - 'the speed of light'
+       - 'chemical reaction'
+   - source_model: ./models/Llama-3-Soliloquy-8B-v2
+     positive_prompts:
+       - 'write a'
+   - source_model: ./models/Llama-3-Lumimaid-8B-v0.1
+     positive_prompts:
+       - 'she looked'
+   - source_model: ./models/L3-TheSpice-8b-v0.8.3
+     positive_prompts:
+       - 'they felt'
+   - source_model: ./models/Llama3-OpenBioLLM-8B
+     positive_prompts:
+       - 'the correct treatment'
+   - source_model: ./models/Llama-3-SauerkrautLM-8b-Instruct
+     positive_prompts:
+       - 'help me'
+       - 'should i'
+ ```
+
+ ### Spring
+
+ Spring is a cascading dare-ties merge of the following models:
+
+ ```python
+ [
+     'Einstein-v6.1-Llama3-8B',
+     'L3-TheSpice-8b-v0.8.3',
+     'Configurable-Hermes-2-Pro-Llama-3-8B',
+     'Llama3-ChatQA-1.5-8B',
+     'Llama3-OpenBioLLM-8B',
+     'InfinityRP-v2-8B',
+     'Llama-3-Soliloquy-8B-v2',
+     'Tiamat-8b-1.2-Llama-3-DPO',
+     'Llama-3-8B-Instruct-Gradient-1048k',
+     'Llama-3-Lumimaid-8B-v0.1',
+     'Llama-3-SauerkrautLM-8b-Instruct',
+     'Meta-Llama-3-8B-Instruct-DPO',
+ ]
+ ```
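
For readers unfamiliar with the `experts_per_token: 2` setting in the config above, here is a minimal sketch of top-2 routing over 8 experts. It assumes the common Mixtral-style scheme (pick the two highest router scores, then softmax over just those two); whether the merged model routes exactly this way is an assumption, not something the README states. The function name `route_top_k` and the example logits are hypothetical.

```python
import math


def route_top_k(gate_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their weights.

    Illustrative only: a sketch of Mixtral-style top-k routing, not code
    taken from the model or from any merge tool.
    """
    if not 0 < k <= len(gate_logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k largest logits, highest first.
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only (subtract the max for stability).
    peak = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical router scores for one token, one entry per expert (8 total).
logits = [0.1, 2.0, -1.0, 0.5, 1.5, -0.3, 0.0, 0.2]
for expert_index, weight in route_top_k(logits, k=2):
    print(f"expert {expert_index}: weight {weight:.3f}")
```

Under this scheme only two of the eight expert feed-forward blocks run for each token, and their outputs are combined using the normalized weights.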