Retain the older README. It's the same but with a fixed o_proj, so there's no need to update the README.
README.md CHANGED

````diff
@@ -1,7 +1,7 @@
 ---
 base_model:
-- Qwen/Qwen3-30B-A3B
 - Qwen/Qwen3-30B-A3B-Base
+- Qwen/Qwen3-30B-A3B
 library_name: transformers
 tags:
 - mergekit
@@ -10,6 +10,11 @@ tags:
 ---
 # CavesOfQwen
 
+> Hey Hey, Model Gang, KaraWitch Here.
+> Have you, ever merged too deeply.
+> And found something 'they' don't want you to know?
+> "[MoQ3](https://youtu.be/o_PBfLbd3zw)", who is she? And why can't I reach her?
+
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
 ## Merge Details
@@ -42,4 +47,4 @@ base_model: Qwen/Qwen3-30B-A3B
 parameters:
   normalize: true
 dtype: bfloat16
-```
+```
````