Update README.md
README.md CHANGED
@@ -15,7 +15,7 @@ widget:
 This model is compressed from the Mixtral-8x7B. Using Low-Rank Approximation, I removed 10 billion parameters from the MLP experts' matrices, enough to run the model on a single A100 80GB GPU using half precision.
 
 
-
+Without being retrained or fine-tuned, the model still retains its core performance:
 
 
 
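For context, a minimal sketch of the low-rank approximation the README describes: each expert weight matrix W is replaced by the product of two thin factors obtained from a truncated SVD. The shapes below match Mixtral-8x7B's expert MLP projections (4096 hidden, 14336 intermediate), but the rank is illustrative; the commit does not state the ranks actually used.

```python
# Sketch of low-rank compression via truncated SVD.
# The rank of 1024 is an assumption for illustration only;
# the commit does not specify the ranks actually applied.
import torch

def low_rank_factors(W: torch.Tensor, rank: int):
    """Approximate W (out_dim x in_dim) as A @ B,
    with A (out_dim x rank) and B (rank x in_dim)."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # absorb singular values into the left factor
    B = Vh[:rank, :]
    return A, B

# One expert MLP projection in Mixtral-8x7B: 14336 x 4096 (intermediate x hidden).
W = torch.randn(14336, 4096)
A, B = low_rank_factors(W, rank=1024)

print(f"original params: {W.numel():,}")              # 58,720,256
print(f"factored params: {A.numel() + B.numel():,}")  # 18,874,368
# At inference, W @ x is replaced by A @ (B @ x).
```

Storing the two factors needs (14336 + 4096) * rank values instead of 14336 * 4096, so a sufficiently small rank per matrix, applied across all eight experts' projections, is how parameter counts in the tens of billions can be cut without retraining.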