Envoid committed on
Commit 68ec764 · verified · 1 Parent(s): 0aebf81

Update README.md

Files changed (1)
  1. README.md +25 -3
README.md CHANGED
@@ -1,3 +1,25 @@
- ---
- license: cc-by-nc-4.0
- ---
+ ---
+ license: cc-by-nc-4.0
+ ---
+ # R1-Distill-Llama-8B-Anima10
+ ![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/646a30454c1cd18b4976a3f6/H-UpGrG7SyGPm5wltA7zo.jpeg)
+
+ ## This model is a work in progress.
+
+ This model is the result of 10 epochs of fine-tuning [deepseek-ai/DeepSeek-R1-Distill-Llama-8B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-8B) on a private corpus of 11 megabytes of hand-selected raw text, using a low learning rate and short token sequences.
+
+ The original intention was to influence the style of the model's thinking text, but it seems to have led to other, unintended results.
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/646a30454c1cd18b4976a3f6/PUZLozH89ug9pt_iaadSu.png)
+
+ It was originally trained for 3 epochs.
+
+ In testing, when asked "What is the fastest way to get around Europe?", it fell into an endless loop of recursive (but relevant) thinking.
+
+ Also noteworthy was the slow descent of the training loss once it reached around 3.5.
+
+ To explore these observations further, an additional 7 epochs of training were scheduled, and this model is the result.
+
+ It not only resolved the thinking loop on the Europe question but has also broken past some of the 'hard stops' originally trained into it.
+
+ The model is currently undergoing additional training.