Delta-Vector committed on
Commit f7778ff · verified · 1 Parent(s): 7434438

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -26,7 +26,7 @@ tags:
 
  Didn't really have any cool README ideas for this so we're just going with just whatever song i'm listening to rn and it happened to be `Baby i'm bleeding`
 
- Nevertheless, This is a finetune from the 32K context extended (or fixed?) Arcee GLM4 base - Trained shrimply with just the Tulu-SFT-Mixture *but* I removed Safety alignment examples. Came out pretty well, It uses chatML due to the GLM4(and other formats like it, Such as Dan-chat) giving me a headache. It's a decently competant assistant although I haven't done any testing on how well the model performs at longer-contexts, nor have i done any RL afterwards to fix up it's edges.
+ Nevertheless, This is a finetune from the 32K context extended (or fixed?) Arcee GLM4 base - Trained shrimply with just the Tulu-SFT-Mixture *but* I removed Safety alignment examples. Came out pretty well, It uses chatML due to the GLM4 Format giving me a headache. It's a decently competant assistant although I haven't done any testing on how well the model performs at longer-contexts, nor have i done any RL afterwards to fix up it's edges.
 
  Think it should be a decent base for any future finetunes, I felt that GLM4 really wasn't given the proper time of day and it's a way better base then any Qwen3 model.
 