What's new in 3.0 vs 2.0?
#1, opened by usrlocalben
The v2.0 data/readme is gone, so one can't compare them to see what changed. At a glance, the modelcard looks similar.
Sorry, I was getting really confused trying to maintain all the old versions.
The only difference is that about 1/3 of the data is now instruction data, instead of the earlier mix of 1/2 common-crawl and 1/2 the-stack:
### The following datasets were used to create a fine-tuning dataset of ~2.3B tokens:
- agentlans/common-crawl-sample
- bigcode/the-stack-smol-xl
- rombodawg/Everything_Instruct (NOTE: output field only)
From my experiments, this keeps the model from focusing too narrowly on coding tasks and gives a reasonable acceptance rate for normal, non-coding use.
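For anyone who wants rough numbers, here is a minimal sketch of what the two mixes would mean as per-source token budgets. It rests on assumptions not stated in the thread: that the three sources are split evenly in 3.0 (only "around 1/3 instruction data" is given), and that 2.0 used the same ~2.3B token total. The names `MIX_V2`, `MIX_V3`, and `token_budget` are hypothetical and not part of any training script.

```python
from fractions import Fraction

# "~2.3B tokens" from the model card; approximate, and assumed equal for 2.0 and 3.0.
TOTAL_TOKENS = 2_300_000_000

# Assumed 2.0 mix: half web text, half code.
MIX_V2 = {
    "agentlans/common-crawl-sample": Fraction(1, 2),
    "bigcode/the-stack-smol-xl": Fraction(1, 2),
}

# Assumed 3.0 mix: an even three-way split (only the instruction share is stated).
MIX_V3 = {
    "agentlans/common-crawl-sample": Fraction(1, 3),
    "bigcode/the-stack-smol-xl": Fraction(1, 3),
    "rombodawg/Everything_Instruct": Fraction(1, 3),  # output field only
}


def token_budget(mix, total=TOTAL_TOKENS):
    """Return the approximate token count per source, rounded down."""
    if sum(mix.values()) != 1:
        raise ValueError("mix shares must sum to 1")
    return {name: int(total * share) for name, share in mix.items()}


if __name__ == "__main__":
    for label, mix in (("2.0", MIX_V2), ("3.0", MIX_V3)):
        print(f"v{label}")
        for name, tokens in token_budget(mix).items():
            print(f"  {name}: ~{tokens / 1e9:.2f}B tokens")
```

Under these assumptions, the code share drops from about half of the data to about a third, which fits the goal of less coding-specific behavior.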
Understood, thanks.