
What's new in 3.0 vs 2.0?

#1
by usrlocalben - opened

The v2.0 data and README are gone, so there's no way to compare them and see what changed. At a glance, the model card looks similar.

Sorry, I was getting really confused trying to maintain all the old versions.

The only difference is that around 1/3 of the fine-tuning data is now instruction data:

### The following datasets were used to create a fine-tuning dataset of ~2.3B tokens:

- agentlans/common-crawl-sample
- bigcode/the-stack-smol-xl
- rombodawg/Everything_Instruct (NOTE: output field only)

instead of the previous mix of 1/2 common-crawl and 1/2 the-stack data.

From experimenting, this keeps the model from zeroing in too much on coding tasks and gives a reasonable acceptance rate for normal / non-coding use.
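
For anyone who wants to check this on their own prompts, here is a minimal sketch of how an acceptance rate could be computed from per-step counts. It assumes you can log, for each verification step, how many tokens the draft model proposed and how many the target model accepted. The function name `acceptance_rate` and the numbers in the usage lines are hypothetical, for illustration only; they are not measurements of this model.

```python
def acceptance_rate(steps):
    """Fraction of drafted tokens that the target model accepted.

    `steps` is an iterable of (drafted, accepted) pairs, one pair per
    verification step. This input format is an assumption for this sketch.
    """
    total_drafted = 0
    total_accepted = 0
    for drafted, accepted in steps:
        if drafted < 0 or accepted < 0 or accepted > drafted:
            raise ValueError(f"invalid step counts: {(drafted, accepted)}")
        total_drafted += drafted
        total_accepted += accepted
    # With no drafted tokens there is nothing to accept, so report 0.0.
    return total_accepted / total_drafted if total_drafted else 0.0


# Hypothetical logs (made-up numbers): 5 tokens drafted per step.
coding_steps = [(5, 4), (5, 5), (5, 3)]
chat_steps = [(5, 2), (5, 3), (5, 1)]

print("coding:", acceptance_rate(coding_steps))
print("chat:  ", acceptance_rate(chat_steps))
```

Comparing the two numbers on a coding prompt set and a general prompt set is one way to see whether a draft model is skewed toward code.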

Understood, thanks 👍
