jukofyork/Kimi-K2-Instruct-DRAFT-0.6B-v3.0

speculative-decoding

jukofyork committed on Aug 10

Commit 0b70710 · verified · 1 Parent(s): 2729ffc

Update README.md

Files changed (1)

README.md +2 -0

@@ -163,6 +163,8 @@ Operation completed successfully (ignore any 'segmentation fault' that follows!!
 **NOTE**: Due to the non-standard tokenizer, this needs the `--trust-remote-code` option.
+**NOTE**: I had to manually delete `"pad_token_id": 163839` from `config.json` to get it to match the tokeniser when used in `llama.cpp` as a draft model.
 ## 2. The following datasets were used to create a fine-tuning dataset of ~2.3B tokens:
 - [agentlans/common-crawl-sample](https://huggingface.co/datasets/agentlans/common-crawl-sample)
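
The manual `config.json` edit described in the added note could also be scripted. The following is a minimal sketch, not the author's actual procedure: it assumes `config.json` is a plain JSON file with `pad_token_id` as a top-level key, and the helper name `drop_pad_token_id` is hypothetical.

```python
import json
from pathlib import Path


def drop_pad_token_id(config_path):
    """Remove the top-level "pad_token_id" key from a JSON config file.

    Hypothetical helper: returns the removed value, or None if the key
    was not present (in which case the file is left untouched).
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    if "pad_token_id" not in config:
        return None

    removed = config.pop("pad_token_id")
    # Rewrite the file with the key removed; all other keys are preserved.
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return removed
```

Calling `drop_pad_token_id("config.json")` from the model directory would return the removed value (`163839` for the config described above) and rewrite the file in place, so keeping a backup copy first is a sensible precaution.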