This is [trashpanda-org/QwQ-32B-Snowdrop-v0](https://huggingface.co/trashpanda-org/QwQ-32B-Snowdrop-v0) with the `embed_tokens` and `lm_head` tensors replaced with the correctly-sized ones from [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct).

At the time of posting there's an ongoing issue where the Qwen2.5 embedding tensors have dimension `152064` (matching the vocab size stated in the config), but the included tokenizer actually defines fewer tokens than that (seemingly Qwen pre-initialized extra embedding rows for tokens to be added later). Some LLM software (e.g. Axolotl, Mergekit) runs an automated check that, on seeing the tokenizer vocab is smaller than the embedding size, resizes the embeddings to match, which breaks compatibility in some places.
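
You can observe the mismatch without downloading any weights; a minimal sketch using only the config and tokenizer:

```python
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-32B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")

# The config allocates 152064 embedding rows...
print(config.vocab_size)
# ...but the tokenizer defines fewer tokens than that.
print(len(tokenizer))
```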

(Why the instruct model and not QwQ? Because that's the tokenizer trashpanda was aiming for.)
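
The script below reproduces the fix: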

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# --- 1. Load both models ---
base_model_name = "Qwen/Qwen2.5-32B-Instruct"
finetuned_model_name = "trashpanda-org/QwQ-32B-Snowdrop-v0"

base_model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.bfloat16)
finetuned_model = AutoModelForCausalLM.from_pretrained(finetuned_model_name, torch_dtype=torch.bfloat16)

# --- 2. Get the embedding layers and resize the fine-tuned model's embeddings ---
base_embedding_layer = base_model.get_input_embeddings()
finetuned_model.resize_token_embeddings(base_embedding_layer.weight.size(0))  # Resize so copying works
finetuned_embedding_layer = finetuned_model.get_input_embeddings()

# The 32B model's embeddings are untied, so grab the output heads as well
base_lm_head = base_model.get_output_embeddings()
finetuned_lm_head = finetuned_model.get_output_embeddings()

# --- 3. Replace the embedding tensors (the core operation) ---
with torch.no_grad():  # Very important: no gradient tracking during this operation!
    finetuned_embedding_layer.weight.copy_(base_embedding_layer.weight)
    finetuned_lm_head.weight.copy_(base_lm_head.weight)

print(finetuned_model.get_input_embeddings().weight.shape)

# --- 4. Save the modified model ---
output_dir = "QwQ-32B-Snowdrop-v0-EmbedFix"
base_tokenizer = AutoTokenizer.from_pretrained(base_model_name)  # Get the tokenizer, too
finetuned_model.save_pretrained(output_dir)
base_tokenizer.save_pretrained(output_dir)

# --- 5. (Optional, but recommended) Test ---
# Load and test the modified model
modified_model = AutoModelForCausalLM.from_pretrained(output_dir, torch_dtype=torch.bfloat16)
modified_tokenizer = AutoTokenizer.from_pretrained(output_dir)

test_text = "This is a test sentence."
inputs = modified_tokenizer(test_text, return_tensors="pt")
with torch.no_grad():
    outputs = modified_model(**inputs)  # Forward pass
print(outputs)  # Success: the new model runs with no errors
```
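
If the fix worked, the `print` in step 3 should report `152064` rows, matching the base model's embedding size.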