This is [trashpanda-org/QwQ-32B-Snowdrop-v0](https://huggingface.co/trashpanda-org/QwQ-32B-Snowdrop-v0) with the `embed_tokens` and `lm_head` tensors replaced with the correctly-sized ones from [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct).

At the time of posting there's an ongoing issue where the Qwen2.5 embedding tensors have dimension `152064` (matching the vocab size stated in the config), but the included tokenizer actually defines fewer tokens than that (seemingly Qwen pre-initialized extra embedding rows for tokens to be added later). Some LLM software (e.g. Axolotl, Mergekit) runs an automated check that, on seeing the tokenizer vocab is smaller than the embedding size, resizes the embeddings to match, which breaks compatibility in some places.
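
You can observe the mismatch without downloading any weights; a minimal sketch using only the config and tokenizer:

```python
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-32B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B-Instruct")

# The config allocates 152064 embedding rows...
print(config.vocab_size)
# ...but the tokenizer defines fewer tokens than that.
print(len(tokenizer))
```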

(Why the instruct model and not QwQ? Because that's the tokenizer trashpanda was aiming for.)
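
The script below reproduces the fix: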

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# --- 1. Load both models ---
base_model_name = "Qwen/Qwen2.5-32B-Instruct"
finetuned_model_name = "trashpanda-org/QwQ-32B-Snowdrop-v0"

base_model = AutoModelForCausalLM.from_pretrained(base_model_name, torch_dtype=torch.bfloat16)
finetuned_model = AutoModelForCausalLM.from_pretrained(finetuned_model_name, torch_dtype=torch.bfloat16)

# --- 2. Get the embedding layers and resize the fine-tuned model's embeddings ---
base_embedding_layer = base_model.get_input_embeddings()
finetuned_model.resize_token_embeddings(base_embedding_layer.weight.size(0))  # Resize so copying works
finetuned_embedding_layer = finetuned_model.get_input_embeddings()

# The 32B model's embeddings are untied, so grab the output heads as well
base_lm_head = base_model.get_output_embeddings()
finetuned_lm_head = finetuned_model.get_output_embeddings()

# --- 3. Replace the embedding tensors (the core operation) ---
with torch.no_grad():  # Very important: no gradient tracking during this operation!
    finetuned_embedding_layer.weight.copy_(base_embedding_layer.weight)
    finetuned_lm_head.weight.copy_(base_lm_head.weight)

print(finetuned_model.get_input_embeddings().weight.shape)

# --- 4. Save the modified model ---
output_dir = "QwQ-32B-Snowdrop-v0-EmbedFix"
base_tokenizer = AutoTokenizer.from_pretrained(base_model_name)  # Get the tokenizer, too
finetuned_model.save_pretrained(output_dir)
base_tokenizer.save_pretrained(output_dir)

# --- 5. (Optional, but recommended) Test ---
# Load and test the modified model
modified_model = AutoModelForCausalLM.from_pretrained(output_dir, torch_dtype=torch.bfloat16)
modified_tokenizer = AutoTokenizer.from_pretrained(output_dir)

test_text = "This is a test sentence."
inputs = modified_tokenizer(test_text, return_tensors="pt")
with torch.no_grad():
    outputs = modified_model(**inputs)  # Forward pass
print(outputs)  # Success: the new model runs with no errors
```
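
If the fix worked, the `print` in step 3 should report `152064` rows, matching the base model's embedding size.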