OpenSeek-Small-v1 Model Documentation

Overview

OpenSeek-Small-v1 is the initial production model of the OpenSeek project.

  • Uses a DeepSeek-V3-like Mixture-of-Experts (MoE) architecture.
  • Comprises 1.4 billion total parameters, of which 0.4 billion are activated per token.
  • Trained on 720 billion tokens.
  • Does not work in its stock form without the fixes applied in this repository.
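
To put the figures above in perspective, the short sketch below works out the share of parameters that is active per token and the number of training tokens per parameter. The constant and function names are illustrative only and are not part of the OpenSeek code.

```python
# Figures quoted in the overview above.
TOTAL_PARAMS = 1.4e9    # total parameters
ACTIVE_PARAMS = 0.4e9   # parameters activated per token
TRAIN_TOKENS = 720e9    # training tokens


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters used for each token."""
    return active / total


def tokens_per_param(tokens: float, params: float) -> float:
    """Training tokens seen per (total) parameter."""
    return tokens / params


print(f"Active share: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")            # 28.6%
print(f"Tokens per parameter: {tokens_per_param(TRAIN_TOKENS, TOTAL_PARAMS):.0f}")   # 514
```

Roughly 29% of the weights take part in each forward pass, which is the usual trade-off of an MoE design: the capacity of the full parameter count at the per-token compute cost of the activated subset.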

Key Fixes in this repository:

  • Fixed broken position embeddings.
  • Fixed fundamental incompatibilities between the DeepSeek-V3 model and the Qwen tokenizer.

Usage Instructions

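from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Robertp423/OpenSeek-Fixed"

# trust_remote_code=True lets transformers run the custom code shipped in this repository.
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)

inputs = tokenizer("The future of AI is", return_tensors="pt")
# Critical fix: drop token_type_ids (if the tokenizer returns them) before calling generate().
inputs.pop("token_type_ids", None)

# max_length counts the prompt tokens plus the newly generated tokens.
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))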
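
If you call the model from several places, the token_type_ids workaround can be wrapped in a small helper so it is not forgotten. The sketch below is a minimal example under that assumption; `drop_unsupported_keys` is a hypothetical name, not part of transformers or of this repository, and it only relies on the inputs behaving like a dictionary (as the tokenizer output does).

```python
from typing import Any, Dict, Iterable, Mapping


def drop_unsupported_keys(
    inputs: Mapping[str, Any],
    unsupported: Iterable[str] = ("token_type_ids",),
) -> Dict[str, Any]:
    """Return a copy of `inputs` without the keys listed in `unsupported`.

    The original mapping is left untouched, so the tokenizer output
    can still be inspected afterwards.
    """
    blocked = set(unsupported)
    return {key: value for key, value in inputs.items() if key not in blocked}


# Stand-in for a tokenizer result; real values would be tensors.
example = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}
cleaned = drop_unsupported_keys(example)
print(sorted(cleaned))  # ['attention_mask', 'input_ids']
```

With the model and tokenizer loaded as above, this would be used as `model.generate(**drop_unsupported_keys(inputs), max_length=50)`.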