
OpenSeek-Small v1 Model Documentation

Overview

OpenSeek-Small-v1 is the initial production model of the OpenSeek project.

  • Utilizes a DeepSeek-V3-style Mixture-of-Experts (MoE) architecture.
  • Comprises 1.4 billion total parameters, with 0.4 billion activated per token.
  • Trained on 720 billion tokens.
  • Non-functional in its stock form; this repository applies the fixes below.
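The total/active split above is the defining trait of an MoE model: only a subset of experts runs for each token. A back-of-the-envelope sketch of the activation ratio, derived from the figures in the card (the ratio itself is computed here, not an official number):

```python
# Rough MoE arithmetic for OpenSeek-Small-v1, using the parameter counts above.
total_params = 1.4e9    # all experts counted
active_params = 0.4e9   # parameters actually exercised per token
activation_ratio = active_params / total_params
print(f"~{activation_ratio:.0%} of parameters are active per token")
```

So each forward pass touches roughly 29% of the weights, which is what lets a 1.4B-parameter model run with the compute cost of a much smaller dense one.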

Key Fixes in this repository:

  • Fixed broken position embeddings.
  • Resolved fundamental incompatibilities between the DeepSeek-V3 model and the Qwen tokenizer.
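One symptom of the tokenizer incompatibility is an extra `token_type_ids` field that the DeepSeek-V3-style forward pass does not accept. A minimal sketch of the workaround using plain dicts so no model download is needed (the helper name is hypothetical, not part of this repo):

```python
def strip_token_type_ids(encoding):
    """Drop token_type_ids (emitted by the Qwen tokenizer) before calling
    model.generate(), since the model's forward() does not accept them."""
    encoding = dict(encoding)          # copy so the caller's batch is untouched
    encoding.pop("token_type_ids", None)
    return encoding

# Hypothetical tokenizer output for illustration
batch = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}
clean = strip_token_type_ids(batch)
print(sorted(clean))  # ['attention_mask', 'input_ids']
```

The same pop-before-generate pattern appears inline in the usage snippet below.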

Usage Instructions

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Robertp423/OpenSeek-Fixed", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("Robertp423/OpenSeek-Fixed", trust_remote_code=True)

inputs = tokenizer("The future of AI is", return_tensors="pt")
inputs.pop("token_type_ids", None)  # Critical fix: the model's forward() rejects the Qwen tokenizer's token_type_ids
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```