# Zen-Next (80B)

*Part of the Zen AI Model Family*
## Model Description

- **Parameters:** 80B
- **Base Model:** Qwen/Qwen2.5-72B-Instruct
- **Specialization:** Complex reasoning & extended context
- **Training:** Flagship training with constitutional AI
- **Context:** 32K-128K tokens
- **Thinking:** Up to 1,000,000 tokens
## Files in This Repository

This repository contains **all** formats and quantizations:
### SafeTensors (Original)

- `model.safetensors` - Full-precision weights
- `config.json` - Model configuration
- `tokenizer.json` - Fast tokenizer
### GGUF Quantized

- `zen-next-80b-instruct-Q4_K_M.gguf` - 4-bit (recommended; see the download sketch below)
- `zen-next-80b-instruct-Q5_K_M.gguf` - 5-bit (balanced)
- `zen-next-80b-instruct-Q8_0.gguf` - 8-bit (high quality)
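If you only need one quantization, you can fetch a single file rather than cloning the whole repository. A minimal sketch using `huggingface_hub` (the filename matches the Q4_K_M entry listed above):

```python
# Sketch: download only the 4-bit GGUF file from this repository.
# Requires: pip install huggingface_hub
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="zenlm/zen-next-80b-instruct",
    filename="zen-next-80b-instruct-Q4_K_M.gguf",
)
print(gguf_path)  # local cache path, ready to pass to llama.cpp
```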
### MLX (Apple Silicon)

- `mlx-4bit/` - 4-bit quantized for M-series
- `mlx-8bit/` - 8-bit quantized for M-series
## Performance

| Benchmark | Score | Rank |
|-----------|-------|------|
| MMLU      | 75.6% | Top 10% |
| GSM8K     | 82.1% | Top 15% |
| HumanEval | 61.7% | Top 20% |
## Quick Start

### Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("zenlm/zen-next-80b-instruct")
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-next-80b-instruct")

# With thinking mode
messages = [{"role": "user", "content": "Your question here"}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,
)
```
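The snippet above only builds the formatted prompt string. A minimal sketch of generating a reply from it (the generation settings are assumptions, and a model of this size typically needs multiple GPUs or quantized weights):

```python
# Sketch: generate from the chat-formatted prompt built above.
# max_new_tokens is an assumed value; loading with device_map="auto"
# and torch_dtype="auto" is advisable for a model of this size.
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```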
### GGUF with llama.cpp

```bash
# Newer llama.cpp builds name this binary `llama-cli`.
./main -m zen-next-80b-instruct-Q4_K_M.gguf -p "Your prompt" -n 512
```
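If you prefer Python, the same GGUF file can be run through the `llama-cpp-python` bindings. A minimal sketch (the `n_ctx` value here is an assumption, not a tuned setting):

```python
# Sketch: run the Q4_K_M GGUF via llama-cpp-python
# (pip install llama-cpp-python). n_ctx is an assumed value.
from llama_cpp import Llama

llm = Llama(model_path="zen-next-80b-instruct-Q4_K_M.gguf", n_ctx=8192)
out = llm("Your prompt", max_tokens=512)
print(out["choices"][0]["text"])
```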
### MLX for Apple Silicon

```python
from mlx_lm import load, generate

model, tokenizer = load("zenlm/zen-next-80b-instruct")
response = generate(model, tokenizer, "Your prompt", max_tokens=200)
```
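For chat-style prompts on MLX, the tokenizer's chat template can be applied before calling `generate`. A sketch, assuming the `enable_thinking` flag from the Transformers example is also wired into this repository's chat template:

```python
# Sketch: chat-formatted prompt with mlx_lm. enable_thinking is an
# assumption carried over from the Transformers example above.
from mlx_lm import load, generate

model, tokenizer = load("zenlm/zen-next-80b-instruct")
messages = [{"role": "user", "content": "Your question here"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)
print(generate(model, tokenizer, prompt, max_tokens=200))
```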
## Unique Training Background

**Flagship training with constitutional AI.** This model was optimized for complex reasoning and extended context, with careful attention to:

- Inference efficiency
- Memory footprint
- Quality preservation
- Thinking capabilities
*Part of the Zen Family • Collection • GitHub*