Zen Omni 30B Thinking

An advanced multimodal model from the Zen family with chain-of-thought thinking and audio capabilities.

Model Details

  • Architecture: Qwen2-based with multimodal extensions
  • Parameters: 31.7B
  • Context Length: 32,768 tokens
  • Modalities: Text, Audio, Thinking
  • Hidden Size: 5,120
  • Layers: 64
  • Attention Heads: 40
  • Developer: Hanzo AI

Features

  • Thinking Module: Chain-of-thought reasoning capabilities
  • Audio Tower: Audio processing and understanding
  • Multimodal Integration: Seamless text and audio processing

Usage

PyTorch

from transformers import AutoModelForCausalLM, AutoTokenizer

# device_map="auto" requires the accelerate package; torch_dtype="auto" loads the
# shipped BF16 weights instead of upcasting them to FP32.
model = AutoModelForCausalLM.from_pretrained(
    "zenlm/zen-omni-30b-thinking",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-omni-30b-thinking")

# Generate text
prompt = "Explain quantum computing"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
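
For chat-style use, the tokenizer's chat template (if one ships with the repository) can be applied instead of a raw prompt. The sketch below also shows how a reasoning block could be split from the final answer; the <think>...</think> tag format is an assumption, not something this card documents.

# Minimal chat-style sketch; assumes the repo ships a chat template and that the
# model may emit reasoning inside <think>...</think> tags (both are assumptions).
messages = [{"role": "user", "content": "Explain quantum computing"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=512)
text = tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True)

if "</think>" in text:
    # Separate the (hypothetical) thinking block from the answer.
    reasoning, _, answer = text.partition("</think>")
    print("Reasoning:", reasoning.replace("<think>", "").strip())
    print("Answer:", answer.strip())
else:
    print(text)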

Available Formats

  • PyTorch: Default safetensors format (16 shards)
  • GGUF: Coming soon
  • MLX: Coming soon

Hardware Requirements

  • VRAM: ~64GB for full precision (for lower-memory loading, see the quantization sketch after this list)
  • RAM: 128GB recommended
  • Storage: ~60GB for model files
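
If a GPU with ~64GB of VRAM is not available, the model can typically be loaded with 4-bit weight quantization to cut memory to roughly a quarter. This path is not covered by the card; it requires the bitsandbytes package and a CUDA GPU, so treat the sketch below as an assumption rather than a supported configuration.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 quantization via bitsandbytes. The resulting footprint (very roughly
# ~18-20GB for 31.7B params) is an estimate, not a figure from this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "zenlm/zen-omni-30b-thinking",
    quantization_config=bnb_config,
    device_map="auto",
)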

Training

Fine-tuned with:

  • Zen identity
  • Multimodal understanding
  • Chain-of-thought reasoning
  • Audio processing capabilities

Model Components

  • thinker.*: Thinking/reasoning module
  • audio_tower.*: Audio processing layers
  • Standard transformer layers for text generation (a quick way to inspect these prefixes is sketched after this list)
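
One way to confirm which parameters belong to each component is to group the model's parameter names by their top-level prefix. This is a generic transformers/PyTorch sketch, not part of the card, and it assumes the checkpoint's weight names actually begin with the prefixes listed above.

from collections import Counter
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "zenlm/zen-omni-30b-thinking", torch_dtype="auto"
)

# Sum parameter counts under each top-level name prefix
# (e.g. "thinker", "audio_tower").
totals = Counter()
for name, param in model.named_parameters():
    totals[name.split(".")[0]] += param.numel()

for prefix, count in totals.most_common():
    print(f"{prefix}: {count / 1e9:.2f}B params")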

License

Apache 2.0
