πŸ”₯ Phoenix β€” Fast Reasoning Qwen3-32B

Model Name: Daemontatox/Phoenix
Developed by: Daemontatox
License: Apache-2.0
Base Model: unsloth/qwen3-32b
Training Stack: Unsloth + Huggingface TRL


⚑ What is Phoenix?

Phoenix is a fine-tuned Qwen3-32B model designed for rapid reasoning, concise low-verbosity outputs, and high-quality results. It is well suited for chat agents, reasoning backends, and any application where speed and precision are critical.


βœ… Key Features

  • πŸ” 2Γ— faster training with Unsloth
  • ⏱️ Reduced token latency without compromising answer quality
  • 🎯 Tuned for instruction-following and reasoning clarity
  • 🧱 Works with transformers, TGI (see the sketch after this list), and Hugging Face Inference API
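
The feature list mentions TGI compatibility. Below is a minimal sketch of querying a locally hosted Text Generation Inference endpoint with the huggingface_hub client; the Docker command, port, and endpoint URL are assumptions and depend on how you launch the server.

from huggingface_hub import InferenceClient

# Assumes a TGI server was started locally, for example:
#   docker run --gpus all -p 8080:80 ghcr.io/huggingface/text-generation-inference \
#       --model-id Daemontatox/Phoenix
# The port (8080) is an assumption; adjust to your deployment.
client = InferenceClient("http://localhost:8080")

output = client.text_generation(
    "Explain the concept of emergence in complex systems in simple terms.",
    max_new_tokens=150,
    temperature=0.7,
)
print(output)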

πŸ§ͺ Inference Code (Transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Daemontatox/Phoenix"

# Load the tokenizer and model; bfloat16 plus device_map="auto" spreads the
# 32B weights across the available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

prompt = "Explain the concept of emergence in complex systems in simple terms."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True is required for the temperature setting to take effect.
outputs = model.generate(**inputs, max_new_tokens=150, temperature=0.7, do_sample=True)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
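
Qwen3-based models are usually served as chat models. If you prefer the chat format over a raw prompt, the following is a minimal sketch that applies the tokenizer's chat template; the enable_thinking flag is part of Qwen3's upstream template and is assumed (not confirmed) to carry over to this finetune.

# Reuses the tokenizer and model loaded above.
messages = [
    {"role": "user", "content": "Explain the concept of emergence in complex systems in simple terms."}
]

# apply_chat_template builds the Qwen3 prompt format; enable_thinking=False
# skips the reasoning block (assumption: this finetune keeps Qwen3's template).
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))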

🌐 Inference via Hugging Face API

import requests

API_URL = "https://api-inference.huggingface.co/models/Daemontatox/Phoenix"
headers = {"Authorization": "Bearer YOUR_HF_API_TOKEN"}

data = {
  "inputs": "Explain the concept of emergence in complex systems in simple terms.",
  "parameters": {
    "temperature": 0.7,
    "max_new_tokens": 150
  }
}

response = requests.post(API_URL, headers=headers, json=data)
print(response.json()[0]["generated_text"])

⚠️ Replace YOUR_HF_API_TOKEN with your Hugging Face access token.


🧠 Sample Output

Prompt:

"Explain the concept of emergence in complex systems in simple terms."

Output (Phoenix):

"Emergence is when many simple parts work together and create something more complex. For example, birds flying in a flock follow simple rules, but the group moves like one unit. That larger pattern 'emerges' from simple behavior."


πŸ“‰ Known Limitations

  • Large VRAM required for local inference (~64GB+); a quantized-loading sketch follows this list
  • Not tuned for multilingual inputs
  • May not perform well on long-form CoT problems requiring step-wise thought

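To work around the VRAM requirement, one option is loading the model in 4-bit with bitsandbytes. This is a minimal sketch, assuming bitsandbytes is installed and the checkpoint quantizes cleanly; expect some quality loss versus BF16.

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Assumption: bitsandbytes is installed (pip install bitsandbytes).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Phoenix")
model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/Phoenix",
    quantization_config=bnb_config,
    device_map="auto",
)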

πŸ“„ Citation

@misc{daemontatox2025phoenix,
  title  = {Phoenix: Fast Reasoning Qwen3-32B Finetune},
  author = {Daemontatox},
  year   = {2025},
  note   = {Trained with Unsloth and Huggingface TRL},
  url    = {https://huggingface.co/Daemontatox/Phoenix}
}

