πŸ”₯ Phoenix β€” Fast Reasoning Qwen3-32B

Model Name: Daemontatox/Phoenix
Developed by: Daemontatox
License: Apache-2.0
Base Model: unsloth/qwen3-32b
Training Stack: Unsloth + Huggingface TRL


⚑ What is Phoenix?

Phoenix is a fine-tuned Qwen3-32B model designed for rapid reasoning, concise low-verbosity outputs, and high-quality results. It is well suited for chat agents, reasoning backends, and any application where speed and precision are critical.


βœ… Key Features

  • πŸ” 2Γ— faster training with Unsloth
  • ⏱️ Reduced token latency without compromising answer quality
  • 🎯 Tuned for instruction-following and reasoning clarity
  • 🧱 Works with transformers, TGI (see the sketch after this list), and Hugging Face Inference API
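
The feature list mentions TGI compatibility. Below is a minimal sketch of querying a locally hosted Text Generation Inference endpoint with the huggingface_hub client; the Docker command, port, and endpoint URL are assumptions and depend on how you launch the server.

from huggingface_hub import InferenceClient

# Assumes a TGI server was started locally, for example:
#   docker run --gpus all -p 8080:80 ghcr.io/huggingface/text-generation-inference \
#       --model-id Daemontatox/Phoenix
# The port (8080) is an assumption; adjust to your deployment.
client = InferenceClient("http://localhost:8080")

output = client.text_generation(
    "Explain the concept of emergence in complex systems in simple terms.",
    max_new_tokens=150,
    temperature=0.7,
)
print(output)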

πŸ§ͺ Inference Code (Transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Daemontatox/Phoenix"

# Load the tokenizer and model; bfloat16 plus device_map="auto" spreads the
# 32B weights across the available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

prompt = "Explain the concept of emergence in complex systems in simple terms."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True is required for the temperature setting to take effect.
outputs = model.generate(**inputs, max_new_tokens=150, temperature=0.7, do_sample=True)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
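
Qwen3-based models are usually served as chat models. If you prefer the chat format over a raw prompt, the following is a minimal sketch that applies the tokenizer's chat template; the enable_thinking flag is part of Qwen3's upstream template and is assumed (not confirmed) to carry over to this finetune.

# Reuses the tokenizer and model loaded above.
messages = [
    {"role": "user", "content": "Explain the concept of emergence in complex systems in simple terms."}
]

# apply_chat_template builds the Qwen3 prompt format; enable_thinking=False
# skips the reasoning block (assumption: this finetune keeps Qwen3's template).
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, temperature=0.7, do_sample=True)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))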

🌐 Inference via Hugging Face API

import requests

API_URL = "https://api-inference.huggingface.co/models/Daemontatox/Phoenix"
headers = {"Authorization": "Bearer YOUR_HF_API_TOKEN"}

data = {
  "inputs": "Explain the concept of emergence in complex systems in simple terms.",
  "parameters": {
    "temperature": 0.7,
    "max_new_tokens": 150
  }
}

response = requests.post(API_URL, headers=headers, json=data)
print(response.json()[0]["generated_text"])

⚠️ Replace YOUR_HF_API_TOKEN with your Hugging Face access token.


🧠 Sample Output

Prompt:

"Explain the concept of emergence in complex systems in simple terms."

Output (Phoenix):

"Emergence is when many simple parts work together and create something more complex. For example, birds flying in a flock follow simple rules, but the group moves like one unit. That larger pattern 'emerges' from simple behavior."


πŸ“‰ Known Limitations

  • Large VRAM required for local inference (~64GB+); a quantized-loading sketch follows this list
  • Not tuned for multilingual inputs
  • May not perform well on long-form CoT problems requiring step-wise thought

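To work around the VRAM requirement, one option is loading the model in 4-bit with bitsandbytes. This is a minimal sketch, assuming bitsandbytes is installed and the checkpoint quantizes cleanly; expect some quality loss versus BF16.

from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Assumption: bitsandbytes is installed (pip install bitsandbytes).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Phoenix")
model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/Phoenix",
    quantization_config=bnb_config,
    device_map="auto",
)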

πŸ“„ Citation

@misc{daemontatox2025phoenix,
  title  = {Phoenix: Fast Reasoning Qwen3-32B Finetune},
  author = {Daemontatox},
  year   = {2025},
  note   = {Trained with Unsloth and Huggingface TRL},
  url    = {https://huggingface.co/Daemontatox/Phoenix}
}

