# Phoenix: Fast Reasoning Qwen3-32B
- **Model Name:** Daemontatox/Phoenix
- **Developed by:** Daemontatox
- **License:** Apache-2.0
- **Base Model:** unsloth/qwen3-32b
- **Training Stack:** Unsloth + Hugging Face TRL
## What is Phoenix?
Phoenix is a fine-tuned Qwen3-32B model designed for rapid reasoning, low token verbosity, and high-quality results. It is well suited to chat agents, reasoning backends, and any application where speed and precision are critical.
## Key Features
- 2× faster training with Unsloth
- Reduced token latency without compromising answer quality
- Tuned for instruction-following and reasoning clarity
- Works with `transformers`, TGI, and the Hugging Face Inference API (a TGI serving sketch follows this list)
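
The snippet below is a minimal sketch of querying Phoenix served behind TGI's `/generate` endpoint. The docker launch command, local port 8080, and generation parameters are illustrative assumptions, not a documented deployment.

```python
# Assumed launch command (adjust GPUs, shards, and ports to your hardware):
#   docker run --gpus all -p 8080:80 ghcr.io/huggingface/text-generation-inference:latest \
#       --model-id Daemontatox/Phoenix
import requests

TGI_URL = "http://localhost:8080/generate"  # assumed local TGI endpoint

payload = {
    "inputs": "Explain the concept of emergence in complex systems in simple terms.",
    "parameters": {"max_new_tokens": 150, "temperature": 0.7},
}

resp = requests.post(TGI_URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["generated_text"])
```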
## Inference Code (Transformers)
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Daemontatox/Phoenix"

# Load tokenizer and model (bf16 weights, sharded across available GPUs)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

prompt = "Explain the concept of emergence in complex systems in simple terms."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# do_sample=True is needed for the temperature setting to take effect
outputs = model.generate(**inputs, max_new_tokens=150, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
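
For multi-turn use, Qwen3-style models are typically prompted through the tokenizer's chat template rather than raw text. The sketch below reuses the `tokenizer` and `model` loaded above and assumes Phoenix preserves the base Qwen3 chat template; the system prompt is illustrative.

```python
# Build a chat-formatted prompt (assumption: the base Qwen3 chat template is preserved)
messages = [
    {"role": "system", "content": "You are a concise reasoning assistant."},
    {"role": "user", "content": "Explain emergence in complex systems in simple terms."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=150, temperature=0.7, do_sample=True)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```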
## Inference via Hugging Face Inference API
```python
import requests

API_URL = "https://api-inference.huggingface.co/models/Daemontatox/Phoenix"
headers = {"Authorization": "Bearer YOUR_HF_API_TOKEN"}

data = {
    "inputs": "Explain the concept of emergence in complex systems in simple terms.",
    "parameters": {
        "temperature": 0.7,
        "max_new_tokens": 150,
    },
}

response = requests.post(API_URL, headers=headers, json=data)
print(response.json()[0]["generated_text"])
```
⚠️ Replace `YOUR_HF_API_TOKEN` with your Hugging Face access token.
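
As an alternative to raw HTTP calls, the `huggingface_hub` client library wraps the same Inference API. The sketch below is an illustration using its `text_generation` helper, with the token and parameters mirroring the `requests` example above.

```python
from huggingface_hub import InferenceClient

# Same token placeholder as above; keep real tokens out of source code
client = InferenceClient(model="Daemontatox/Phoenix", token="YOUR_HF_API_TOKEN")

output = client.text_generation(
    "Explain the concept of emergence in complex systems in simple terms.",
    max_new_tokens=150,
    temperature=0.7,
)
print(output)
```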
## Sample Output
**Prompt:**
"Explain the concept of emergence in complex systems in simple terms."

**Output (Phoenix):**
"Emergence is when many simple parts work together and create something more complex. For example, birds flying in a flock follow simple rules, but the group moves like one unit. That larger pattern 'emerges' from simple behavior."
## Known Limitations
- Requires large VRAM for local inference (~64 GB+); see the quantized-loading sketch after this list
- Not tuned for multilingual inputs
- May underperform on long-form chain-of-thought problems that require extended step-wise reasoning
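
To bring the memory footprint down on smaller GPUs, 4-bit quantized loading via bitsandbytes is one option. This is a minimal sketch under the assumption that bitsandbytes is installed and a CUDA GPU is available; it is not a configuration validated by the author, and quantization may reduce reasoning quality.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# 4-bit NF4 quantization with bf16 compute (illustrative settings)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained("Daemontatox/Phoenix", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Daemontatox/Phoenix",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```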
## Citation
```bibtex
@misc{daemontatox2025phoenix,
  title={Phoenix: Fast Reasoning Qwen3-32B Finetune},
  author={Daemontatox},
  year={2025},
  note={Trained with Unsloth and Hugging Face TRL},
  url={https://huggingface.co/Daemontatox/Phoenix}
}
```