
halley-ai/Qwen3-Next-80B-A3B-Instruct-MLX-5bit-gs32
Text Generation • 80B • Updated • 46 • 1
Text Generation & Chat Assistants; Model Compression & Quantization (Q4/Q6/Q8, gs32); Inference & Serving (on-prem, low-latency); RAG / Retrieval; Agents & Tool Use; Distillation / LoRA / Fine-tuning