Qwen3-53B-A3B-2507-THINKING-TOTAL-RECALL-v2-MASTER-CODER-1m-dwq5-mlx

Set only as much context as you need.

If your app takes 15k context to draft, set the context to 32k; the model will focus on the task and the available timeline.

Increase the context as you move to production-ready code.

If you start with a 1M context, it will plan accordingly :)
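
The sizing advice above can be sketched as a small helper. This is only an illustration and not part of mlx-lm: the function name `pick_context`, the tier list, and the 2x headroom factor are all assumptions, chosen so that a 15k draft maps to 32k as in the example above.

```python
# Hypothetical helper: choose the smallest context window that fits the job.
# The tiers and the 2x headroom are assumptions, not values from mlx-lm.
CONTEXT_TIERS = [8_192, 16_384, 32_768, 65_536, 131_072, 262_144, 524_288, 1_048_576]


def pick_context(needed_tokens: int, headroom: float = 2.0) -> int:
    """Return the smallest tier that holds needed_tokens * headroom."""
    if needed_tokens <= 0:
        raise ValueError("needed_tokens must be positive")
    target = needed_tokens * headroom
    for tier in CONTEXT_TIERS:
        if tier >= target:
            return tier
    # Nothing is large enough: fall back to the largest tier (1M).
    return CONTEXT_TIERS[-1]


print(pick_context(15_000))   # 32768
print(pick_context(100_000))  # 262144
```

The returned value is what you would enter as the context length in whatever runtime hosts the model; raise `needed_tokens` as the project grows toward production-ready code.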

This model has a me/now personality: sharp wording, sometimes sassy.

This model Qwen3-53B-A3B-2507-THINKING-TOTAL-RECALL-v2-MASTER-CODER-1m-dwq5-mlx was converted to MLX format from DavidAU/Qwen3-53B-A3B-2507-THINKING-TOTAL-RECALL-v2-MASTER-CODER using mlx-lm version 0.26.1.

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

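# Load by full Hugging Face repo id (the nightmedia/ prefix), or pass a local path instead.
model, tokenizer = load("nightmedia/Qwen3-53B-A3B-2507-THINKING-TOTAL-RECALL-v2-MASTER-CODER-1m-dwq5-mlx")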

prompt = "hello"

if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
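
Because this is a thinking model, `response` will normally hold the reasoning followed by the final answer. Below is a minimal sketch for separating the two. It assumes the reasoning ends with a `</think>` tag, as in the Qwen3 Thinking chat format; the helper name `split_thinking` is hypothetical, and the tag convention should be checked against this model's actual output.

```python
def split_thinking(response: str, close_tag: str = "</think>") -> tuple[str, str]:
    """Split a response into (reasoning, answer) at the last closing tag."""
    reasoning, tag, answer = response.rpartition(close_tag)
    if not tag:
        # No closing tag found: treat the whole response as the answer.
        return "", response.strip()
    # The opening <think> may be missing from the output if the chat
    # template already emitted it, so remove it only when present.
    reasoning = reasoning.replace("<think>", "", 1).strip()
    return reasoning, answer.strip()


sample = "<think>The user greeted me.</think>\n\nHello! How can I help?"
reasoning, answer = split_thinking(sample)
print(answer)  # Hello! How can I help?
```

Splitting at the last closing tag keeps the answer intact even if the reasoning itself mentions the tag.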


Format: Safetensors

Model size: 53B params

Tensor type: BF16, U32

Model tree for nightmedia/Qwen3-53B-A3B-2507-THINKING-TOTAL-RECALL-v2-MASTER-CODER-1m-dwq5-mlx

Base model: Qwen/Qwen3-30B-A3B-Thinking-2507

Finetuned: DavidAU/Qwen3-53B-A3B-2507-THINKING-TOTAL-RECALL-v2-MASTER-CODER

Quantized (6): this model