Simply out of curiosity.

by 1TBGPU4EVR

Why did you wall this MoE model, considering you didn't wall the other 50+ you've abliterated?
Thanks. Great models. I wish you had kept the vision transformer in the VL models though :) Some of those would make killer inference agents.

This is just one attempt at an idea. We also tried other MoE models, but Qwen3MoE performed the best.

This model can activate either a single expert or multiple experts simultaneously, which is different from Qwen3MoE's activation method. The parameters for simultaneous activation can be adjusted; see huihui-ai/Huihui-MoE-23B-A4B-abliterated for reference. A minimal sketch of how that adjustment might look is shown below.
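For anyone who wants to try this, the sketch below loads the model with transformers and changes the number of simultaneously activated experts via the config before loading weights. The field name `num_experts_per_tok` is an assumption based on common MoE configs such as Qwen3MoE; check the actual config of huihui-ai/Huihui-MoE-23B-A4B-abliterated for the real parameter name and valid range.

```python
# Sketch only: the config field `num_experts_per_tok` is assumed, not confirmed
# by this thread. Inspect the model's config.json for the actual name.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_id = "huihui-ai/Huihui-MoE-23B-A4B-abliterated"

config = AutoConfig.from_pretrained(model_id)
# 1 = activate a single expert per token; >1 = activate several experts simultaneously.
config.num_experts_per_tok = 2

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    config=config,
    torch_dtype="auto",
    device_map="auto",
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```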
