jayzou3773/qwen3_5-moe-expert_drop-pure_gradient_pruning-r128-s1k-128samples-thinking 19B • Updated Apr 29 • 7
jayzou3773/qwen3_5-moe-expert_drop-pure_expert_gradient_pruning-r128-s1k-128samples-thinking 19B • Updated Apr 29 • 7
jayzou3773/qwen3_5-moe-expert_drop-layerwise_pruning-r128-s1k-128samples-thinking 19B • Updated Apr 29 • 6
jayzou3773/qwen3_5-moe-expert_drop-bias_pruning-r128-s1k-128samples-thinking 19B • Updated Apr 29 • 868
jayzou3773/qwen3-moe-expert_drop-pure_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 66
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 97
jayzou3773/qwen3-moe-expert_drop-layerwise_pruning-r64-s1k-128samples-thinking 16B • Updated Apr 28 • 110
jayzou3773/qwen3_5-moe-expert_drop-weight_magnitude_pruning-r128-s1k-128samples 19B • Updated Apr 25 • 9
jayzou3773/qwen3_5-moe-expert_drop-pure_gradient_pruning-r128-s1k-128samples 19B • Updated Apr 25 • 8
jayzou3773/qwen3_5-moe-expert_drop-pure_expert_gradient_pruning-r128-s1k-128samples 19B • Updated Apr 25 • 39
jayzou3773/qwen3-moe-expert_drop-weight_magnitude_pruning-r64-s1k-128samples 16B • Updated Apr 25 • 23
jayzou3773/qwen3-moe-expert_drop-pure_expert_gradient_pruning-r64-s1k-128samples 16B • Updated Apr 25 • 25