Update README.md
README.md CHANGED
@@ -24,9 +24,9 @@ This should be the start of a new series of *hopefully optimal* NVFP4 quantizati
 |-----------|--------|
 | Base model | GLM-4.6 |
 | Quantization | NVFP4 (FP4 microscaling, block = 16, scale = E4M3) |
-| Method | Post-Training Quantization with
-| Toolchain |
-| Hardware target | NVIDIA Blackwell / GB200 Tensor Cores |
+| Method | Post-Training Quantization with LLM Compressor |
+| Toolchain | LLM Compressor |
+| Hardware target | NVIDIA Blackwell (untested on RTX cards) / GB200 Tensor Cores |
 | Precision | Weights & activations = FP4 • Scales = FP8 (E4M3) |
 | Maintainer | **REMSP.DEV** |

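The diff above names LLM Compressor as the post-training quantization toolchain but does not spell out the recipe. Below is a minimal sketch of what an NVFP4 oneshot run with LLM Compressor typically looks like; the Hugging Face model ID, calibration dataset, sample count, and ignore list are assumptions for illustration, not the maintainer's actual settings.

```python
# Minimal sketch of an NVFP4 post-training quantization run with LLM Compressor.
# Assumptions (not taken from this repo): model ID, calibration dataset,
# sample count, and the modules left unquantized.
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "zai-org/GLM-4.6"  # assumed upstream checkpoint

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# "NVFP4" is the preset scheme for FP4 weights/activations with block-16
# E4M3 scales; the lm_head is commonly kept in higher precision.
recipe = QuantizationModifier(targets="Linear", scheme="NVFP4", ignore=["lm_head"])

# FP4 activation scales need calibration data; dataset name and
# num_calibration_samples here are placeholders.
oneshot(
    model=model,
    dataset="open_platypus",
    recipe=recipe,
    max_seq_length=2048,
    num_calibration_samples=512,
)

SAVE_DIR = "GLM-4.6-NVFP4"
model.save_pretrained(SAVE_DIR, save_compressed=True)
tokenizer.save_pretrained(SAVE_DIR)
```

The saved compressed-tensors checkpoint should then be loadable by an NVFP4-aware runtime on the Blackwell-class hardware listed in the table.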