mratsim committed on
Commit e53f204 · verified · 1 Parent(s): ecd6191

copy-paste woes - NVFP4A16 can be run without hardware NVFP4

Files changed (1)
  1. README.md +0 -5
README.md CHANGED
@@ -35,11 +35,6 @@ NVFP4 writeups:
 
 The model was tested with vLLM + 1x or 2x RTX Pro 6000, here is a script suitable for such configuration with 131072 context length.
 
-### Hardware
-
-As of October 2025, this quantized model can only be run on architectures with hardware FP4 support (Blackwell or later).
-Cheaper GPUs with 24GB of VRAM (RTX 5080 Super) that can run this model in pairs are expected in Q1 2026.
-
 ### Recommendations
 
 It is however recommended to use only 65K context to avoid significant degradation (https://fiction.live/stories/Fiction-liveBench-Sept-29-2025/oQdzQvKHw8JyXbN87)
 