FP8 weights
#41
by getfit · opened
Could you push an FP8 release? It looks like llmcompressor does not support the architecture yet.
Has anyone gotten this to convert?
@getfit
: Thanks for your question! We used the llmcompressor recipe to create the FP8 checkpoint for Maverick here: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8.
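For reference, llm-compressor quantization recipes for FP8 checkpoints of this kind are typically expressed as a YAML recipe like the sketch below. This is an illustration only, not the recipe Meta used; the `ignore` list and `FP8_DYNAMIC` scheme are assumptions based on common llm-compressor usage:

```yaml
# Sketch of an llm-compressor FP8 dynamic quantization recipe (assumed, not Meta's actual recipe)
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      targets: ["Linear"]       # quantize linear layers
      ignore: ["lm_head"]       # commonly left in higher precision
      scheme: FP8_DYNAMIC       # FP8 weights, dynamic per-token activation scales
```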
We'll confirm with the team on an ETA for adding FP8 and INT4 checkpoints for Scout. cc:
@wukaixingxp
@Hamid-Nazeri
@yecharlotteqi Are there any updates on this?
Just found this model for anyone looking
https://huggingface.co/RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic
It should work with vLLM, but I haven't tested it yet.
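If the checkpoint follows vLLM's FP8 conventions, serving it should only require pointing vLLM at the repo. A sketch (untested, per the above); the parallelism and context-length flags are assumptions to adjust for your hardware:

```shell
# Sketch: serve the FP8-dynamic Scout checkpoint via vLLM's OpenAI-compatible server.
# --tensor-parallel-size and --max-model-len are assumptions; tune for your GPUs.
vllm serve RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic \
    --tensor-parallel-size 8 \
    --max-model-len 8192
```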