FP8 weights

#41
by getfit - opened

Could you push an FP8 release? It looks like llmcompressor does not support the arch yet.

Has anyone gotten this to convert?

Meta Llama org

@getfit : Thanks for your question! We used the llmcompressor recipe to create the FP8 checkpoint for Maverick here: https://huggingface.co/meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8.
We'll confirm with the team on an ETA for adding FP8 and INT4 for Scout. cc: @wukaixingxp @Hamid-Nazeri

@yecharlotteqi Are there any updates on this?

For anyone looking, I just found this model:

https://huggingface.co/RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic

It should work with vLLM, but I haven't tested it yet.
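If anyone wants to try it, serving that FP8-dynamic checkpoint with vLLM should look roughly like this. This is an untested sketch: the `--tensor-parallel-size` and `--max-model-len` values are assumptions and depend on your GPUs and use case.

```shell
# Install a recent vLLM; Llama 4 and FP8 support require a new release
pip install -U vllm

# Serve the FP8-dynamic checkpoint. Adjust --tensor-parallel-size to
# match your GPU count (8 here is an assumption, not a requirement).
vllm serve RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic \
    --tensor-parallel-size 8 \
    --max-model-len 8192
```

This starts an OpenAI-compatible API server on the default port, so existing OpenAI client code can be pointed at it for a quick smoke test.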
