Quantization type

#11
by a-r-c - opened

Thanks for open-sourcing this; it's a lifesaver for those of us without H-series GPUs. One quick question: is this W8A16 or W8A8?

meituan org

W8A8. The weight quantization granularity follows the block-wise scheme of the original FP8 version: each 128x128 tile is one block.
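For readers new to block-wise granularity, here is a rough sketch of what "one scale per 128x128 block" means for a weight matrix. It is not the code used to produce this checkpoint. It assumes symmetric INT8 with a max-abs scale per block, which the thread does not confirm, and the function names `quantize_blockwise` and `dequantize_blockwise` are hypothetical. Plain Python lists are used only to keep the example dependency-free.

```python
import math


def quantize_blockwise(weight, block=128, qmax=127):
    """Quantize a 2-D list of floats with one scale per (block x block) tile.

    Assumption: symmetric integer quantization, scale = max|w| / qmax per tile.
    Returns (q, scales), where q has the shape of `weight` and scales[i][j]
    is the scale of tile (i, j).
    """
    rows, cols = len(weight), len(weight[0])
    n_br, n_bc = math.ceil(rows / block), math.ceil(cols / block)
    q = [[0] * cols for _ in range(rows)]
    scales = [[1.0] * n_bc for _ in range(n_br)]
    for bi in range(n_br):
        for bj in range(n_bc):
            # Index ranges of this tile; edge tiles may be smaller than `block`.
            r_range = range(bi * block, min((bi + 1) * block, rows))
            c_range = range(bj * block, min((bj + 1) * block, cols))
            amax = max(abs(weight[r][c]) for r in r_range for c in c_range)
            scale = amax / qmax if amax > 0 else 1.0  # avoid division by zero
            scales[bi][bj] = scale
            for r in r_range:
                for c in c_range:
                    v = round(weight[r][c] / scale)
                    q[r][c] = max(-qmax, min(qmax, v))  # clamp to the int range
    return q, scales


def dequantize_blockwise(q, scales, block=128):
    """Recover approximate floats: each value times the scale of its tile."""
    return [
        [q[r][c] * scales[r // block][c // block] for c in range(len(q[0]))]
        for r in range(len(q))
    ]
```

With `block=128`, a 256x384 weight matrix would carry a 2x3 grid of scales. In W8A8 the activations are quantized to 8 bits as well; that part is not shown here, since the reply only describes the weight granularity.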

pkumc changed discussion status to closed
