Quantization question

#1
by cruzanstx - opened

What were the specs of the machine you ran the quantization script on?

Also, you mentioned modifying the code for calculate_offload_device_map. Can you elaborate?

I was using a machine with 8xA100 80GB and 2TB system RAM.

Somewhere in calculate_offload_device_map I limited the RAM allocation per GPU to 50-60%.
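
For anyone wanting to try something similar, here is a rough sketch of the idea of capping each GPU's budget at a fraction of its total memory. It is not the actual change made inside calculate_offload_device_map; the helper name `capped_max_memory`, the 0.5 fraction, and treating 80 GB and 2 TB as 80 GiB and 2048 GiB are all assumptions for illustration. The result is a plain dict keyed by GPU index plus "cpu", which is the general shape of a `max_memory` mapping used when building a device map.

```python
GIB = 1024 ** 3


def capped_max_memory(gpu_total_bytes, fraction=0.55, cpu_bytes=None):
    """Return a per-device memory budget with each GPU capped at `fraction`.

    Hypothetical helper: gpu_total_bytes is a list of total memory per GPU,
    in bytes, indexed by GPU id.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    # Cap every GPU at the same fraction of its total memory.
    max_memory = {
        idx: int(total * fraction) for idx, total in enumerate(gpu_total_bytes)
    }

    # Optionally add a CPU budget so the remaining weights can be offloaded.
    if cpu_bytes is not None:
        max_memory["cpu"] = int(cpu_bytes)
    return max_memory


# Assumed numbers loosely matching the machine described above:
# 8 GPUs with 80 GiB each, capped at 50%, and 2048 GiB of system RAM.
budget = capped_max_memory([80 * GIB] * 8, fraction=0.5, cpu_bytes=2048 * GIB)
```

Leaving part of each GPU unreserved keeps headroom for activations and calibration data during quantization, which is presumably why a cap in the 50-60% range helped here.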
