Constant 429 rate-limit errors when running the provided code with vLLM
#35
by
vchalmel-naomis
- opened
Hi! I tried to deploy this model with vLLM using the provided code sample, and almost from the start (after 3 or 4 tries) I am consistently blocked by HTTP error 429.
Is there any way to check when the rate limit expires, and are there precautions I can take to avoid hitting it?
Also, to avoid this kind of obstacle in the future, what would you recommend for downloading the model (ultimately inside a Docker container) while hitting the Hub as little as possible?
Could this be because it's a gated repo? If so, how do I properly authenticate to the Hub before using vLLM?
Edit: I added my token and got access to the repo, but I still get 429 errors with vLLM, with git clone, and when downloading the model with the huggingface hub client.
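For anyone hitting the same thing, here is a minimal sketch of one way to approach the download question: fetch the model once into a local directory with `huggingface_hub.snapshot_download`, backing off whenever the Hub answers 429. The helper names `retry_on_rate_limit` and `download_model`, the retry counts and delays, and reading the token from the `HF_TOKEN` environment variable are all assumptions for illustration, not something taken from the model card or the vLLM docs.

```python
import os
import time


def retry_on_rate_limit(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on an HTTP 429 error, wait and retry (hypothetical helper)."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as err:
            # Hub HTTP errors carry the HTTP response; other errors do not.
            response = getattr(err, "response", None)
            status = getattr(response, "status_code", None)
            if status != 429 or attempt == max_attempts - 1:
                raise
            headers = getattr(response, "headers", None) or {}
            try:
                # Use Retry-After when the server sends it as a number of seconds.
                delay = float(headers.get("Retry-After", ""))
            except (TypeError, ValueError):
                # Otherwise fall back to exponential backoff: 1s, 2s, 4s, ...
                delay = base_delay * 2 ** attempt
            sleep(delay)


def download_model(repo_id, local_dir):
    """Download a (possibly gated) repo once into local_dir (hypothetical helper)."""
    # Imported lazily so the retry helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    token = os.environ.get("HF_TOKEN")  # assumption: token supplied via environment
    return retry_on_rate_limit(
        lambda: snapshot_download(repo_id=repo_id, local_dir=local_dir, token=token)
    )
```

The idea would be to run `download_model` once (at image build time or into a mounted volume) and then point vLLM at the local directory instead of the repo ID, so the container does not need to contact the Hub on every start. Setting `HF_HUB_OFFLINE=1` in the container should help enforce that, though I have not verified it against this particular model.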