Constant rate-limit (HTTP 429) errors when running the provided code with vLLM

#35
by vchalmel-naomis - opened

Hi! I've tried to deploy this model with vLLM using the provided code sample, and almost from the start (after 3 or 4 attempts) I'm always blocked by HTTP error 429.

Is there any way to check when the rate limit expires, and are there precautions I can take to avoid hitting it?
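To make the question concrete, here is a minimal sketch of the kind of retry logic I have in mind. It assumes the server sends a `Retry-After` header with its 429 responses, which I have not confirmed for the Hub; `RateLimited`, `retry_delay` and `call_with_backoff` are hypothetical names of my own, not part of vLLM or huggingface_hub.

```python
import time


class RateLimited(Exception):
    """Hypothetical exception standing in for an HTTP 429 response."""

    def __init__(self, retry_after=None):
        super().__init__("HTTP 429")
        # Raw value of the Retry-After header, if the server sent one.
        self.retry_after = retry_after


def retry_delay(attempt, retry_after=None, base=2.0, cap=300.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        try:
            # Retry-After given as a number of seconds.
            return min(float(retry_after), cap)
        except ValueError:
            # Retry-After may also be an HTTP date; fall back to backoff.
            pass
    # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
    return min(base * (2 ** attempt), cap)


def call_with_backoff(fn, max_attempts=5, sleep=time.sleep):
    """Call `fn`, waiting and retrying whenever it raises RateLimited."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            sleep(retry_delay(attempt, err.retry_after))
```

The download call would be wrapped in a function that raises `RateLimited` when the response status is 429, so that repeated launches do not hammer the Hub while the limit is still active.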

Also, to avoid this kind of obstacle in the future, what would you recommend for downloading the model (ultimately inside a Docker container) while hitting the Hub as little as possible?
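For reference, this is roughly what I am considering: download the repo once into a local directory, then run everything from that directory with the Hub disabled. This is only a sketch. `snapshot_download` and the `HF_HUB_OFFLINE` environment variable come from huggingface_hub, while `download_model`, `offline_env`, the repo id and the target path are placeholders of my own.

```python
import os


def download_model(repo_id, target_dir, token=None):
    """Download a full snapshot of `repo_id` into `target_dir`, once."""
    # Imported lazily so this module loads even where huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=target_dir, token=token)


def offline_env(base_env=None):
    """Return a copy of the environment with Hub access disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # HF_HUB_OFFLINE=1 tells huggingface_hub not to make network calls.
    env["HF_HUB_OFFLINE"] = "1"
    return env


if __name__ == "__main__":
    # Placeholder repo id and path; replace with the real ones.
    path = download_model("org/model-name", "/models/model-name",
                          token=os.environ.get("HF_TOKEN"))
    print("Model files are in", path)
```

The idea would be to run this once (at image build time, or on the host with the directory mounted into the container) and then pass the local path to vLLM as the model instead of the repo id, so that container restarts never contact the Hub.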

Could this be because it's a gated repo? If so, how do I properly authenticate to the Hub before using vLLM?
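My understanding is that authentication would look something like the sketch below, run before starting vLLM. `login` and `whoami` are huggingface_hub functions, and `HF_TOKEN` is the environment variable that library reads; `resolve_token` and `authenticate` are hypothetical helpers of my own, and I am assuming vLLM picks up the same credentials because it downloads through huggingface_hub.

```python
import os


def resolve_token(env=None):
    """Read the access token from HF_TOKEN, failing loudly if it is missing."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if not token:
        raise RuntimeError("HF_TOKEN is not set")
    return token


def authenticate(env=None):
    """Log in to the Hub and return the account name the token belongs to."""
    # Imported lazily so this module loads even where huggingface_hub is absent.
    from huggingface_hub import login, whoami

    token = resolve_token(env)
    login(token=token)  # Stores the token for later huggingface_hub calls.
    return whoami(token=token)["name"]


if __name__ == "__main__":
    print("Authenticated as", authenticate())
```

Checking the account name with `whoami` at least confirms that the token is valid before any large download starts. Access to a gated repo also has to be granted on the model page for that account.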

Edit: I added my token and got access to the repo, but I still get 429 errors with vLLM, with git clone, and when downloading the model with the huggingface_hub client.
