Run GPT-OSS-120B with just Single A100 (80GB)
#80
by ghostplant · opened
A solution for a single A100 (80GB) to serve either the 20B or the 120B version: Tutel Instruction to Run GptOSS 120B.
How to start an API service locally
You can also use vLLM now: https://github.com/vllm-project/vllm/issues/22290#issuecomment-3165645703
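For reference, once either server exposes an OpenAI-compatible endpoint (vLLM does by default), a minimal client sketch might look like the following. The port, API key placeholder, and model id are assumptions here and should be adjusted to match your actual setup:

```python
# Minimal client sketch for a locally served OpenAI-compatible endpoint.
# Assumptions: the server listens on localhost:8000 and the model id is
# "openai/gpt-oss-120b"; change both to whatever your server reports.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # local server, not api.openai.com
    api_key="EMPTY",                      # placeholder; local servers typically ignore it
)

response = client.chat.completions.create(
    model="openai/gpt-oss-120b",
    messages=[{"role": "user", "content": "Give a one-sentence summary of MoE models."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```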