Deploy gpt-oss models in your own AWS account using vLLM and Tensorfuse
by agam30
Hi all,
We have released a guide to deploy OpenAI's latest gpt-oss models in your own AWS account. What's included:
- An optimized Dockerfile based on the latest vllm-openai:gptoss image, covering both the 20b and 120b models
- Throughput we measured: 240 tokens/sec on 1x H100 with the 20b model, and 200 tokens/sec on 8x H100 with the 120b model
- Both are served with the full 130k-token context length (a minimal client sketch follows this list)
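Once deployed, the vLLM server exposes an OpenAI-compatible API, so you can query it with the standard `openai` Python client. A minimal sketch, assuming a placeholder Tensorfuse endpoint URL and the 20b deployment (substitute your own endpoint and model name):

```python
from openai import OpenAI

# The base URL below is a placeholder -- use the endpoint URL that
# Tensorfuse gives you after the deployment finishes.
client = OpenAI(
    base_url="https://<your-tensorfuse-endpoint>/v1",
    api_key="EMPTY",  # vLLM accepts any key unless you configure auth
)

response = client.chat.completions.create(
    model="openai/gpt-oss-20b",  # or openai/gpt-oss-120b, matching your deployment
    messages=[{"role": "user", "content": "Explain KV caching in one paragraph."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```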
Follow the guide to run it in your AWS account: https://tensorfuse.io/docs/guides/modality/text/openai_oss
Get started with Tensorfuse here: https://app.tensorfuse.io/
It would be awesome to see metrics on consumer hardware as well.
Hey @agam30, thanks for writing the guide! Feel free to open a PR to add it here: https://github.com/openai/gpt-oss/blob/main/awesome-gpt-oss.md
Thanks for reaching out!
Just opened the PR: https://github.com/openai/gpt-oss/pull/118
Could you get it approved?