Added working vLLM offline-serve code.

#107
by hrithiksagar-tih - opened

In this commit, I have attached working inference code for the gpt-oss-20b model via vLLM. The original code in the cookbook (https://cookbook.openai.com/articles/gpt-oss/run-vllm) was not working; with a few modifications, it now runs.
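For reference, here is a minimal sketch of the offline (no-server) inference path this kind of fix targets. It assumes the standard vLLM offline API (`LLM`, `SamplingParams`, and `LLM.chat`) and the `openai/gpt-oss-20b` checkpoint; the model name and sampling settings are illustrative, not necessarily the exact code in this PR:

```python
# Minimal vLLM offline inference sketch for gpt-oss-20b.
# Assumes a vLLM build with gpt-oss support; settings below are
# illustrative and may differ from the code attached in this PR.
from vllm import LLM, SamplingParams

def main():
    # Loads the model locally (downloads from the HF Hub on first run).
    llm = LLM(model="openai/gpt-oss-20b")

    sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

    # Offline chat-style generation, without starting an API server.
    messages = [
        {"role": "user", "content": "Explain what vLLM offline inference is."},
    ]
    outputs = llm.chat(messages, sampling)

    for output in outputs:
        print(output.outputs[0].text)

if __name__ == "__main__":
    main()
```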

