Added working vLLM offline serve code.
#107 opened by hrithiksagar-tih
In this commit, I have attached working inference code for the gpt-oss-20b model via vLLM. The original code in the cookbook (https://cookbook.openai.com/articles/gpt-oss/run-vllm) was not working; with a few modifications, it now runs.
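
For context, here is a minimal sketch of what offline (non-server) inference with vLLM looks like, assuming the standard `vllm.LLM` API and the `openai/gpt-oss-20b` Hugging Face model ID. This is not the author's attached fix, just an illustration of the offline-serve pattern the commit targets; the sampling parameters are placeholder assumptions.

```python
# Minimal vLLM offline inference sketch (not the attached fix).
# Assumes a vLLM version with gpt-oss support and enough GPU memory
# to load openai/gpt-oss-20b.
from vllm import LLM, SamplingParams

# Load the model once; vLLM manages batching and KV-cache internally.
llm = LLM(model="openai/gpt-oss-20b")

# Placeholder sampling settings; tune for your use case.
sampling = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=256)

prompts = ["Explain what offline serving in vLLM means, in one sentence."]
outputs = llm.generate(prompts, sampling)

for out in outputs:
    # Each RequestOutput holds one or more completions; print the first.
    print(out.outputs[0].text)
```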