Serving with TGI or vLLM?

1
#3 opened almost 2 years ago by kno10

only use one gpu?

2
#2 opened almost 2 years ago by jgbrblmd

persist dequantized model

1
#1 opened almost 2 years ago by nudelbrot