Chris Scott
getfit
AI & ML interests
None yet
Recent Activity
new activity
1 day ago
rednote-hilab/dots.llm1.inst:Yarn context
liked
a model
10 days ago
ResembleAI/chatterbox
updated
a model
12 days ago
getfit/orpheus-3b-0.1-ft-FP8-Dynamic
Organizations
None yet
getfit's activity
Yarn context
#3 opened 1 day ago
by
getfit

Quantization question
3
#1 opened 26 days ago
by
cruzanstx
I get errors trying to deploy this in vllm or sglang.
6
3
#1 opened 27 days ago
by
getfit

VLLM, SGLANG
4
#1 opened about 1 month ago
by
getfit

How was this made? Quant configuration? Have you deployed this with SGLANG or vllm ?
1
#1 opened about 1 month ago
by
getfit

Slow inference on vLLM
3
#1 opened about 1 month ago
by
hp1337
Larger version?
1
#2 opened 2 months ago
by
Carthin
FP8 weights
4
#41 opened 2 months ago
by
getfit

Thank you!, Is it possible to run this with vLLM or sglang ?
5
#18 opened 2 months ago
by
getfit

No one with a consumer grade GPU (< 32 vram) can run the lower L4 model...
10
13
#20 opened 2 months ago
by
UniversalLove333
missing opening <think>
20
#4 opened 3 months ago
by
getfit

Output repeating
1
29
#1 opened 3 months ago
by
getfit

32b Coder
8
8
#5 opened 9 months ago
by
sm54
There's a HUGE drop in popular knowledge from v2 to v2.5.
14
28
#1 opened 9 months ago
by
phil111