
sbintuitions/sarashina2.2-3b-instruct-v0.1
Text Generation • 18.3k • 19
Note 2025-03: This is a 3B model, but they claim its performance is comparable to 7B-class models, so let's see: https://www.sbintuitions.co.jp/blog/entry/2025/03/07/093143
Note 2025-02: 4096 context window; test with `VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 vllm serve llm-jp/llm-jp-3-7.2b-instruct3 --max-model-len 8192 --rope-scaling '{"rope_type":"dynamic","factor":2.0}'`
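
The numbers in that command appear to be related: a 4096 native window times a rope scaling factor of 2.0 gives the 8192 passed to `--max-model-len`. A minimal sketch that assembles the same command from those two values follows. The helper `build_serve_command` is hypothetical, and treating the maximum length as exactly window times factor is an assumption that matches the note, not a documented vLLM rule. It only builds the string; it does not start a server or say anything about output quality at the longer length.

```python
import json
import shlex

NATIVE_CONTEXT = 4096  # context window stated in the note
ROPE_FACTOR = 2.0      # factor from the --rope-scaling argument in the note


def build_serve_command(model: str, native_context: int, factor: float) -> str:
    """Hypothetical helper: rebuild the `vllm serve` command from the note."""
    # Assumption: the extended length is simply native window * factor.
    max_model_len = int(native_context * factor)
    # Compact JSON, same shape as the --rope-scaling value in the note.
    rope_scaling = json.dumps(
        {"rope_type": "dynamic", "factor": factor}, separators=(",", ":")
    )
    args = [
        "vllm", "serve", model,
        "--max-model-len", str(max_model_len),
        "--rope-scaling", rope_scaling,
    ]
    # shlex.join single-quotes the JSON so the shell passes it as one argument.
    return "VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 " + shlex.join(args)


command = build_serve_command(
    "llm-jp/llm-jp-3-7.2b-instruct3", NATIVE_CONTEXT, ROPE_FACTOR
)
print(command)
```

Building the argument list first and quoting it in one step avoids the unbalanced-quote problem that is easy to hit when typing the JSON by hand.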