Arctic-LSTM-Speculator-HyperCLOVAX-SEED-Think-14B
Introduction
This repo contains the LSTM-speculator files for HyperCLOVAX-SEED-Think-14B.
Arctic-LSTM-Speculator-HyperCLOVAX-SEED-Think-14B was trained with ArcticTraining 0.6.0, following the guide.
Model Configuration
- Original model: naver-hyperclovax/HyperCLOVAX-SEED-Think-14B
- Speculator: Arctic-LSTM-Speculator-HyperCLOVAX-SEED-Think-14B
Quickstart
pip install arctic-inference[vllm]
python3 -m vllm.entrypoints.openai.api_server --model=naver-hyperclovax/HyperCLOVAX-SEED-Think-14B --trust_remote_code --port=8000 --speculative-config='{"method": "arctic","model": "K-Compression/Arctic-LSTM-Speculator-HyperCLOVAX-SEED-Think-14B"}'
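The `--speculative-config` flag takes a JSON string. A minimal sketch of building that value programmatically rather than hand-escaping it in the shell (the keys and values are exactly those from the command above):

```python
import json

# Speculative-decoding config for the Arctic speculator, matching the
# --speculative-config value in the launch command above.
spec_config = {
    "method": "arctic",
    "model": "K-Compression/Arctic-LSTM-Speculator-HyperCLOVAX-SEED-Think-14B",
}

# Serialize to the JSON string passed on the vLLM command line.
flag_value = json.dumps(spec_config)
print(flag_value)
```

Once the server is up, it exposes an OpenAI-compatible API on port 8000, so any OpenAI-style client can be pointed at `http://localhost:8000/v1`.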
Performance
We compare the output token throughput (tokens/s) of vLLM-based standard decoding against speculative decoding for HyperCLOVA X SEED 14B Think on a single H100 GPU:
| HyperCLOVA X SEED 14B Think | ShareGPT (tokens/s) |
|---|---|
| No speculation | 84.40 |
| Arctic Speculator | 115.94 (1.4x faster) |
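The speedup figure follows directly from the two throughput numbers in the table; a quick check:

```python
# Throughput numbers from the table above (tokens/s on ShareGPT, single H100).
baseline = 84.40      # standard decoding
speculative = 115.94  # Arctic speculator

speedup = speculative / baseline
print(f"{speedup:.2f}x")  # ~1.37x, reported as 1.4x
```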
License
The model is licensed under the HyperCLOVA X SEED Model License Agreement.