Vola Delta

hf-delta

AI & ML interests

LLM

Recent Activity

updated a dataset 2 months ago

hf-delta/market-data-npz

Organizations

None yet

liked a model 29 days ago

moonshotai/Kimi-Dev-72B

Text Generation • 73B • Updated 28 days ago • 19.2k downloads • 337 likes

reacted to codelion's post with 🔥 about 2 months ago

Introducing AutoThink: Adaptive reasoning for LLMs that improves performance by up to 43% (relative) on reasoning benchmarks!

Instead of using fixed thinking budgets, AutoThink:
- Classifies query complexity (HIGH/LOW) using adaptive classification
- Dynamically allocates thinking tokens based on complexity
- Uses steering vectors derived from Pivotal Token Search to guide reasoning patterns
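
The first two steps above (classify the query, then size the thinking budget) can be sketched roughly as follows. This is a toy illustration, not the optillm implementation: the keyword heuristic stands in for the trained adaptive classifier, and the names `classify_complexity`, `plan_thinking`, `THINKING_BUDGETS`, and `HARD_CUES`, along with the budget values, are all assumptions made for the example. Steering vectors are not covered here.

```python
from dataclasses import dataclass

# Hypothetical budgets; the post does not state the actual values.
THINKING_BUDGETS = {"HIGH": 4096, "LOW": 512}

# Hypothetical cue phrases standing in for a trained adaptive classifier.
HARD_CUES = ("prove", "derive", "step by step", "why", "optimize")


@dataclass
class ThinkingPlan:
    complexity: str        # "HIGH" or "LOW"
    max_think_tokens: int  # cap on reasoning tokens for this query


def classify_complexity(query: str) -> str:
    """Toy stand-in for the classifier: label a query HIGH or LOW."""
    text = query.lower()
    long_query = len(text.split()) > 40
    has_cue = any(cue in text for cue in HARD_CUES)
    return "HIGH" if (long_query or has_cue) else "LOW"


def plan_thinking(query: str) -> ThinkingPlan:
    """Pick a thinking-token budget from the predicted complexity."""
    label = classify_complexity(query)
    return ThinkingPlan(label, THINKING_BUDGETS[label])


print(plan_thinking("What is the capital of France?"))
print(plan_thinking("Prove that the sum of two even numbers is even."))
```

In a real setup the returned budget would presumably cap how many tokens the model may spend inside its reasoning span before it is pushed to answer.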

Results on DeepSeek-R1-Distill-Qwen-1.5B:
- GPQA-Diamond: 31.06% vs 21.72% baseline (+9.34 points)
- MMLU-Pro: 26.38% vs 25.58% baseline (+0.8 points)
- Uses fewer tokens than baseline approaches
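
For readers comparing the headline number with the results above, here is a quick check of absolute versus relative gains, using only the scores quoted in the post. The helper name `gains` is made up for this example. The 43% figure appears to match the relative gain on GPQA-Diamond, while the MMLU-Pro relative gain is much smaller.

```python
def gains(score: float, baseline: float) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    absolute = score - baseline
    relative = 100.0 * absolute / baseline
    return round(absolute, 2), round(relative, 1)


# Scores quoted in the post: (AutoThink, baseline), in percent.
results = {"GPQA-Diamond": (31.06, 21.72), "MMLU-Pro": (26.38, 25.58)}

for name, (score, baseline) in results.items():
    print(name, gains(score, baseline))
# GPQA-Diamond (9.34, 43.0)
# MMLU-Pro (0.8, 3.1)
```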

Works with any local reasoning model: DeepSeek, Qwen, Llama, or custom models. The technique combines our work on Pivotal Token Search (PTS) and on adaptive classification.

Paper: AutoThink: efficient inference for reasoning LLMs
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5253327

Code and examples:
https://github.com/codelion/optillm/tree/main/optillm/autothink

PTS implementation and technical details:
https://github.com/codelion/pts
https://huggingface.co/blog/codelion/pts

Adaptive classifier framework:
https://github.com/codelion/adaptive-classifier

Would love to hear your thoughts on adaptive resource allocation for LLM reasoning! Have you experimented with similar approaches?