Ksenia Se
Kseniase
AI & ML interests
None yet
Recent Activity
reacted to their post 2 days ago
16 new research papers on inference-time scaling:
Over the last couple of weeks, a large number of studies on inference-time scaling have emerged. And it's so cool, because each new paper adds a trick to the toolbox, making LLMs more capable without scaling the models' parameter count.
So here are 13 new methods + 3 comprehensive studies on test-time scaling:
1. https://huggingface.co/papers/2504.02495
Probably the most popular study. It proposes boosting inference-time scalability by improving reward modeling. To enhance performance, DeepSeek-GRM uses adaptive critiques, parallel sampling, a pointwise generative RM, and Self-Principled Critique Tuning (SPCT) (see the first sketch after this list)
2. https://huggingface.co/papers/2504.04718
Allows small models to use external tools, such as code interpreters and calculators, to enhance self-verification (see the second sketch after this list)
3. https://huggingface.co/papers/2504.00810
Proposes training LLMs on code-based reasoning paths to make test-time scaling more efficient, limiting unnecessary tokens with a special dataset and a Shifted Thinking Window
4. https://huggingface.co/papers/2504.00891
Introduces GenPRM, a generative PRM that uses CoT reasoning and code verification for step-by-step judgment. With only 23K training examples, GenPRM outperforms prior PRMs and larger models
5. https://huggingface.co/papers/2503.24320
The SWIFT test-time scaling framework improves world models' performance without retraining, using strategies such as fast tokenization, top-K pruning, and efficient beam search
6. https://huggingface.co/papers/2504.07104
Proposes REBEL for scaling RAG systems: it uses multi-criteria optimization with CoT prompting for better performance-speed tradeoffs as inference compute increases (see the third sketch after this list)
7. https://huggingface.co/papers/2503.13288
Proposes the φ-Decoding strategy, which uses foresight sampling, clustering, and adaptive pruning to estimate and select optimal reasoning steps (see the fourth sketch after this list)
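To make these recipes concrete, here are a few minimal sketches in Python. First, the parallel sampling + pointwise reward model loop behind #1, reduced to generic best-of-N: generate_answer and reward_score are hypothetical stand-ins for real model calls, not DeepSeek-GRM's actual API, and the scoring is random just to keep the snippet runnable.

```python
import random

def generate_answer(prompt: str, temperature: float = 1.0) -> str:
    # Stand-in for sampling one answer from an LLM.
    return random.choice(["answer A", "answer B", "answer C"])

def reward_score(prompt: str, candidate: str) -> float:
    # Stand-in for a pointwise generative RM; in DeepSeek-GRM the model
    # first writes principles and a critique, then emits a score.
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    # Inference-time scaling knob: raising n spends more compute
    # without touching the model's parameters.
    candidates = [generate_answer(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: reward_score(prompt, c))

print(best_of_n("What is 17 * 24?"))
```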
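Second, tool-augmented self-verification as in #2, boiled down to its simplest form: the model proposes an answer, emits a checkable snippet, and a code interpreter runs it. solve and emit_check are hypothetical placeholders here; a real system would have the small model write both.

```python
def solve(question: str) -> str:
    # Placeholder for a small model's answer.
    return "408"

def emit_check(question: str, answer: str) -> str:
    # Placeholder: a real system asks the model to write this check itself.
    return f"assert 17 * 24 == {answer}"

def verified_answer(question: str, max_tries: int = 3) -> str | None:
    for _ in range(max_tries):
        ans = solve(question)
        try:
            exec(emit_check(question, ans), {})  # the "calculator"/interpreter tool
            return ans
        except AssertionError:
            continue  # failed verification -> resample
    return None

print(verified_answer("What is 17 * 24?"))
```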
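Third, a multi-criteria reranking loop in the spirit of #6. The criteria and weights below (relevance, recency, diversity) are illustrative assumptions, not REBEL's actual criteria; the paper derives query-specific criteria via CoT prompting, and more inference compute buys larger candidate pools and more criteria.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    text: str
    relevance: float  # e.g., from a cross-encoder
    recency: float    # e.g., normalized document freshness
    diversity: float  # e.g., distance from already-selected passages

def rerank(passages: list[Passage], weights=(0.6, 0.2, 0.2), k: int = 3) -> list[Passage]:
    # Weighted multi-criteria score; tuning weights and k trades quality for speed.
    wr, wt, wd = weights
    score = lambda p: wr * p.relevance + wt * p.recency + wd * p.diversity
    return sorted(passages, key=score, reverse=True)[:k]

docs = [Passage("A", 0.9, 0.1, 0.5), Passage("B", 0.7, 0.9, 0.2), Passage("C", 0.4, 0.5, 0.9)]
print([p.text for p in rerank(docs, k=2)])
```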
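And fourth, the foresight-sampling idea from #7 at the step level: sample several candidate next steps, score each by a short lookahead, cluster duplicates so they pool evidence, prune weak clusters, keep the best. propose_steps and foresight_value are stand-ins for model calls, and the "clustering" here is exact string matching, far cruder than the paper's.

```python
import random
from collections import defaultdict

def propose_steps(partial: list[str], k: int = 6) -> list[str]:
    # Stand-in for sampling k candidate next reasoning steps from an LLM.
    return [f"step-{random.randint(0, 2)}" for _ in range(k)]

def foresight_value(partial: list[str], step: str) -> float:
    # Stand-in for a short rollout estimating how promising
    # the continuation after `step` looks.
    return random.random()

def select_step(partial: list[str]) -> str:
    candidates = propose_steps(partial)
    clusters: dict[str, list[float]] = defaultdict(list)
    for s in candidates:  # identical candidates share a cluster
        clusters[s].append(foresight_value(partial, s))
    scored = {s: sum(v) / len(v) for s, v in clusters.items()}
    # Adaptive pruning: keep only the top half of clusters (at least one).
    cutoff = sorted(scored.values(), reverse=True)[: max(1, len(scored) // 2)][-1]
    pruned = {s: v for s, v in scored.items() if v >= cutoff}
    return max(pruned, key=pruned.get)

print(select_step(["parse the question"]))
```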
Read further below 👇
Also, subscribe to the Turing Post https://www.turingpost.com/subscribe
replied to their post 3 days ago
Kseniase's activity
- Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What's Really Changing in Transformers?
- What is Qwen-Agent framework? Inside the Qwen family
- 🌁#92: Fight for Developers and the Year of Orchestration
- 🦸🏻#14: What Is MCP, and Why Is Everyone – Suddenly! – Talking About It?
- 🌁#90: Why AI's Reasoning Tests Keep Failing Us (published about 1 month ago)
- 🦸🏻#13: Action! How AI Agents Execute Tasks with UI and API Tools (published about 1 month ago)
- 🦸🏻#12: How Do Agents Learn from Their Own Mistakes? The Role of Reflection in AI (published about 1 month ago)
- Everything You Need to Know about Knowledge Distillation (published about 1 month ago)
- 🌁#89: AI in Action: How AI Engineers, Self-Optimizing Models, and Humanoid Robots Are Reshaping 2025 (published about 2 months ago)
- 🌁#88: Can DeepSeek Inspire Global Collaboration? (published about 2 months ago)
- 🦸🏻#10: Does Present-Day GenAI Actually Reason? (published about 2 months ago)
- Topic 27: What are Chain-of-Agents and Chain-of-RAG?
- 🌁#87: Why DeepResearch Should Be Your New Hire