Kai Zuberbühler

kaizuberbuehler · k-zubi

AI & ML interests

language models, agents, image generation, music generation

Recent Activity

updated a Space 23 days ago

kaizuberbuehler/ai-progress-charts

updated a collection about 1 month ago

Reasoning, Thinking, RL and Test-Time Scaling

updated a collection about 1 month ago

LM Capabilities and Scaling

Organizations

None yet

upvoted 20 papers about 1 month ago

Rethinking Reflection in Pre-Training

Paper • 2504.04022 • Published Apr 5 • 80

Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought

Paper • 2504.05599 • Published Apr 8 • 86

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2 • 86

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7 • 109

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published Apr 8 • 111

Kimi-VL Technical Report

Paper • 2504.07491 • Published Apr 10 • 134

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Paper • 2504.06263 • Published Apr 8 • 181

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published Apr 7 • 197

MedAgent-Pro: Towards Multi-modal Evidence-based Medical Diagnosis via Reasoning Agentic Workflow

Paper • 2503.18968 • Published Mar 21 • 7

VerifiAgent: a Unified Verification Agent in Language Model Reasoning

Paper • 2504.00406 • Published Apr 1 • 9

Reasoning-SQL: Reinforcement Learning with SQL Tailored Partial Rewards for Reasoning-Enhanced Text-to-SQL

Paper • 2503.23157 • Published Mar 29 • 11

Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead

Paper • 2504.00294 • Published Mar 31 • 11

m1: Unleash the Potential of Test-Time Scaling for Medical Reasoning with Large Language Models

Paper • 2504.00869 • Published Apr 1 • 11

Scaling Laws in Scientific Discovery with AI and Robot Scientists

Paper • 2503.22444 • Published Mar 28 • 13

ActionStudio: A Lightweight Framework for Data and Training of Large Action Models

Paper • 2503.22673 • Published Mar 28 • 13

DASH: Detection and Assessment of Systematic Hallucinations of VLMs

Paper • 2503.23573 • Published Mar 30 • 13

Interpreting Emergent Planning in Model-Free Reinforcement Learning

Paper • 2504.01871 • Published Apr 2 • 13

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Paper • 2504.00891 • Published Apr 1 • 14

When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning

Paper • 2504.01005 • Published Apr 1 • 16

OpenCodeReasoning: Advancing Data Distillation for Competitive Coding

Paper • 2504.01943 • Published Apr 2 • 16