Fynn Kröger

fynnkroeger

AI & ML interests

None yet

Recent Activity

reacted to cogwheelhead's post with 👍 2 months ago

Me and my team have performed an in-depth investigation comparing o1 to R1 (and other reasoning models).

Link: https://toloka.ai/blog/r1-is-not-on-par-with-o1-and-the-difference-is-qualitative-not-quantitative

It started with us evaluating them on our own university-math benchmarks: U-MATH for problem-solving and μ-MATH for judging solution correctness (see the HF leaderboard: https://huggingface.co/spaces/toloka/u-math-leaderboard).

tl;dr: R1 sure is amazing, but what we find is that it lags behind in novelty adaptation and reliability:

* performance drops when updating benchmarks with fresh unseen tasks (e.g. AIME 2024 -> 2025)
* R1-o1 gap widens when evaluating niche subdomains (e.g. university-specific math instead of the more common Olympiad-style contests)
* same with going into altogether unconventional domains (e.g. chess) or skills (e.g. judgment instead of problem-solving)
* R1 also runs into failure modes way more often (e.g. making illegal chess moves or falling into endless generation loops)

Our point here is not to bash on DeepSeek: they've done exceptional work, R1 is a game-changer, and we have no intention to downplay that. R1's release is a perfect opportunity to study where all these models differ and gain understanding on how to move forward from here.

Organizations

None yet

fynnkroeger's activity

upvoted a paper 19 days ago

DDT: Decoupled Diffusion Transformer

Paper • 2504.05741 • Published 21 days ago • 73

upvoted a paper about 1 month ago

Defeating Prompt Injections by Design

Paper • 2503.18813 • Published Mar 24 • 20

upvoted a paper 2 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 143

upvoted a paper 5 months ago

Multimodal Autoregressive Pre-training of Large Vision Encoders

Paper • 2411.14402 • Published Nov 21, 2024 • 47

upvoted 3 papers 6 months ago

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

Paper • 2411.07126 • Published Nov 11, 2024 • 31

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 95

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

Paper • 2410.13848 • Published Oct 17, 2024 • 35

upvoted 3 papers 7 months ago

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 43

MaskBit: Embedding-free Image Generation via Bit Tokens

Paper • 2409.16211 • Published Sep 24, 2024 • 17

Kolmogorov-Arnold Transformer

Paper • 2409.10594 • Published Sep 16, 2024 • 46

upvoted 8 papers 8 months ago

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3, 2024 • 80

VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters

Paper • 2408.17253 • Published Aug 30, 2024 • 40

Law of Vision Representation in MLLMs

Paper • 2408.16357 • Published Aug 29, 2024 • 96

Scalable Autoregressive Image Generation with Mamba

Paper • 2408.12245 • Published Aug 22, 2024 • 27

Towards Conversational Diagnostic AI

Paper • 2401.05654 • Published Jan 11, 2024 • 20

MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning

Paper • 2408.11001 • Published Aug 20, 2024 • 12

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Paper • 2408.11039 • Published Aug 20, 2024 • 61

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

Paper • 2408.08459 • Published Aug 15, 2024 • 46

upvoted 2 papers 9 months ago

POA: Pre-training Once for Models of All Sizes

Paper • 2408.01031 • Published Aug 2, 2024 • 29

The Llama 3 Herd of Models

Paper • 2407.21783 • Published Jul 31, 2024 • 115