Phi PRO

Xalphinions

AI & ML interests

None yet

Recent Activity

liked a model 10 days ago

miromind-ai/MiroThinker-v1.0-72B

commented on a paper 26 days ago

The Principles of Diffusion Models

upvoted a paper 7 days ago

OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe

Paper • 2511.16334 • Published 10 days ago • 86

upvoted a paper 26 days ago

The Principles of Diffusion Models

Paper • 2510.21890 • Published Oct 24 • 58

upvoted a paper about 2 months ago

First Try Matters: Revisiting the Role of Reflection in Reasoning Models

Paper • 2510.08308 • Published Oct 9 • 24

upvoted 2 papers 3 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

Paper • 2508.14896 • Published Aug 20 • 22

upvoted a paper 4 months ago

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 133

upvoted 4 papers 6 months ago

Lingshu: A Generalist Foundation Model for Unified Multimodal Medical Understanding and Reasoning

Paper • 2506.07044 • Published Jun 8 • 113

SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training

Paper • 2506.05301 • Published Jun 5 • 56

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

Paper • 2506.01939 • Published Jun 2 • 185

Satori-SWE: Evolutionary Test-Time Scaling for Sample-Efficient Software Engineering

Paper • 2505.23604 • Published May 29 • 23

upvoted 2 papers 10 months ago

Fast Video Generation with Sliding Tile Attention

Paper • 2502.04507 • Published Feb 6 • 51

s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 124

upvoted 2 papers 12 months ago

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Paper • 2412.14161 • Published Dec 18, 2024 • 51

Reverse Thinking Makes LLMs Stronger Reasoners

Paper • 2411.19865 • Published Nov 29, 2024 • 23

upvoted 6 papers about 1 year ago

Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs

Paper • 2410.18451 • Published Oct 24, 2024 • 20

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 93

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

Paper • 2410.12787 • Published Oct 16, 2024 • 31

SimBa: Simplicity Bias for Scaling Up Parameters in Deep Reinforcement Learning

Paper • 2410.09754 • Published Oct 13, 2024 • 8

Pixtral 12B

Paper • 2410.07073 • Published Oct 9, 2024 • 68

Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis

Paper • 2409.20059 • Published Sep 30, 2024 • 17