Inui

Norm

https://normxu.github.io/

AI & ML interests

Video Diffusion; Large Language Model; Object Detection; OCR

Recent Activity

upvoted a paper about 3 hours ago

LongCat-Flash-Thinking-2601 Technical Report

liked a model 10 days ago

meituan-longcat/LongCat-Flash-Thinking-2601

liked a dataset 28 days ago

wsdwJohn1231/DreamLIP_capion_csv_w_key

upvoted a paper about 3 hours ago

LongCat-Flash-Thinking-2601 Technical Report

Paper • 2601.16725 • Published 3 days ago • 130

upvoted 2 papers 3 months ago

Revisiting Multimodal Positional Encoding in Vision-Language Models

Paper • 2510.23095 • Published Oct 27, 2025 • 21

LongCat-Flash-Omni Technical Report

Paper • 2511.00279 • Published Oct 31, 2025 • 24

upvoted a paper 4 months ago

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 507

upvoted 2 papers 5 months ago

VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26, 2025 • 143

Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models

Paper • 2508.09138 • Published Aug 12, 2025 • 37

upvoted 3 papers 6 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 316

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

Paper • 2507.19457 • Published Jul 25, 2025 • 29

Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding

Paper • 2507.19427 • Published Jul 25, 2025 • 19

upvoted 4 papers 7 months ago

SingLoRA: Low Rank Adaptation Using a Single Matrix

Paper • 2507.05566 • Published Jul 8, 2025 • 115

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Paper • 2506.16406 • Published Jun 19, 2025 • 130

Stream-Omni: Simultaneous Multimodal Interactions with Large Language-Vision-Speech Model

Paper • 2506.13642 • Published Jun 16, 2025 • 27

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16, 2025 • 273

upvoted 4 papers 8 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9, 2025 • 263

UniWorld: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation

Paper • 2506.03147 • Published Jun 3, 2025 • 58

One-shot Entropy Minimization

Paper • 2505.20282 • Published May 26, 2025 • 6

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 97

upvoted 3 papers 9 months ago

BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset

Paper • 2505.09568 • Published May 14, 2025 • 99

Flow-GRPO: Training Flow Matching Models via Online RL

Paper • 2505.05470 • Published May 8, 2025 • 87

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Paper • 2505.00703 • Published May 1, 2025 • 44