Rui-Jie Zhu

ridger

AI & ML interests

None yet

Recent Activity

upvoted a collection 1 day ago

Nemotron-Cascade 2

Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • 4 items • Updated 1 day ago • 37

liked a dataset 11 days ago

stepfun-ai/Step-3.5-Flash-SFT

Viewer • Updated 12 days ago • 1.62M • 42.3k • 283

upvoted a collection 20 days ago

Qwen3.5

21 items • Updated 17 days ago • 1.32k

upvoted a paper 20 days ago

Helios: Real Real-Time Long Video Generation Model

Paper • 2603.04379 • Published 22 days ago • 176

liked a model 23 days ago

kernels-community/causal-conv1d

Updated about 14 hours ago • 745 • 3

New activity in ByteDance/Ouro-1.4B-Thinking 28 days ago

Update rope embeddings for rope_type='default'

#3 opened about 1 month ago by

New activity in ByteDance/Ouro-2.6B-Thinking 28 days ago

Updated ids for bos_id, eos_id

#4 opened about 1 month ago by

Added 'pad_token_id'.

#5 opened about 1 month ago by

rope_type='default' excluded from ROPE_INIT_FUNCTIONS in transformers >=5.0

#6 opened about 1 month ago by

Fix bos/eos token IDs + add enable_thinking to chat template

#7 opened about 1 month ago by

Fix UniversalTransformerCache.get_mask_sizes for batched generation

#8 opened about 1 month ago by

New activity in ByteDance/Ouro-1.4B-Thinking 28 days ago

Fix bos/eos token IDs + add enable_thinking to chat template

#4 opened about 1 month ago by

Fix UniversalTransformerCache.get_mask_sizes for batched generation

#5 opened about 1 month ago by

authored a paper about 2 months ago

LoopViT: Scaling Visual ARC with Looped Transformers

Paper • 2602.02156 • Published Feb 2 • 12

upvoted 5 papers about 2 months ago

Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation

Paper • 2602.03619 • Published Feb 3 • 27

LoopViT: Scaling Visual ARC with Looped Transformers

Paper • 2602.02156 • Published Feb 2 • 12

Kimi K2.5: Visual Agentic Intelligence

Paper • 2602.02276 • Published Feb 2 • 259

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

Paper • 2601.21420 • Published Jan 29 • 42

DeepSeek-OCR 2: Visual Causal Flow

Paper • 2601.20552 • Published Jan 28 • 66

New activity in ByteDance/Ouro-1.4B 2 months ago

Different definitions of eos token id in config.json and tokenizer_config.json

#7 opened 3 months ago by