TreePO - a m-a-p Collection

TreePO

Updated 3 days ago

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling
Paper • 2508.17445 • Published 14 days ago • 78

m-a-p/TreePO-Qwen2.5-7B
8B • Updated 8 days ago • 14 • 2

m-a-p/TreePO_data
Viewer • Updated 8 days ago • 49.3k • 110

m-a-p/TreePO-Qwen2.5-7B_fixed-div
8B • Updated 8 days ago • 16

m-a-p/TreePO-Qwen2.5-7B_GRPO-TreePO-Sampling
8B • Updated 3 days ago • 9

m-a-p/TreePO-Qwen2.5-7B_Low_Prob_Encourage
8B • Updated 3 days ago • 8

m-a-p/TreePO-Qwen2.5-7B_Naive2Low_Scheduler
8B • Updated 3 days ago • 7