InnoMegrez
AI & ML interests
SII is an institution dedicated to innovation in education and research
InnoSpark
What makes a base language model suitable for RL? Through controlled experiments, we identify the key factors, then leverage them to scale up mid-training.
- sii-research/OctoThinker-8B-Long-Base
  Text Generation • 8B • Updated • 21 • 1
- sii-research/OctoThinker-8B-Hybrid-Base
  Text Generation • 8B • Updated • 15
- sii-research/OctoThinker-8B-Short-Base
  Text Generation • 8B • Updated • 13
- OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling
  Paper • 2506.20512 • Published • 46
DigitalGene