Adaptive Length Penalty
Collection
Teaching language models to think efficiently with Adaptive Length Penalty (ALP)
DeepScaleR-1.5B trained with Adaptive Length Penalty (ALP), which reduces token usage by roughly 50% while maintaining performance.
# Prompt format: append the step-by-step instruction to each problem
prompt = f"{problem} Let's think step by step and output the final answer within \\boxed{{}}."
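A minimal generation sketch with the standard transformers API is shown below; the repository id is a placeholder (substitute the actual checkpoint from this collection), and the sampling settings are illustrative rather than the authors' recommended configuration.

```python
# Minimal sketch: querying the ALP-trained model with the prompt format above.
# Assumptions: the repo id is a placeholder, and sampling parameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<org>/DeepScaleR-1.5B-ALP"  # placeholder, not the official repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

problem = "What is 15% of 240?"
prompt = f"{problem} Let's think step by step and output the final answer within \\boxed{{}}."

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.6)

# Strip the prompt tokens and print only the model's reasoning and boxed answer
completion = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(completion, skip_special_tokens=True))
```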