I've been working on something cool: a GRPO with an LLM evaluator that can also perform SFT on the feedback data, if you want. Check it out. Any [emoji] are more than welcome: https://github.com/mkurman/grpo-llm-evaluator
Addition is All You Need for Energy-efficient Language Models
Paper • 2410.00907 • Published Oct 1, 2024 • 149