Macro-Action RLHF - an ernie-research Collection

updated 4 days ago

[ICLR'25] [MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions](https://openreview.net/forum?id=WWXjMYZxfH)