- Natural Language Reinforcement Learning
  Paper • 2411.14251 • Published • 30
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Value
  Feature Extraction • 8B • Updated • 4
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Policy
  Feature Extraction • 8B • Updated • 5
- Waterhorse/Llama-3.1-8B-Instruct-NLRL-Breakthrough-Value
  Feature Extraction • 8B • Updated • 9
Bo Liu
Benjamin-eecs
AI & ML interests
Reinforcement Learning, Reasoning, Machine Learning Systems
Recent Activity
upvoted a paper 10 days ago
Bootstrapping Task Spaces for Self-Improvement
authored a paper 17 days ago
LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model
upvoted a collection 17 days ago
LLaVA-Critic-R1