Active filters: trl
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-2 • Reinforcement Learning • Updated • 39
dshin/flan-t5-ppo-user-a-batch-size-8-epoch-2 • Reinforcement Learning • Updated • 73
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-2 • Reinforcement Learning • Updated • 40
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-2-use-violation • Reinforcement Learning • Updated • 16
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-2-use-violation • Reinforcement Learning • Updated • 29
dshin/flan-t5-ppo-user-a-batch-size-8-epoch-3 • Reinforcement Learning • Updated • 25
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-2 • Reinforcement Learning • Updated • 43
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-3 • Reinforcement Learning • Updated • 14
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-3-use-violation • Reinforcement Learning • Updated • 44
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-3 • Reinforcement Learning • Updated • 15
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-2-use-violation • Reinforcement Learning • Updated • 14
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-3-use-violation • Reinforcement Learning • Updated • 15
dshin/flan-t5-ppo-user-a-batch-size-8-epoch-4 • Reinforcement Learning • Updated • 17
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-3 • Reinforcement Learning • Updated • 39
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-4 • Reinforcement Learning • Updated • 14
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-4-use-violation • Reinforcement Learning • Updated • 16
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-4 • Reinforcement Learning • Updated • 70
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-3-use-violation • Reinforcement Learning • Updated • 16
dshin/flan-t5-ppo-user-h-batch-size-8-epoch-4-use-violation • Reinforcement Learning • Updated • 15
dshin/flan-t5-ppo-user-f-batch-size-8-epoch-4 • Reinforcement Learning • Updated • 15
dshin/flan-t5-ppo-user-e-batch-size-8-epoch-4-use-violation • Reinforcement Learning • Updated • 16
SummerSigh/T5-Base-Rule-Of-Thumb-RM2 • Reinforcement Learning • 0.2B • Updated • 16
dshin/flan-t5-ppo-user-h-batch-size-64 • Reinforcement Learning • Updated • 70
dshin/flan-t5-ppo-user-f-batch-size-64 • Reinforcement Learning • Updated • 15
dshin/flan-t5-ppo-user-f-batch-size-64-use-violation • Reinforcement Learning • Updated • 16
dshin/flan-t5-ppo-user-h-batch-size-64-use-violation • Reinforcement Learning • Updated • 59
dshin/flan-t5-ppo-user-e-batch-size-64-use-violation • Reinforcement Learning • Updated • 28
dshin/flan-t5-ppo-user-e-batch-size-64 • Reinforcement Learning • Updated • 38
trl-lib/llama-7b-se-peft
Bearnardd/gpt2-imdb • Reinforcement Learning • Updated • 27