Active filters: trl
Evan-Lin/yelp-attractive-1 • Reinforcement Learning • Updated • 5
Evan-Lin/yelp-attractive-3 • Reinforcement Learning • Updated • 5
Evan-Lin/yelp-attractive-2 • Reinforcement Learning • Updated • 5
Evan-Lin/yelp-attractive-4 • Reinforcement Learning • Updated • 5
Evan-Lin/yelp-attractive-keyword-1 • Reinforcement Learning • Updated • 5
Evan-Lin/yelp-attractive-large-1 • Reinforcement Learning • Updated • 5
amirabdullah19852020/pythia-160m_sentiment_reward • Reinforcement Learning • Updated • 6
amirabdullah19852020/pythia-70m_sentiment_reward • Reinforcement Learning • Updated • 6
amirabdullah19852020/pythia-410m_sentiment_reward • Reinforcement Learning • Updated • 5
amirabdullah19852020/pythia-70m_utility_reward • Reinforcement Learning • 0.1B • Updated • 20
amirabdullah19852020/pythia-160m_utility_reward • Reinforcement Learning • Updated • 8
amirabdullah19852020/pythia-410m_utility_reward • Reinforcement Learning • Updated • 5
amirabdullah19852020/gpt-neo-125m_sentiment_reward • Reinforcement Learning • Updated • 1
amirabdullah19852020/gpt-neo-125m_utility_reward • Reinforcement Learning • Updated • 2
amirabdullah19852020/gpt-j-6b-sharded-bf16_sentiment_reward • Reinforcement Learning • Updated
thanhduc1180/llama2chatbot • Text Generation • 4B • Updated • 7
EddyGiusepe/zephyr-support-chatbot
zahid0/flan-t5-base-ppo • Reinforcement Learning • Updated
ARahul2003/lamini_flan_t5_detoxify_rlaif • Text Generation • 0.2B • Updated • 2 • 2
alignment-handbook/zephyr-7b-sft-full • Text Generation • 7B • Updated • 4.4k • 26
alignment-handbook/zephyr-7b-sft-qlora • Updated • 103 • 8
lewtun/zephyr-7b-dpo-full • Text Generation • 7B • Updated • 7
alignment-handbook/zephyr-7b-dpo-full • Text Generation • 7B • Updated • 77 • 3
alignment-handbook/zephyr-7b-dpo-qlora • Updated • 33 • 9
neerajsp23/mistral-finetuned-samsum
llm-wizard/llama2_instruct_generation
worde-byte/finetunemistral • Updated • 16
stuser2023/Llama2-7b-finetuned • Text Generation • 7B • Updated • 3 • 1
Lichang-Chen/zephyr-7b-sft-full • Text Generation • 7B • Updated • 5
llm-wizard/sft_zephyr