Active filters: trl
Evan-Lin/Bart-large-abs-yelp-inferable • Reinforcement Learning • Updated • 39
Evan-Lin/Bart-large-abs-yelp-inferable-2 • Reinforcement Learning • Updated • 11
lvwerra/starcoderbase-gsm8k • Text Generation • 16B • Updated • 40
approach0/mathy-vicuna-13B-FFT-queryLM-adapter • Reinforcement Learning • Updated
Evan-Lin/yelp-attractive-1 • Reinforcement Learning • Updated • 27
Evan-Lin/yelp-attractive-3 • Reinforcement Learning • Updated • 11
Evan-Lin/yelp-attractive-2 • Reinforcement Learning • Updated • 49
Evan-Lin/yelp-attractive-4 • Reinforcement Learning • Updated • 47
Evan-Lin/yelp-attractive-keyword-1 • Reinforcement Learning • Updated • 11
Evan-Lin/yelp-attractive-large-1 • Reinforcement Learning • Updated • 13
amirabdullah19852020/pythia-160m_sentiment_reward • Reinforcement Learning • Updated • 29
amirabdullah19852020/pythia-70m_sentiment_reward • Reinforcement Learning • Updated • 26
amirabdullah19852020/pythia-410m_sentiment_reward • Reinforcement Learning • Updated • 16
amirabdullah19852020/pythia-70m_utility_reward • Reinforcement Learning • 0.1B • Updated • 24
amirabdullah19852020/pythia-160m_utility_reward • Reinforcement Learning • Updated • 21
amirabdullah19852020/pythia-410m_utility_reward • Reinforcement Learning • Updated • 23
amirabdullah19852020/gpt-neo-125m_sentiment_reward • Reinforcement Learning • Updated • 57
amirabdullah19852020/gpt-neo-125m_utility_reward • Reinforcement Learning • Updated • 79
amirabdullah19852020/gpt-j-6b-sharded-bf16_sentiment_reward • Reinforcement Learning • Updated
thanhduc1180/llama2chatbot • Text Generation • 4B • Updated • 13
EddyGiusepe/zephyr-support-chatbot • Updated • 15
zahid0/flan-t5-base-ppo • Reinforcement Learning • Updated
ARahul2003/lamini_flan_t5_detoxify_rlaif • Text Generation • 0.2B • Updated • 18 • 2
alignment-handbook/zephyr-7b-sft-full • Text Generation • 7B • Updated • 4.23k • 26
alignment-handbook/zephyr-7b-sft-qlora • Updated • 860 • 8
lewtun/zephyr-7b-dpo-full • Text Generation • 7B • Updated • 16
alignment-handbook/zephyr-7b-dpo-full • Text Generation • 7B • Updated • 27 • 3
alignment-handbook/zephyr-7b-dpo-qlora • Updated • 30 • 9
neerajsp23/mistral-finetuned-samsum • Updated • 11
llm-wizard/llama2_instruct_generation • Updated • 10