Models: 518

Active filters: RLHF

wangclnlp/GRAM-RR-LLaMA-3.2-3B-RewardModel

Text Generation • 3B • Updated about 1 month ago • 8

mradermacher/GRAM-RR-LLaMA-3.2-3B-RewardModel-GGUF

3B • Updated about 1 month ago • 457

mradermacher/GRAM-RR-LLaMA-3.2-3B-RewardModel-i1-GGUF

3B • Updated about 1 month ago • 694

mradermacher/GRAM-RR-LLaMA-3.1-8B-RewardModel-GGUF

8B • Updated about 1 month ago • 242

mradermacher/GRAM-RR-LLaMA-3.1-8B-RewardModel-i1-GGUF

8B • Updated about 1 month ago • 750

mradermacher/OpenBioLLm-70B-GGUF

71B • Updated 27 days ago • 4.78k

mradermacher/OpenBioLLm-70B-i1-GGUF

71B • Updated 27 days ago • 22.6k

HYDARIM7/SmolLM2_RLHF_PPO_HY

Reinforcement Learning • 0.1B • Updated 13 days ago • 21