Models: 411

Active filters: rlhf

samhitha2601/llama3-gsm8k-critic

3B • Updated Oct 24 • 3

AIResAgTeam/Quantum-LIMIT-Graph-v2.4.0-NSN-level-4-maturity-rust

ziadrone/airesupdated-v6

Text Generation • Updated Nov 5 • 1

Uppaal/gpt2-ProFS-toxicity

Text Generation • 0.4B • Updated 30 days ago • 9

Uppaal/gpt-j-ProFS-toxicity

Text Generation • 6B • Updated 30 days ago • 13

Uppaal/opt-ProFS-toxicity

Text Generation • 7B • Updated 30 days ago • 9

Uppaal/Mistral-ProFS-toxicity

Text Generation • 7B • Updated 30 days ago • 8

Uppaal/Mistral-sft-ProFS-toxicity

Text Generation • 7B • Updated 30 days ago • 7

Uppaal/Mistral-ProFS-safety

Text Generation • 7B • Updated 30 days ago • 17

Uppaal/Mistral-sft-ProFS-safety

Text Generation • 7B • Updated 30 days ago • 10

sodeniZz/llm-course-hw2-dpo

Text Generation • 0.1B • Updated 23 days ago • 63

sodeniZz/llm-course-hw2-reward-model

Text Classification • 0.1B • Updated 23 days ago • 91

sodeniZz/llm-course-hw2-ppo

Text Generation • 0.1B • Updated 23 days ago • 79

ahczhg/qwen3-0.6b-rlhf-cot

Text Generation • Updated 21 days ago • 1

ahczhg/Llama-3.2-1B-Aegis-SFT-DPO

Text Generation • 1B • Updated 21 days ago • 37 • 1

mradermacher/Llama-3.2-1B-Aegis-SFT-DPO-GGUF

1B • Updated 23 days ago • 379

nfsrulesFR/mega-grpo

Text Generation • Updated 16 days ago

TzJ2006/JokeGPT-Model

Updated 10 days ago • 10 • 1

FutureMa/Qwen2.5-7B-Instruct-GRPO-Math

Text Generation • Updated 11 days ago

AhmedSSoliman/medgemma-4b-digital-twin-v1

Updated 3 days ago

AhmedSSoliman/gpt-oss-20b-digital-twin-v1

Text Generation • Updated about 3 hours ago