Collections

Discover the best community collections!

Collections trending this week

ColBERT-Zero 🐶

First large-scale fully pre-trained ColBERT model using only public data, outperforming GTE-ModernColBERT and GTE-ModernBERT

ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models

Paper • 2602.16609 • Published 7 days ago • 6
lightonai/ColBERT-Zero

Sentence Similarity • 0.1B • Updated 2 days ago • 722 • 24
lightonai/ColBERT-Zero-supervised

Sentence Similarity • 0.1B • Updated 2 days ago • 56 • 3
lightonai/ColBERT-Zero-unsupervised

Sentence Similarity • 0.1B • Updated 2 days ago • 34 • 1

Qwen/Qwen3-235B-A22B-Thinking-2507-FP8

Text Generation • 235B • Updated Jul 30, 2025 • 76.6k • 82
Qwen/Qwen3-235B-A22B-Thinking-2507

Text Generation • Updated Aug 17, 2025 • 54k • 398
Qwen/Qwen3-235B-A22B-Instruct-2507-FP8

Text Generation • 235B • Updated Sep 17, 2025 • 759k • 146
Qwen/Qwen3-235B-A22B-Instruct-2507

Text Generation • Updated Sep 17, 2025 • 153k • 763

Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice

Text-to-Speech • Updated 28 days ago • 1.03M • 1.19k
Qwen/Qwen3-TTS-12Hz-0.6B-Base

Text-to-Speech • Updated 28 days ago • 243k • 177
Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign

Text-to-Speech • 2B • Updated 28 days ago • 387k • 278
Qwen/Qwen3-TTS-12Hz-1.7B-Base

Updated Jan 23 • 1.77M • 321

Qwen3-Coder-Next

Qwen/Qwen3-Coder-Next

Text Generation • Updated 22 days ago • 550k • 994
Qwen/Qwen3-Coder-Next-FP8

Text Generation • 80B • Updated 22 days ago • 280k • 91
Qwen/Qwen3-Coder-Next-Base

Text Generation • 80B • Updated 22 days ago • 4.53k • 59
Qwen/Qwen3-Coder-Next-GGUF

Text Generation • 80B • Updated 22 days ago • 63.7k • 180

Sparse MoE models compressed using REAP (Router-weighted Expert Activation Pruning) method

about 14 hours ago

cerebras/Qwen3-Coder-REAP-363B-A35B-FP8

Text Generation • Updated Oct 14, 2025 • 41 • 15
cerebras/Qwen3-Coder-REAP-246B-A35B-FP8

Text Generation • 246B • Updated Oct 14, 2025 • 678 • 21
cerebras/Qwen3-Coder-REAP-363B-A35B

Text Generation • 363B • Updated Oct 30, 2025 • 16 • 5
cerebras/Qwen3-Coder-REAP-246B-A35B

Text Generation • 246B • Updated Oct 30, 2025 • 15 • 8

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104

facebook/dinov3-vit7b16-pretrain-lvd1689m

Image Feature Extraction • Updated Aug 19, 2025 • 20.1k • 214
facebook/dinov3-vits16-pretrain-lvd1689m

Image Feature Extraction • 21.6M • Updated Aug 19, 2025 • 110k • 67
facebook/dinov3-convnext-small-pretrain-lvd1689m

Image Feature Extraction • 49.5M • Updated Aug 19, 2025 • 31.6k • 22
facebook/dinov3-vitb16-pretrain-lvd1689m

Image Feature Extraction • 85.7M • Updated Aug 19, 2025 • 582k • 102

Nemotron-Terminal

We are releasing Nemotron-Terminal models and training datasets.

about 21 hours ago

nvidia/Nemotron-Terminal-8B

Text Generation • 8B • Updated about 21 hours ago • 3 • 5
nvidia/Nemotron-Terminal-14B

Text Generation • 15B • Updated about 21 hours ago • 3 • 2
nvidia/Nemotron-Terminal-32B

Text Generation • 33B • Updated about 21 hours ago • 6 • 16
nvidia/Nemotron-Terminal-Synthetic-Tasks

Updated 2 days ago • 4 • 3

Qwen3 VL Demo 😻

Running • Featured • 385
Chat with an AI that understands text, images, and videos
Qwen/Qwen3-VL-235B-A22B-Thinking

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 2.76M • 378
Qwen/Qwen3-VL-235B-A22B-Instruct

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 349k • 370
Qwen/Qwen3-VL-235B-A22B-Thinking-FP8

Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 25.9k • 27

Claude 4.5 Opus

Distilled models and datasets for Claude 4.5 Opus.

TeichAI/claude-4.5-opus-high-reasoning-250x

Viewer • Updated Nov 28, 2025 • 250 • 5.55k • 290
Qwen3 Claude Opus 🚀

Running • 24
Chat with an AI for various inquiries
TeichAI/Nemotron-Cascade-14B-Thinking-Claude-4.5-Opus-High-Reasoning-Distill-GGUF

15B • Updated Dec 17, 2025 • 3.41k • 10
TeichAI/Nemotron-Cascade-14B-Thinking-Claude-4.5-Opus-High-Reasoning-Distill

Text Generation • Updated Dec 17, 2025 • 223 • 6

GPT-OSS-Swallow-v0.1

tokyotech-llm/GPT-OSS-Swallow-20B-RL-v0.1

Text Generation • 21B • Updated 5 days ago • 4.03k • 13
tokyotech-llm/GPT-OSS-Swallow-120B-RL-v0.1

Text Generation • 117B • Updated 5 days ago • 2.05k • 9
tokyotech-llm/GPT-OSS-Swallow-20B-SFT-v0.1

Text Generation • 21B • Updated 5 days ago • 1.97k • 5
tokyotech-llm/GPT-OSS-Swallow-120B-SFT-v0.1

Text Generation • 117B • Updated 5 days ago • 3.06k • 2
