Building on HF

Sergio Paniego PRO

sergiopaniego

https://sergiopaniego.github.io/

sergio-paniego-blanco

AI & ML interests

None yet

Recent Activity

updated a dataset about 2 hours ago

agents-course/final-certificates

updated a dataset about 2 hours ago

agents-course/course-certificates-of-excellence

updated a dataset about 5 hours ago

huggingface-projects/Deep-RL-Course-Certification

Organizations

sergiopaniego's collections (9)

Bringing Autonomous Driving RL to OpenEnv and TRL resources

Blog: https://huggingface.co/blog/sergiopaniego/bringing-carla-to-openenv-trl/

Sleeping

RL

CARLA Environment Server

🚗

Control a CARLA driving simulation with custom actions
Runtime error

RL

CARLA Environment Server

🚗

Control a CARLA car simulation via custom actions
Running

Carla Grpo Trolley

🚀

Visualize your program’s I/O activity in real time
sergiopaniego/Qwen3-0.6B-carla-trolley-escape

0.8B • Updated 3 days ago • 40

Amazing design resources

Running

109

HFBA

🤗

A collection of Huggies!
Running

13

HF Thumbnail Crafter

🎨

Create custom thumbnails for your videos

GUI Grounding datasets

rootsautomation/ScreenSpot

Viewer • Updated Apr 10, 2024 • 1.27k • 1.16k • 44
OS-Copilot/OS-Atlas-data

Updated Dec 4, 2024 • 3.73k • 43

👁 Vision comparison ftw

Spaces to compare vision models — there’s no single best model, only the best one for your specific use case.

Running

41

comparevlms

🏃

Compare Vision Language Models
Sleeping

66

OCR Time Machine

📚

Extract text from images and XML files using OCR models
Running

26

Compare Docvqa Models

🦀

Compare different visual question answering
Running on CPU Upgrade

23

Compare Clip Siglip

🏃

Compare strong zero-shot image classification models

Vision Language Models: 2025 Update

This collection includes all the models, datasets, and Spaces mentioned in the blog post "Vision Language Models: 2025 Update".

Qwen/Qwen2.5-Omni-7B

Any-to-Any • Updated Apr 30, 2025 • 410k • 1.87k
Running

Featured

370

Qwen2.5 Omni 7B Demo

🏆

Chat with AI using text, audio, images, and video
Qwen2.5-Omni Technical Report

Paper • 2503.20215 • Published Mar 26, 2025 • 170
openbmb/MiniCPM-o-2_6

Any-to-Any • Updated Oct 5, 2025 • 114k • 1.29k

📝 Research & Long-Form Blog Posts

In-depth technical articles and research pieces published by Hugging Face

Running

3.71k

The Ultra-Scale Playbook

🌌

The ultimate guide to training LLM on large GPU Clusters
Running on CPU Upgrade

Featured

3.02k

The Smol Training Playbook

📚

The secrets to building world-class LLMs
Running

278

Evaluation Guidebook

📝

Explore LLM benchmark trends over time
Running

218

FineVision: Open Data is All You Need

📝

A new open-source dataset for training VLMs

Vision reasoning datasets

deepcs233/Visual-CoT

Preview • Updated Mar 11, 2025 • 2.21k • 53
lmms-lab/multimodal-open-r1-8k-verified

Viewer • Updated Jan 27, 2025 • 7.69k • 2.57k • 71
leonardPKU/GEOQA_R1V_Train_8K

Viewer • Updated Feb 11, 2025 • 8.03k • 70 • 14
leonardPKU/clevr_cogen_a_train

Viewer • Updated Feb 2, 2025 • 70k • 242 • 40

My vision Spaces

Vision Spaces created by me

Running on Zero

Featured

113

VLM Object Understanding

🦀

Explore object detection, visual grounding, keypoint Detecti
Running on Zero

4

VQA Autonomous Driving SmolVLM2

🌖

Visual Question Answering - Autonomous Driving - SmolVLM2

😎 Awesome vision Spaces

Spaces where I've collaborated or that I consider unique!

Running

41

comparevlms

🏃

Compare Vision Language Models
Runtime error

4

Gemma3 License Plate Detection

📈

Gemma 3 for license plate detection
Running on Zero

Featured

142

Gemma 3n E4B It

⚡

Chat with a multimodal assistant for text, image, audio, video
Running on Zero

Featured

37

Moondream3

🏢

Image and video tasks with moondream3.
