Open to Collab

s3nh PRO

s3nh

s3nhxx
s3nh

AI & ML interests

Quantization, LLMs, Deep Learning for good. Follow me if you like my work. Patreon.com/s3nh

Recent Activity

reacted to mitkox's post with 🚀 11 days ago

Post

282

134,614 tok/sec input prefill max
1031 tokens/sec out gen max

At these local AI speeds, there is no User Interface for humans. My human UI is the Radicle distributed Git issues queue

On my GPU workstation:
- Z8 Fury G5 4x A6000
- MiniMax-M2.5
- Claude Code to localhost:8000
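
To put the two peak figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes the quoted rates apply to a single request, which may not hold if they are aggregate numbers across parallel requests; the function name and the example token counts are hypothetical.

```python
# Peak figures quoted in the post above.
PREFILL_TOK_PER_SEC = 134_614   # input (prefill) throughput
DECODE_TOK_PER_SEC = 1_031      # output (generation) throughput


def estimate_seconds(prompt_tokens: int, output_tokens: int) -> float:
    """Lower-bound latency estimate: prefill time plus generation time.

    Assumption: the peak rates hold for this one request. If they are
    aggregate rates over many concurrent requests, real latency is higher.
    """
    prefill_time = prompt_tokens / PREFILL_TOK_PER_SEC
    decode_time = output_tokens / DECODE_TOK_PER_SEC
    return prefill_time + decode_time


if __name__ == "__main__":
    # Hypothetical workload: a 100k-token prompt and a 1k-token answer.
    print(f"{estimate_seconds(100_000, 1_000):.2f} s")
```

At these rates generation, not prompt ingestion, dominates the wall-clock time for most workloads.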

1 reply

liked a model 14 days ago

ysong21/entropy-v1-fp8

Text Generation • 27B • Updated 17 days ago • 27 • 5

reacted to Tonic's post with 🔥 18 days ago

Post

3208

🙋🏻‍♂️hello my lovelies ,

it is with great pleasure i present to you my working one-click deploy 16GB ram completely free huggingface spaces deployment.

repo : https://huggingface.co/spaces/Tonic/hugging-claw/tree/main (use git clone to inspect)
literally the one-click link : https://huggingface.co/spaces/Tonic/hugging-claw?duplicate=true

you can also run it locally and see for yourself :

docker run -it -p 7860:7860 --platform=linux/amd64 \
-e HF_TOKEN="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_TRUSTED_PROXIES="YOUR_VALUE_HERE" \
-e OPENCLAW_GATEWAY_PASSWORD="YOUR_VALUE_HERE" \
-e OPENCLAW_CONTROL_UI_ALLOWED_ORIGINS="YOUR_VALUE_HERE" \
registry.hf.space/tonic-hugging-claw:latest

just a few quite minor details i'll take care of but i wanted to share here first

2 replies

reacted to MonsterMMORPG's post with 🔥 20 days ago

Post

2977

SECourses Musubi Trainer upgraded to V27: FLUX 2, FLUX Klein, and Z-Image training added with demo configs, amazingly VRAM optimized. Read the news

App is here : https://www.patreon.com/posts/137551634

Full tutorial on how to use and train : https://youtu.be/DPX3eBTuO_Y

1 reply

reacted to codelion's post with 🔥 21 days ago

Post

6128

Introducing Dhara-70M: A diffusion language model that achieves 3.8x higher throughput than autoregressive models!

Key findings from our research on optimal architectures for small language models:

→ Depth beats width: 32 layers outperform 12 layers at the same parameter count
→ Best-in-class factuality: 47.5% on TruthfulQA
→ 10x training efficiency using WSD (Warmup-Stable-Decay) conversion
→ Canon layers add only 0.13% parameters but improve reasoning

We trained on 1B tokens using the optimal 50-30-20 dataset mix (PDFs + filtered web + educational content), then converted to diffusion with just 100M additional tokens.
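
The token budget described above can be worked out with a few lines of arithmetic. This is only a sketch: it assumes the 50-30-20 percentages map to the three sources in the order they are listed (PDFs, filtered web, educational content), which the post does not state explicitly, and the names used here are hypothetical.

```python
TOTAL_TOKENS = 1_000_000_000      # "1B tokens" of pre-training data
CONVERSION_TOKENS = 100_000_000   # "100M additional tokens" for diffusion conversion

# Assumption: percentages follow the order in which the sources are listed.
MIX_PERCENT = {"pdfs": 50, "filtered_web": 30, "educational": 20}


def split_tokens(total: int, mix_percent: dict[str, int]) -> dict[str, int]:
    """Split a token budget across sources according to integer percentages."""
    if sum(mix_percent.values()) != 100:
        raise ValueError("mix percentages must sum to 100")
    return {name: total * pct // 100 for name, pct in mix_percent.items()}


budget = split_tokens(TOTAL_TOKENS, MIX_PERCENT)
conversion_share = CONVERSION_TOKENS / TOTAL_TOKENS

if __name__ == "__main__":
    for name, tokens in budget.items():
        print(f"{name}: {tokens:,} tokens")
    print(f"conversion adds {conversion_share:.0%} on top of pre-training")
```

Under these assumptions the conversion step costs a tenth of the pre-training token budget.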

Blog: https://huggingface.co/blog/codelion/optimal-model-architecture
Model: codelion/dhara-70m

1 reply

reacted to giux78's post with 🔥 27 days ago

Post

209

Together with @mferraretto and @efederici we released #Nesso-4B, a new model specialized for agentic workflows.

mii-llm/nesso-4B

#Nesso-4B is a fine-tuned version of Qwen-4B, trained on a highly curated and balanced dataset designed specifically for multilingual agentic workflows and conversational use cases.

As shown in the video below, we simulate the new “cowork” from #Anthropic without any data sharing, all running on a consumer device. The model can be used to build agentic behavior in #privateAI environments.

Not every problem requires super intelligence: in many cases, intelligence at the edge is more than enough.

#Nesso4B #AgenticAI #PrivateAI #EdgeAI #OnDeviceAI

2 replies

reacted to AdinaY's post with 🔥 27 days ago

Post

389

GLM just entered the OCR field🔥

zai-org/GLM-OCR

✨ 0.9B
✨ MIT licensed
✨ Multimodal GLM-V architecture
✨ #1 on OmniDocBench v1.5 (94.62)

liked a model 27 days ago

raincandy-u/Rain-v2

Text Generation • 0.1B • Updated Jan 30 • 97 • 4

reacted to raincandy-u's post with 🔥 27 days ago

Post

2978

Introducing Rain-v2: Democratizing LLM training on gaming GPUs! ⚡

Following Rain-100M, we’re scaling up. Rain-v2 features a larger training dataset.

We’ve published a comprehensive blog covering the end-to-end journey—from raw data collection to rigorous evaluation and safety testing.

HF Repo: 🤗 raincandy-u/Rain-v2

Blog: 📚 https://angelkawaii.xyz/2026/01/29/rain-v2/

Special thanks to the open-source community and the SmolLM2 team for their foundational work! 🚀

HuggingFaceTB
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model (2502.02737)

New activity in raincandy-u/Rain-100M about 1 month ago

Hardware used

#2 opened about 1 month ago by s3nh

liked a model about 1 month ago

raincandy-u/Rain-100M

Text Generation • 97.2M • Updated Jan 24 • 135 • 18

reacted to raincandy-u's post with 👍🔥 about 1 month ago

Post

5467

🤗 Just released Rain-100M, an experimental ~97M-parameter Qwen3-style language model trained from random initialization.

Repo: raincandy-u/Rain-100M

Data: HuggingFaceFW/fineweb-edu, ~3B tokens, English only

Tokenizer: custom 16k BPE, context length 4096

Architecture: 12 Transformer layers, hidden size 768, 12 heads, MLP 2048, SiLU, bf16

Rain-100M is a raw base model (not instruction-tuned or safety-aligned), aimed at small-scale research, debugging training pipelines, and CPU/edge experiments. If you run evaluations, finetunes, or visualizations with it, I would be very interested in your results!
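
As a sanity check on the "~97M" figure, the listed architecture can be turned into a rough parameter count. This is a sketch under stated assumptions, not the model's actual configuration: it assumes a vocabulary of exactly 16,000 tokens, tied input/output embeddings, four hidden-by-hidden attention projections per layer (no grouped-query attention, no biases), a gated MLP with three weight matrices, and it ignores the small normalization weights. The function name is hypothetical.

```python
def estimate_params(vocab_size: int, hidden: int, layers: int, mlp_dim: int,
                    tied_embeddings: bool = True) -> int:
    """Approximate parameter count of a decoder-only Transformer.

    Assumptions: Q, K, V and output projections are each hidden x hidden;
    the MLP is gated (gate, up, down matrices); norm weights are ignored.
    """
    embedding = vocab_size * hidden
    attention = 4 * hidden * hidden      # Q, K, V, O projections
    mlp = 3 * hidden * mlp_dim           # gate, up, down projections
    total = embedding + layers * (attention + mlp)
    if not tied_embeddings:
        total += vocab_size * hidden     # separate output head
    return total


if __name__ == "__main__":
    total = estimate_params(vocab_size=16_000, hidden=768, layers=12, mlp_dim=2048)
    print(f"{total / 1e6:.1f}M")  # prints 97.2M
```

Under these assumptions the estimate lands close to the 97.2M shown on the model card, with about 12.3M parameters in the embedding table and the rest in the 12 Transformer layers.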