Paper: Agentic Reasoning for Large Language Models (2601.12538)
Here on Hugging Face (models section), use the Parameters filter and select the size range you want.
There has to be a trade-off to really shrink model size: the model must be scoped to specific tasks or domains rather than trying to be general-purpose.
A good way to achieve this is knowledge distillation: train a tiny student model on a specific task to mimic a larger teacher's outputs. You gain a much smaller model but lose generality. That's the trade-off.
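To make the idea concrete, here is a minimal NumPy sketch of the standard distillation objective: the student is trained against a blend of the teacher's softened output distribution (KL divergence at temperature `T`) and the true labels (cross-entropy). Function names and the hyperparameter values are illustrative, not from the paper above.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; higher T flattens the distribution."""
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, true_label, T=2.0, alpha=0.5):
    """Blend of soft-target KL term and hard-label cross-entropy.

    alpha weighs the soft (teacher-matching) term; the T*T factor keeps
    soft-target gradients on the same scale as the hard-label term.
    """
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student))) * T * T
    hard = -np.log(softmax(student_logits)[true_label])
    return alpha * soft + (1 - alpha) * hard
```

During training, `teacher_logits` come from a frozen large model on the same input; only the student's weights are updated to minimize this loss, which is what lets a small model absorb task-specific behavior from a big one.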