Finally, someone made GPT look good. Jackpot!

#1
by Trilogix1 - opened

I believe this is a breakthrough for small models. You've broken the bottleneck of small, fast, but ineffective LLMs.
I'm wondering whether you can already apply it to 0.3B-0.6B models; LFM2 or Qwen3 would be excellent candidates (given their high original training token counts).

Again, great job.

Trilogix1 changed discussion status to closed
