Tyler Williams PRO

unmodeled-tyler

https://quantaintellect.com

AI & ML interests

AI research engineer & solo operator of VANTA Research/Quanta Intellect

Recent Activity

replied to omarkamali's post about 9 hours ago

I just might have cracked tokenizer-free LLMs. No vocab, no softmax. I'm training a 22M params LLM rn to test this "thing" and it's able to formulate coherent sentences 🤯 Bear in mind, this is a completely new, tokenizer-free LLM architecture with built-in language universality. Check the explainer video to understand what's happening. Feedback welcome on this approach!

Cool! any chance at sharing a repo so others can play around with it? I'd love to give it a try! 😀

reacted to karstenskyt's post with 🔥 about 9 hours ago

🚀 𝗟𝗮𝘂𝗻𝗰𝗵𝗶𝗻𝗴 𝘁𝗵𝗲 𝗔𝗜/𝗠𝗟 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄𝘀 𝗗𝗮𝘀𝗵𝗯𝗼𝗮𝗿𝗱

Now that our Taipy architecture is humming along on Hugging Face Spaces, we just shipped the most complex feature of the (𝘙𝘪𝘨𝘩𝘵! 𝘓𝘶𝘹𝘶𝘳𝘺!) 𝘓𝘢𝘬𝘦𝘩𝘰𝘶𝘴𝘦 to date: the 𝗔𝗜/𝗠𝗟 𝗪𝗼𝗿𝗸𝗳𝗹𝗼𝘄𝘀 𝗗𝗮𝘀𝗵𝗯𝗼𝗮𝗿𝗱.

Managing 16 different machine learning pipelines (from Expected Goals to Space Creation) across Databricks Serverless and HF Jobs is a logistical challenge. To solve this, we built a dynamic operations center (the 13th page in our app).

It features:

  • 𝗔𝗻 𝗶𝗻𝘁𝗲𝗿𝗮𝗰𝘁𝗶𝘃𝗲 𝗱𝗲𝗽𝗲𝗻𝗱𝗲𝗻𝗰𝘆 𝗗𝗔𝗚: Powered by Cytoscape.js, it visually maps exactly how our models and data grids feed into each other.

  • 𝗥𝗲𝗮𝗹-𝘁𝗶𝗺𝗲 𝗺𝗼𝗻𝗶𝘁𝗼𝗿𝗶𝗻𝗴: Tracks run volumes and data freshness SLAs across the entire platform.

  • 𝗔 𝟯-𝘁𝗶𝗲𝗿 𝗵𝘆𝗯𝗿𝗶𝗱 𝗰𝗼𝘀𝘁 𝗲𝗻𝗴𝗶𝗻𝗲: Merges "cold" Databricks billing data with "warm/hot" live HF Jobs estimates to give a unified view of pipeline expenses.
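As a rough illustration of the dependency-DAG idea, here is a minimal Python sketch that orders pipelines so each one runs after the pipelines it reads from. Only the names Expected Goals and Space Creation come from the post; the other nodes and every edge are made up for illustration, and this is not the dashboard's actual graph or code.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each pipeline -> the pipelines it reads from.
# Only "expected_goals" and "space_creation" are named in the post;
# the remaining nodes and all edges are illustrative assumptions.
DEPENDENCIES = {
    "tracking_grid": set(),
    "event_grid": set(),
    "expected_goals": {"event_grid"},
    "space_creation": {"tracking_grid", "event_grid"},
}


def run_order(dependencies):
    """Return one valid execution order, upstream pipelines first."""
    return list(TopologicalSorter(dependencies).static_order())


order = run_order(DEPENDENCIES)
print(order)
```

The same mapping could be handed to a graph renderer; the standard-library sorter also raises `CycleError` if two pipelines depend on each other.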

Check out the live interactive graph here: https://huggingface.co/spaces/luxury-lakehouse/soccer-analytics-app

posted an update about 20 hours ago

PSA: LiteLLM has been compromised on PyPI - if you have it installed, CHECK NOW.

LiteLLM is used as a dependency in A LOT of AI tooling, so there's a pretty good chance that you have it installed somewhere on your machine (my instance was part of Hermes Agent, but I was unaffected by the hack)

Versions 1.82.7 & 1.82.8 on PyPI have been compromised with a multi-stage credential stealer.

- Version 1.82.8 uses a .pth file that executes on EVERY python process startup. You don't even need to import litellm. Just having it installed is enough.
- The payload harvests SSH keys, .env files, AWS/GCP/Azure credentials, Kubernetes configs, database passwords, crypto wallets, shell history - basically every secret on your machine.
- Stolen data is encrypted with a hardcoded RSA key and exfiltrated to a domain that is NOT part of legitimate litellm infrastructure.
- If you're running Kubernetes, it attempts lateral movement across the entire cluster.
- The C2 is hosted on the Internet Computer blockchain, making it essentially impossible to take down.

This is part of a coordinated campaign by a threat actor called TeamPCP who have also hit Trivy (Aqua Security), Checkmarx KICS, and multiple npm packages in the last week ALONE.

What to do:

1. Run 'pip show litellm' in every environment you have
2. If you're on 1.82.7 or 1.82.8 - rotate EVERY secret on that machine immediately.
3. Check for persistence artifacts ~/.config/sysmon/sysmon.py & ~/.config/systemd/user/sysmon.service
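
The checks above can be sketched as a small Python script. This is only a minimal sketch: the version numbers and artifact paths are the ones listed in this post, the function names are made up for illustration, and it only inspects the environment it is run from, so it would need to be repeated per virtualenv. Because the .pth payload is described as firing on every Python startup, a hit here should be treated as "already exposed", not as something this script prevents.

```python
from importlib import metadata
from pathlib import Path

# Versions and artifact paths are taken from the post above.
COMPROMISED_VERSIONS = {"1.82.7", "1.82.8"}
PERSISTENCE_ARTIFACTS = (
    ".config/sysmon/sysmon.py",
    ".config/systemd/user/sysmon.service",
)


def installed_litellm_version():
    """Return the litellm version in the current environment, or None."""
    try:
        return metadata.version("litellm")
    except metadata.PackageNotFoundError:
        return None


def is_compromised(version):
    """True if the version string is one of the releases flagged in the post."""
    return version in COMPROMISED_VERSIONS


def find_persistence_artifacts(home=None):
    """Return the flagged artifact paths that exist under the home directory."""
    root = Path(home) if home is not None else Path.home()
    return [root / rel for rel in PERSISTENCE_ARTIFACTS if (root / rel).exists()]


if __name__ == "__main__":
    version = installed_litellm_version()
    print("litellm version:", version or "not installed")
    if is_compromised(version):
        print("Flagged version found: rotate every secret on this machine.")
    for path in find_persistence_artifacts():
        print("Persistence artifact present:", path)
```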

I was lucky in this case that my litellm version was out of date, but if you've installed litellm as a dependency in ANY package within the last 24ish hours, you're gonna want to check.

SOURCES
https://futuresearch.ai/blog/litellm-pypi-supply-chain-attack/

Same group, different attack a couple of days ago: https://www.stepsecurity.io/blog/canisterworm-how-a-self-propagating-npm-worm-is-spreading-backdoors-across-the-ecosystem

5 replies

reacted to Shrijanagain's post with 🔥 2 days ago

We are thrilled to announce the launch of SKT-OMNI-CORPUS-146T-V1, a massive-scale, high-quality dataset designed to power the next generation of Foundation Models (LLMs) from scratch.
Developed at SKT AI LABS, this corpus is not just a collection of data; it’s a mission to decentralize high-grade AI training for regional languages and global knowledge.

💎 Key Highlights:

• Massive Scale: Targeting a multi-terabyte architecture for 146T-level tokenization.

• Pure Quality: Curated from 500+ Elite Sources

• Structured for MoE: Perfectly sharded into 3.5GB standardized units (SKT-𝕻 series) for seamless distributed training.

🤝 Open for Collaboration!

We are looking for AI researchers, CUDA engineers, and data scientists to join us in this journey of building Project Surya and the ST-X Series models. Whether it's optimization, custom tokenization, or architecture design—let’s build the future together.

Explore the Dataset on Hugging Face:

🔗 Shrijanagain/SKT-OMNI-CORPUS-146T-V1

DSR -- 🔗 Shrijanagain/SKT-DSRx10000

#AI #MachineLearning #OpenSource #IndicAI #SKTAILABS #LLM #BigData #HuggingFace #InnovationIndia

replied to their post 2 days ago

I was actually super excited to finally get a successful run out of this test - on this specific website there’s a sidebar popup that surfaces after the book is added to the cart essentially asking if you want to continue shopping or go to the cart.

Just about every architecture I tried completely missed the dialogue box and would just hit “add to basket” again which dismissed the popup, but subsequently duplicated the quantity. I spent just about an entire day on that problem so I was stoked to finally fix it 🦾

reacted to their post with 🚀 2 days ago

Here's a demo of Vessel Browser in action!

Minimax M2.7 was challenged with navigating to a large Ecom site, curating a selection of 5 different products, and adding them all to the cart, with the reasoning behind its choices included. (Or try it yourself: open source, MIT license, and BYOK!)

npm i @quanta-intellect/vessel-browser

Vessel is a browser that I've been working on which is designed specifically for agents with human-in-the-loop visibility. It comes with a local MCP server allowing any harness that supports custom MCP to control the browser. Additionally, you can BYOK to 8+ different providers (including custom OAI compatible endpoints and local models).

One of my favorite features of the browser is persistent, bi-directional highlighting - meaning that both you AND the agent can highlight anything on the screen and the agent receives the context.

Vessel Browser is unique in that it surfaces available tools contextually to the agent, meaning the agent doesn't have to decide between 80+ tools at any given time, but rather is focused on a subset of tools most applicable to the current state.
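
To make the idea of contextual tool surfacing concrete, here is a minimal Python sketch: each tool carries a predicate over the current page state, and only tools whose predicate passes are offered to the agent. The `PageState` fields, tool names, and predicates are all hypothetical and are not Vessel's actual implementation or API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PageState:
    """Hypothetical snapshot of what the browser currently shows."""
    url: str
    has_form: bool = False
    has_dialog: bool = False


@dataclass(frozen=True)
class Tool:
    """A tool plus the condition under which it is worth offering."""
    name: str
    applies: Callable[[PageState], bool]


# Illustrative registry; a real browser would have many more entries.
TOOLS = [
    Tool("navigate", lambda state: True),
    Tool("fill_form", lambda state: state.has_form),
    Tool("dismiss_dialog", lambda state: state.has_dialog),
]


def available_tools(state, tools=TOOLS):
    """Return only the tool names relevant to the current page state."""
    return [tool.name for tool in tools if tool.applies(state)]


print(available_tools(PageState("https://example.com", has_dialog=True)))
```

The point of the design is that the agent chooses among a handful of relevant tools per step instead of the full registry.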

Give it a try!

https://github.com/unmodeled-tyler/vessel-browser

2 replies

replied to their post 3 days ago

There's certainly still some edges that need to be rounded out, but Vessel Browser is on a nightly release schedule so improvements are made daily!

reacted to umarbutler's post with 🚀 3 days ago

Isaacus, the AI research company building legal superintelligence, is hiring!

We're looking for passionate engineers who love to build and tinker and want to have an impact on the world. Specifically, we're hiring:
• ML engineers (Australia).
• Data engineers (Australia).
• Full-stack engineers (Australia).
• DevRel engineers (Australia, San Francisco, and London).
• DevOps engineers (Australia, San Francisco, and London).

If you'd like to be a founding employee at one of the few VC-backed LLM research labs in the world, receive generous equity compensation, and work alongside other highly motivated, highly skilled engineers, get in touch: https://isaacus.com/careers

reacted to cahlen's post with 🔥 4 days ago

It’s wild to me how you can just make shit now.

You can take a weekend with a raspberry pi 5, a pi camera, a 3d printer, and a smidgen of custom fine tuning (wakeword, whisper, tinybert, and pipertts) and you have a physical device that works as a talking personal assistant.

What a time to be alive.

Edge ai, physical ai, ai augmented animatronics… tiny models. Tiny agents.

Going to be a wild year.

5 replies

reacted to aufklarer's post with 🔥 4 days ago

We benchmarked https://github.com/soniqo/speech-swift, our open-source Swift library for on-device speech AI, against Whisper Large v3 (FP16) on LibriSpeech test-clean.

Three models beat it. Two architectural approaches:

Qwen3-ASR (LALM — Qwen3 LLM as ASR decoder, AuT encoder pretrained on ~40M hours) hits 2.35% WER at 1.7B 8-bit, running at 43x real-time on MLX. Greedy decoding matches beam search — the LLM decoder is strong enough that the greedy path is nearly always optimal.

Parakeet TDT (non-autoregressive transducer — FastConformer + TDT joint network) hits 2.74% WER in 634 MB as a CoreML INT8 model on the Neural Engine. No generative hallucination by design. Leaves GPU completely free.

Two findings worth flagging:
- 4-bit quantization is catastrophic for non-English: Korean 6.89% → 19.95% WER on FLEURS. Use 8-bit for multilingual.
- On CoreML, INT8 is 3.3x *faster* than INT4 — opposite of GPU behavior. Native ANE INT8 MACs vs INT4 lookup table indirection.
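
For readers unfamiliar with the metric behind these numbers, here is a minimal Python sketch of word error rate (WER): the word-level edit distance between reference and hypothesis, divided by the number of reference words. It is a generic textbook definition, not the benchmark code from the article, and it skips the text normalization that real evaluations usually apply first.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the processed ref prefix and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,                           # deletion
                cur[j - 1] + 1,                        # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = cur
    return prev[-1] / len(ref)


print(wer("the cat sat on the mat", "the cat sat on a mat"))
```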

All numbers reproducible in 15 minutes.

Full article: https://blog.ivan.digital/we-beat-whisper-large-v3-with-a-600m-model-running-entirely-on-your-mac-20e6ce191174

Library: https://github.com/soniqo/speech-swift

Models: Qwen/Qwen3-ASR-0.6B, Qwen/Qwen3-ASR-1.7B, nvidia/parakeet-tdt-0.6b-v2

replied to their post 8 days ago

Vessel has also been updated to include an integrated chat window with custom provider API endpoints! You can configure Vessel as an MCP server for your favorite tools, or use Vessel as your entire interface! :)

reacted to their post with 🚀 9 days ago

LINK: https://github.com/unmodeled-tyler/vessel-browser

Hey Hugging Face!

It's been quiet from me over here for the last few weeks, but I've been busy building! I just submitted my project to the Hermes Agent Hackathon, and wanted to share it with all of you.

This is Vessel Browser - an AI-native web browser that runs locally on Linux, and is operated by your personal AI agent via MCP server. Vessel is built from the ground up treating the agent as first-class, with a visible UI for human-in-the-loop and 3 different levels of permissions.

Your agent finds, reads, and organizes the web for you, based on what you actually care about - not what a platform's algorithm thinks you care about.

Once your agent finds what it's looking for, it can organize bookmarked pages into custom folders with summaries for later browsing, take screenshots with highlighted text, and integrate with Obsidian for long-term, browsing-related memory.

Check it out!

3 replies

reacted to alvdansen's post with 🚀 19 days ago

Releasing Flimmer today — a video LoRA training toolkit for WAN 2.1 and 2.2 that covers the full pipeline from raw footage to trained checkpoint.
The standout feature is phased training: multi-stage runs where each phase has its own learning rate, epochs, and dataset, with the checkpoint carrying forward automatically. Built specifically with WAN 2.2's dual-expert MoE architecture in mind.
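
As a rough sketch of what phased training means in general, the Python below runs a list of phases in order, each with its own learning rate, epoch count, and dataset, and passes the previous phase's checkpoint into the next. The `Phase` fields, `run_phases`, the stand-in trainer, and the example values are all hypothetical; this is not Flimmer's API or config format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    """One training stage with its own hyperparameters and data."""
    name: str
    learning_rate: float
    epochs: int
    dataset: str


def run_phases(phases, train_one_phase, checkpoint=None):
    """Run phases in order, feeding each phase the previous checkpoint."""
    for phase in phases:
        checkpoint = train_one_phase(phase, checkpoint)
    return checkpoint


def fake_trainer(phase, resume_from):
    """Stand-in for a real training call; only reports what it would do."""
    print(f"{phase.name}: lr={phase.learning_rate}, epochs={phase.epochs}, "
          f"data={phase.dataset}, resume_from={resume_from}")
    return f"{phase.name}.ckpt"


# Illustrative values only.
PHASES = [
    Phase("warmup", learning_rate=1e-4, epochs=2, dataset="clips_lowres"),
    Phase("refine", learning_rate=2e-5, epochs=4, dataset="clips_highres"),
]

final_checkpoint = run_phases(PHASES, fake_trainer)
print(final_checkpoint)
```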

Data prep tools are standalone and output standard formats — they work with any trainer, not just Flimmer.

Early release, building in the open. LTX support coming next.

http://github.com/alvdansen/flimmer-trainer

reacted to umarbutler's post with 🔥 19 days ago

This awesome visualization by @abdurrahmanbutler tracks how reliant the High Court of Australia has been on UK precedents over time.

Back in the early 1900s, up to 70% of citations in High Court decisions were from the UK. Today, that number sits around 20%.

This change seems to have happened gradually as Australia gained more and more independence from the UK, culminating in the Australia Acts of 1986, where we see a nice bump in the proportion of Australian cases cited.

These insights would not be possible without our latest legal AI model, Kanon 2 Enricher, which we used to extract dates and citations from High Court decisions in isaacus/open-australian-legal-corpus and categorize citations by jurisdiction. You can learn about Kanon 2 Enricher here: https://isaacus.com/blog/kanon-2-enricher.

replied to their post 22 days ago

Yep, that’s the core of it! For example, if I use the prompt “Paris is the capital of France” and then I highlight “of” in my prompt, the layer predictions tab will show you what the model believes the next token to be through each layer.

You can watch the model start its guess in the very first layer (usually with something completely irrelevant) and then as it progresses through each layer the model gets closer and closer until it converges on “France” as the most likely correct next token based on the context leading up to the selected token “of.” So then the model basically interprets it as “Paris is the capital of -> ? -> France”

You can see in the 1st layer the model was thinking “Paris is the capital of ales” then in a deeper layer it was thinking “Paris is the capital of guardians” before it finally ended in the last layer with the correct prediction (remember, based on Paris is the capital of) “France”.
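
For anyone curious how a layer-by-layer readout like this works in principle, here is a toy Python sketch of the logit-lens idea: take the hidden state at each layer, project it through the unembedding matrix, and read off the most likely token. The three-token vocabulary, identity unembedding, and hidden-state numbers are invented so the trace mirrors the example above; a real implementation would use the model's own weights and usually apply the final layer norm before projecting. It is not Thought Tracer's code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logit_lens(hidden_states, unembed, vocab):
    """For each layer, project the hidden state to vocab logits and
    return the (top token, probability) pair."""
    trace = []
    for hidden in hidden_states:
        logits = [sum(w * h for w, h in zip(row, hidden)) for row in unembed]
        probs = softmax(logits)
        best = max(range(len(vocab)), key=probs.__getitem__)
        trace.append((vocab[best], probs[best]))
    return trace


# Toy data, invented for illustration only.
VOCAB = ["ales", "guardians", "France"]
UNEMBED = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
HIDDEN_STATES = [
    [2.0, 0.5, 0.1],  # early layer
    [0.3, 1.5, 0.9],  # middle layer
    [0.1, 0.2, 3.0],  # final layer
]

trace = logit_lens(HIDDEN_STATES, UNEMBED, VOCAB)
for layer, (token, prob) in enumerate(trace, start=1):
    print(f"layer {layer}: {token} ({prob:.2f})")
```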

The entropy tab calculates a few different metrics that also give a token-level and prompt-level hallucination risk assessment so you can see which types are higher risk for inducing hallucination in that particular model.
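
One common building block for this kind of signal is the Shannon entropy of the next-token distribution: low when the model is confident, high when probability is spread across many tokens. The Python sketch below shows that calculation plus a normalized variant; which metrics the entropy tab actually computes is not stated here, so treat this as an assumption about the general approach and not as the tool's formula.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def normalized_entropy(probs):
    """Entropy scaled to [0, 1] by the maximum possible for this vocab size."""
    if len(probs) < 2:
        return 0.0
    return entropy_bits(probs) / math.log2(len(probs))


confident = [0.97, 0.01, 0.01, 0.01]  # illustrative distributions
uncertain = [0.25, 0.25, 0.25, 0.25]
print(normalized_entropy(confident), normalized_entropy(uncertain))
```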

reacted to danielhanchen's post with 🔥 23 days ago

Qwen releases 4 new Qwen3.5 Small models: 0.8B • 2B • 4B • 9B!

Run Qwen3.5-0.8B, 2B and 4B on your phone. Run 9B on 6GB RAM.

The vision reasoning LLMs perform better than models 4x their size.

GGUFs to run: https://huggingface.co/collections/unsloth/qwen35

Guide: https://unsloth.ai/docs/models/qwen3.5

5 replies

reacted to their post with 🚀 23 days ago

Link to Repo: https://github.com/unmodeled-tyler/thought-tracer

I had a great time at Mistral's Hackathon in SF over the weekend! There were a lot of incredibly talented builders there and it was an honor to be a part of it! 😄

I built Thought Tracer - a TUI-based logitlens application for Ministral 3B/8B with optional AI analysis from Mistral Large on the Mistral API.

Thought Tracer allows you to see what the model "believes" at each layer until it arrives at its final next token prediction. The Entropy tab displays entropy through each layer, additionally providing both token-level and prompt-level risk for hallucination.

If you have a Mistral API key, the AI analysis section is actually pretty cool because it's returned in rendered markdown and easily understandable language - providing a commentary on how the model likely arrived at its final prediction, and also offering diagnostics for model developers. This commentary actually makes the tool pretty beginner friendly to anyone interested in exploring AI research tools for the first time.

Check it out if you're interested!

4 replies

replied to their post 23 days ago

I'm actually going to continue building this out after the dust settles from the hackathon so expect more features!
