Xenova
@xenova.bsky.social
Bringing the power of machine learning to the web. Currently working on Transformers.js (@huggingface 🤗)
As always, the demo is open source (which you can find under the "Files" tab), so I'm excited to see how the community builds upon this! 🚀

🔗 Link to demo: huggingface.co/spaces/Liqui...
LFM2 WebGPU – In-browser tool calling - a Hugging Face Space by LiquidAI
In-browser tool calling, powered by Transformers.js
huggingface.co
August 6, 2025 at 5:56 PM
That's right, we're running Mistral's new Voxtral-Mini-3B model 100% locally in-browser on WebGPU, powered by Transformers.js and ONNX Runtime Web! 🔥

Try it out yourself! 👇
huggingface.co/spaces/webml...
Voxtral WebGPU - a Hugging Face Space by webml-community
State-of-the-art audio transcription in your browser
July 24, 2025 at 3:43 PM
The most difficult part was getting the model running in the first place, but the next steps are simple:
✂️ Implement sentence splitting, enabling streamed responses
🌍 Multilingual support (only phonemization left)
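For the sentence-splitting step, a minimal sketch (not the actual demo's implementation) could buffer the incoming text and release each sentence as soon as its closing punctuation is followed by whitespace, so speech synthesis can start before the full response has arrived. The class name `SentenceBuffer` and the punctuation rule are assumptions for illustration; real text would also need handling for abbreviations, decimals, and similar edge cases.

```typescript
// Hypothetical helper: buffers streamed text and releases complete sentences.
class SentenceBuffer {
  private buffer = "";

  // Append a chunk (e.g. streamed LLM output) and return any completed sentences.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const sentences: string[] = [];
    // Assumed rule: a sentence ends at one or more of . ! ? followed by whitespace.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.buffer)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.buffer.slice(start, end).trim());
      start = end;
    }
    // Keep the unfinished tail for the next chunk.
    this.buffer = this.buffer.slice(start);
    return sentences;
  }

  // Return whatever is left once the stream has ended.
  flush(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

Each string returned by `push` could be handed to the TTS model immediately, with `flush` called once when the stream closes.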

Who wants to help? 🤗
huggingface.co/spaces/webml...
Kokoro Text-to-Speech (WebGPU) - a Hugging Face Space by webml-community
High-quality speech synthesis powered by Kokoro TTS
February 7, 2025 at 5:03 PM
The model is also extremely resilient to quantization. The smallest variant is only 86 MB in size (down from the original 326 MB), with no noticeable difference in audio quality! 🤯
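To put those two sizes in perspective, the arithmetic can be worked through directly. The sketch below uses only the figures quoted above; the helper name `compression` is hypothetical.

```typescript
// Sizes quoted in the post, in megabytes.
const originalMB = 326;
const quantizedMB = 86;

// Hypothetical helper: how much smaller is the quantized file?
function compression(original: number, quantized: number) {
  return {
    ratio: original / quantized,                    // how many times smaller
    savedPercent: (1 - quantized / original) * 100, // size reduction in percent
  };
}

const { ratio, savedPercent } = compression(originalMB, quantizedMB);
console.log(`${ratio.toFixed(1)}x smaller, ${savedPercent.toFixed(0)}% saved`);
// prints "3.8x smaller, 74% saved"
```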

Link to models/samples: huggingface.co/onnx-communi...
January 16, 2025 at 3:05 PM
You can get started in just a few lines of code! 🧑‍💻

Huge kudos to the Kokoro TTS community, especially taylorchu for the ONNX exports and Hexgrad for the amazing project! None of this would be possible without you all! 🤗

Try it out yourself: huggingface.co/spaces/webml...
January 16, 2025 at 3:05 PM
For the AI builders out there: imagine what could be achieved with a browser extension that (1) uses a powerful reasoning LLM, (2) runs 100% locally & privately, and (3) can directly access/manipulate the DOM! 👀

💻 Source code: github.com/huggingface/...
🔗 Online demo: huggingface.co/spaces/webml...
Llama 3.2 Reasoning WebGPU - a Hugging Face Space by webml-community
Small and powerful reasoning LLM that runs in your browser
January 10, 2025 at 12:19 PM
This project was greatly inspired by Brendan Bycroft's amazing LLM Visualization tool – check it out if you haven't already! Also, thanks to Niels Rogge for adding DINOv2 w/ Registers to transformers! 🤗

Source code: github.com/huggingface/...

Online demo: huggingface.co/spaces/webml...
Attention Visualization - a Hugging Face Space by webml-community
Vision Transformer Attention Visualization
January 1, 2025 at 3:37 PM
Another interesting thing to see is how the attention maps become far more refined in later layers of the transformer. For example:

First layer (1) – noisy and diffuse, capturing broad general patterns.
Last layer (12) – focused and precise, highlighting specific features.
January 1, 2025 at 3:37 PM
Vision Transformers work by dividing images into fixed-size patches (e.g., 14 × 14), flattening each patch into a vector and treating each as a token.

It's fascinating to see what each attention head learns to "focus on". For example, layer 11, head 1 seems to identify eyes. Spooky! 👀
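As a rough illustration of the patching step, the sketch below computes the patch grid for an image and maps a token index back to its pixel rectangle, which is what a visualization needs in order to paint a per-patch attention score onto the image. The 224 × 224 input size and the helper names are assumptions for the example; only the 14 × 14 patch size comes from the text above, and the special tokens (CLS, registers) that real models prepend are ignored.

```typescript
interface PatchGrid { rows: number; cols: number; count: number; }

// Number of non-overlapping patches that fit in the image.
function patchGrid(width: number, height: number, patchSize: number): PatchGrid {
  const cols = Math.floor(width / patchSize);
  const rows = Math.floor(height / patchSize);
  return { rows, cols, count: rows * cols };
}

// Pixel rectangle covered by the patch at a given row-major token index.
function patchRect(index: number, grid: PatchGrid, patchSize: number) {
  const row = Math.floor(index / grid.cols);
  const col = index % grid.cols;
  return { x: col * patchSize, y: row * patchSize, width: patchSize, height: patchSize };
}

// Assumed 224 x 224 input: a 16 x 16 grid, i.e. 256 patch tokens.
const grid = patchGrid(224, 224, 14);
console.log(grid);

// Token 17 sits in the second row, second column: x = 14, y = 14.
console.log(patchRect(17, grid, 14));
```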
January 1, 2025 at 3:37 PM
The app loads a small DINOv2 model into the user's browser and runs it locally using Transformers.js! 🤗

This means you can analyze your own images for free: simply click the image to open the file dialog.

E.g., the model recognizes that long necks and fluffy ears are defining features of llamas! 🦙
January 1, 2025 at 3:37 PM
Yeah, I ran into this during development, and it is unfortunately a bug in Firefox:
- bugzilla.mozilla.org/show_bug.cgi...
- bugzilla.mozilla.org/show_bug.cgi...
1725336 - Error in AudioContext.createMediaStreamSource when AudioContext is constructed with a custom sample rate
UNCONFIRMED (nobody) in Core - Web Audio. Last updated 2023-12-29.
bugzilla.mozilla.org
December 18, 2024 at 5:14 PM
Huge shout-out to the Useful Sensors team for such an amazing model and to Wael Yasmina for his 3D audio visualizer tutorial! 🤗

💻 Source code: github.com/huggingface/...
🔗 Online demo: huggingface.co/spaces/webml...
Moonshine Web - a Hugging Face Space by webml-community
Real-time in-browser speech recognition
December 18, 2024 at 4:51 PM
Huge shout-out to OuteAI for their amazing model (OuteTTS-0.2-500M) and for helping us bring it to the web! 🤗 Together, we released the outetts NPM package, which you can install with `npm i outetts`.

💻 Source code: github.com/huggingface/...

🔗 Demo: huggingface.co/spaces/webml...
Text-to-Speech WebGPU - a Hugging Face Space by webml-community
WebGPU text-to-Speech powered by OuteTTS and Transformers.js
December 8, 2024 at 7:38 PM
The model is multilingual (English, Chinese, Korean & Japanese) and even supports zero-shot voice cloning! 🤯 Stay tuned for an update that will add these features to the UI!

More samples:
bsky.app/profile/reac...
vb (@reach-vb.hf.co) · Nov 25
Smol TTS keeps getting better! Introducing OuteTTS v0.2 - 500M parameters, multilingual with voice cloning! 🔥

> Multilingual - English, Chinese, Korean & Japanese
> Cross platform inference w/ llama.cpp
> Trained on 5 Billion audio tokens
> Qwen 2.5 0.5B LLM backbone
> Trained via HF GPU grants
December 8, 2024 at 7:38 PM