Lightnews — Scholar-powered news

Xenova

@xenova.bsky.social

As always, the demo is open source (which you can find under the "Files" tab), so I'm excited to see how the community builds upon this! 🚀

🔗 Link to demo: huggingface.co/spaces/Liqui...

LFM2 WebGPU – In-browser tool calling - a Hugging Face Space by LiquidAI

In-browser tool calling, powered by Transformers.js

huggingface.co

August 6, 2025 at 5:56 PM

Xenova

@xenova.bsky.social

That's right, we're running Mistral's new Voxtral-Mini-3B model 100% locally in-browser on WebGPU, powered by Transformers.js and ONNX Runtime Web! 🔥

Try it out yourself! 👇
huggingface.co/spaces/webml...

Voxtral WebGPU - a Hugging Face Space by webml-community

State-of-the-art audio transcription in your browser

huggingface.co

July 24, 2025 at 3:43 PM

Xenova

@xenova.bsky.social

Model: huggingface.co/lazy-guy12/c...

Online demo: lazy-guy.github.io/chess-llama/

lazy-guy12/chess-llama · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

July 22, 2025 at 7:00 PM

Xenova

@xenova.bsky.social

The most difficult part was getting the model running in the first place, but the next steps are simple:
✂️ Implement sentence splitting, enabling streamed responses
🌍 Multilingual support (only phonemization left)
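
The sentence-splitting step could look roughly like the sketch below. This is not the demo's implementation: `splitSentences` is a hypothetical helper, and the regex is a naive splitter that will mis-handle abbreviations such as "Dr.". The point is only that each sentence can be handed to the synthesizer as soon as it is produced, rather than waiting for the whole text.

```javascript
// Naive sentence splitter (illustrative assumption, not the demo's code).
// Yields one sentence at a time so that speech for sentence N can be
// generated and played while sentence N+1 is still being processed.
function* splitSentences(text) {
  // A run of non-terminators followed by terminators (and optional closing
  // quotes/brackets), or a trailing fragment with no terminator.
  const pattern = /[^.!?]+[.!?]+["')\]]*|[^.!?]+$/g;
  for (const match of text.matchAll(pattern)) {
    const sentence = match[0].trim();
    if (sentence.length > 0) yield sentence;
  }
}

const input = "Life is like a box of chocolates. You never know what you're gonna get.";
for (const sentence of splitSentences(input)) {
  // In a real app, each sentence would be passed to the TTS model here.
  console.log(sentence);
}
```

A more robust splitter could be built on the standard `Intl.Segmenter` API with `granularity: "sentence"`, which handles more punctuation and language conventions than a hand-written regex.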

Who wants to help? 🤗
huggingface.co/spaces/webml...

Kokoro Text-to-Speech (WebGPU) - a Hugging Face Space by webml-community

High-quality speech synthesis powered by Kokoro TTS

huggingface.co

February 7, 2025 at 5:03 PM

Xenova

@xenova.bsky.social

The model is also extremely resilient to quantization. The smallest variant is only 86 MB in size (down from the original 326 MB), with no noticeable difference in audio quality! 🤯
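
For readers unfamiliar with quantization, here is a minimal sketch of the general idea, assuming a simple symmetric per-tensor 8-bit scheme. It is not necessarily how the Kokoro ONNX variants were produced, and `quantizeInt8` / `dequantizeInt8` are hypothetical names. Storing each weight in 1 byte instead of 4 gives a 4× reduction for the weights themselves, which is in the same ballpark as the 326 MB → 86 MB (about 3.8×) figure quoted above.

```javascript
// Symmetric per-tensor int8 quantization (illustrative assumption only).
function quantizeInt8(weights) {
  // The largest magnitude determines the scale so that it maps to ±127.
  let maxAbs = 0;
  for (const w of weights) maxAbs = Math.max(maxAbs, Math.abs(w));
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;

  const q = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; ++i) {
    const rounded = Math.round(weights[i] / scale);
    q[i] = Math.max(-127, Math.min(127, rounded));
  }
  return { q, scale };
}

// Recover approximate float weights from the int8 values and the scale.
function dequantizeInt8({ q, scale }) {
  return Float32Array.from(q, (v) => v * scale);
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.0]);
const packed = quantizeInt8(weights);
const restored = dequantizeInt8(packed);

console.log(weights.byteLength, packed.q.byteLength); // 16 4
console.log(Array.from(restored));
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is why a model that tolerates small weight perturbations can sound the same after quantization.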

Link to models/samples: huggingface.co/onnx-communi...

January 16, 2025 at 3:05 PM

Xenova

@xenova.bsky.social

You can get started in just a few lines of code! 🧑‍💻

Huge kudos to the Kokoro TTS community, especially taylorchu for the ONNX exports and Hexgrad for the amazing project! None of this would be possible without you all! 🤗

Try it out yourself: huggingface.co/spaces/webml...

    import { KokoroTTS } from "kokoro-js";

    const tts = await KokoroTTS.from_pretrained(
      "onnx-community/Kokoro-82M-ONNX",
      { dtype: "q8" }, // fp32, fp16, q8, q4, q4f16
    );

    const text = "Life is like a box of chocolates. You never know what you're gonna get.";
    const audio = await tts.generate(
      text,
      { voice: "af_sky" }, // See `tts.list_voices()`
    );
    audio.save("audio.wav");

January 16, 2025 at 3:05 PM

Xenova

@xenova.bsky.social

For the AI builders out there: imagine what could be achieved with a browser extension that (1) uses a powerful reasoning LLM, (2) runs 100% locally & privately, and (3) can directly access/manipulate the DOM! 👀

💻 Source code: github.com/huggingface/...
🔗 Online demo: huggingface.co/spaces/webml...

Llama 3.2 Reasoning WebGPU - a Hugging Face Space by webml-community

Small and powerful reasoning LLM that runs in your browser

huggingface.co

January 10, 2025 at 12:19 PM

Xenova

@xenova.bsky.social

This project was greatly inspired by Brendan Bycroft's amazing LLM Visualization tool – check it out if you haven't already! Also, thanks to Niels Rogge for adding DINOv2 w/ Registers to transformers! 🤗

Source code: github.com/huggingface/...

Online demo: huggingface.co/spaces/webml...

Attention Visualization - a Hugging Face Space by webml-community

Vision Transformer Attention Visualization

huggingface.co

January 1, 2025 at 3:37 PM

Xenova

@xenova.bsky.social

Another interesting thing to see is how the attention maps become far more refined in later layers of the transformer. For example:

First layer (1) – noisy and diffuse, capturing broad general patterns.
Last layer (12) – focused and precise, highlighting specific features.
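
As background, each attention map is a set of softmax weights computed per head. The sketch below shows how one row of such a map arises under standard scaled dot-product attention; the vectors are made-up toy values and `attentionRow` is a hypothetical helper, not code from the demo. A "diffuse" map corresponds to near-uniform weights, and a "focused" one to weights concentrated on a few keys.

```javascript
// Numerically stable softmax over an array of scores.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Attention weights of one query token over all key tokens:
// softmax(q · k_i / sqrt(d)) for each key k_i.
function attentionRow(query, keys) {
  const scale = 1 / Math.sqrt(query.length);
  const scores = keys.map(
    (key) => key.reduce((acc, k, j) => acc + k * query[j], 0) * scale,
  );
  return softmax(scores);
}

// Toy example: the query aligns with the first key and opposes the third.
const query = [1, 0];
const keys = [[1, 0], [0, 1], [-1, 0]];
const attn = attentionRow(query, keys);
console.log(attn); // weights sum to 1; the first key receives the largest weight
```

In a Vision Transformer, each key corresponds to an image patch, so reshaping one row of weights back onto the patch grid gives the kind of heat map shown in the visualizer.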

January 1, 2025 at 3:37 PM

Xenova

@xenova.bsky.social

Vision Transformers work by dividing images into fixed-size patches (e.g., 14 × 14), flattening each patch into a vector and treating each as a token.
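
The patching step can be sketched as follows. This is an illustration under stated assumptions, not the model's actual preprocessing: the 224 × 224 RGB input size is assumed, `patchify` is a hypothetical helper, and a real ViT additionally applies a learned linear projection (and positional information) to each flattened patch.

```javascript
// Split a channels-last image (flat, row-major array) into non-overlapping
// patchSize × patchSize patches, flattening each patch into one vector.
function patchify(pixels, height, width, channels, patchSize) {
  if (height % patchSize !== 0 || width % patchSize !== 0) {
    throw new Error("image dimensions must be divisible by the patch size");
  }
  const rows = height / patchSize;
  const cols = width / patchSize;
  const patches = [];
  for (let pr = 0; pr < rows; ++pr) {
    for (let pc = 0; pc < cols; ++pc) {
      const patch = new Float32Array(patchSize * patchSize * channels);
      let k = 0;
      for (let y = 0; y < patchSize; ++y) {
        for (let x = 0; x < patchSize; ++x) {
          // Index of the first channel of pixel (row, col) in the source image.
          const base = ((pr * patchSize + y) * width + (pc * patchSize + x)) * channels;
          for (let c = 0; c < channels; ++c) patch[k++] = pixels[base + c];
        }
      }
      patches.push(patch);
    }
  }
  return patches;
}

// Assumed input: a 224 × 224 RGB image with 14 × 14 patches.
const image = new Float32Array(224 * 224 * 3);
const tokens = patchify(image, 224, 224, 3, 14);
console.log(tokens.length, tokens[0].length); // 256 588
```

With these assumed dimensions, the image becomes a 16 × 16 grid of 256 patch tokens, each a vector of 14 × 14 × 3 = 588 values.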

It's fascinating to see what each attention head learns to "focus on". For example, layer 11, head 1 seems to identify eyes. Spooky! 👀

January 1, 2025 at 3:37 PM

Xenova

@xenova.bsky.social

The app loads a small DINOv2 model into the user's browser and runs it locally using Transformers.js! 🤗

This means you can analyze your own images for free: simply click the image to open the file dialog.

E.g., the model recognizes that long necks and fluffy ears are defining features of llamas! 🦙

January 1, 2025 at 3:37 PM

Xenova

@xenova.bsky.social

Yeah, I ran into this during development, and it is unfortunately a bug in Firefox:
- bugzilla.mozilla.org/show_bug.cgi...
- bugzilla.mozilla.org/show_bug.cgi...

1725336 - Error in AudioContext.createMediaStreamSource when AudioContext is constructed with a custom sample rate

UNCONFIRMED (nobody) in Core - Web Audio. Last updated 2023-12-29.

bugzilla.mozilla.org

December 18, 2024 at 5:14 PM

Xenova

@xenova.bsky.social

Huge shout-out to the Useful Sensors team for such an amazing model and to Wael Yasmina for his 3D audio visualizer tutorial! 🤗

💻 Source code: github.com/huggingface/...
🔗 Online demo: huggingface.co/spaces/webml...

Moonshine Web - a Hugging Face Space by webml-community

Real-time in-browser speech recognition

huggingface.co

December 18, 2024 at 4:51 PM

Xenova

@xenova.bsky.social

Huge shout-out to OuteAI for their amazing model (OuteTTS-0.2-500M) and for helping us bring it to the web! 🤗 Together, we released the outetts NPM package, which you can install with `npm i outetts`.

💻 Source code: github.com/huggingface/...

🔗 Demo: huggingface.co/spaces/webml...

Text-to-Speech WebGPU - a Hugging Face Space by webml-community

WebGPU text-to-Speech powered by OuteTTS and Transformers.js

huggingface.co

December 8, 2024 at 7:38 PM

Xenova

@xenova.bsky.social

The model is multilingual (English, Chinese, Korean & Japanese) and even supports zero-shot voice cloning! 🤯 Stay tuned for an update that will add these features to the UI!

More samples:
bsky.app/profile/reac...

vb @reach-vb.hf.co · Nov 25

Smol TTS keeps getting better! Introducing OuteTTS v0.2 - 500M parameters, multilingual with voice cloning! 🔥

> Multilingual - English, Chinese, Korean & Japanese
> Cross platform inference w/ llama.cpp
> Trained on 5 Billion audio tokens
> Qwen 2.5 0.5B LLM backbone
> Trained via HF GPU grants

December 8, 2024 at 7:38 PM
