# WebLLM
Today's Zenn trends

What's so great about WebLLM (an LLM running on Wasm)? A thorough comparison of three LLM execution environments: local browser, local native, and cloud
WebLLM is a technology that uses WebAssembly and WebGPU to run LLMs directly inside the browser.
This combines complete privacy protection, since prompts are never sent to an external server, with the convenience of needing no installation.
Compared with cloud or local-native setups it is constrained in model size and output quality, but as a new option that offers a secure environment with little effort, further development of the technology is anticipated.
Recently, as privacy has become more important, demand for LLMs that run in local environments has grown, and WebLLM has emerged as one of the options. But if privacy is the goal, wouldn't Ollama be enough? This article digs into that question! What is WebLLM: WebLLM is a JavaScript library that lets you run LLMs directly in a web browser. Multiple ready-to-use LLM models are provided
zenn.dev
November 1, 2025 at 9:12 AM
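The in-browser setup these posts describe can be sketched with the `@mlc-ai/web-llm` JavaScript API. A minimal example, assuming a WebGPU-capable browser; the model id below is illustrative (pick any prebuilt id the library ships):

```javascript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// First run downloads the quantized weights (cached by the browser)
// and compiles WebGPU kernels; later loads come from the cache.
const engine = await CreateMLCEngine("Llama-3.2-1B-Instruct-q4f16_1-MLC", {
  initProgressCallback: (report) => console.log(report.text),
});

// OpenAI-style chat completion, fully client-side:
// the prompt never leaves the machine.
const reply = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Explain WebGPU in one sentence." }],
});
console.log(reply.choices[0].message.content);
```

This only runs in a browser context with WebGPU enabled, which is exactly the trade-off the comparison articles discuss: zero install and zero server, but hardware- and browser-dependent.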
What's so great about WebLLM (an LLM running on Wasm)? A thorough comparison of three LLM execution environments: local browser, local native, and cloud
https://zenn.dev/srefin/articles/17ba278f402b5d
zenn.dev
November 1, 2025 at 4:36 AM
Show HN: WebLLM and WebGPU enabled LLM app – CodexLocal. An LLM assistant that runs entirely in your browser, all web-based. Comments URL: https://news.ycombinator.com/item?id=45520343 Points: 2 # Comments: 0

codexlocal.com
October 8, 2025 at 8:55 PM
💡 Bring #GenerativeAI to your #Angular apps—locally & offline!
Join Christian Liebel’s full-day workshop at #iJS Munich on Oct 27 and code along with WebLLM & Prompt API.
👉 Don’t miss it!
🔗 https://f.mtr.cool/skpqsrpkii
September 30, 2025 at 9:41 AM
Join Christian Liebel in a workshop:
✨ Bring Generative AI into your #Angular apps with WebLLM & Prompt #API
✨ Build local, offline-capable #AI features like image generation & #chatbots
✨ Code along and leave with real, working prototypes!

👉 https://f.mtr.cool/chooavxuas
September 16, 2025 at 8:30 AM
This is an experimental project that leverages WebLLM to run a small language model (~1GB) entirely in the browser. It lets employees chat with their department's knowledge base without any server.

Just testing new ideas that come to my mind
September 9, 2025 at 11:13 AM
WebGPU LLM: Build a No-Server, No-Keys Offline AI Chat in Your Browser (WebLLM Tutorial, GGUF… No Server, No API Keys, No Backend

medium.com
September 7, 2025 at 5:04 PM
What if AI agents ran entirely in your browser?

Baris Guler takes that idea seriously in our first-ever guest post on the Mozilla.ai blog:
🧱 WebLLM + WASM + WebWorkers
💻 Rust, Go, Python, JS
🔒 Fully local. No API calls.

Read the post here:
blog.mozilla.ai/3w-for-in-br...
3W for In-Browser AI: WebLLM + WASM + WebWorkers
🤝This is the first "guest post" in Mozilla.ai's blog (congratulations Baris!). His experiment, built upon the ideas of Mozilla.ai’s WASM agents blueprint, extends the concept of in-browser agents with...
blog.mozilla.ai
August 29, 2025 at 3:35 AM
I've made a local AI chatbot that runs entirely in your browser on your device
It's a chatbot pretending to be a human pretending to be an artist pretending to be a crayfish. No data gets sent anywhere - everything happens locally using Phi-3.5-mini in your browser using WebLLM technology.
Projects | Kristoffer Ørum
Portfolio of artist Kristoffer Ørum
oerum.org
August 26, 2025 at 5:33 PM
Trying to load WebLLM on the phone?

tty.wtf#%23+Loading+...
August 20, 2025 at 8:05 AM
Between #WASM, #DuckDB, #WebLLM, and some amazing progress on SQL specific LLMs, this could be an interesting era of lots of questions answered, and some answered correctly. Context truly is king.
July 7, 2025 at 11:21 PM
The era of local LLMs is coming. WebLLM apps that let you have erotic chats with a model on your phone, with no network traffic at all, feel like they're about to appear, if they don't already exist. Eventually there will probably be models that can generate erotic images and audio right on the phone. When that happens, I feel like both civilization and AI will move to the next stage, the way adult video changed the internet.
June 27, 2025 at 11:37 AM
WebLLM* when?

* As part of the browser's API, obviously
Gemma 3n: the 4B LLM that's up there with Sonnet 3.7 in Chatbot Arena

the new innovation is Per-Layer Embeddings, which let it consume dramatically less memory

it was created for phones, and is being rolled out to Android phones soon

developers.googleblog.com/en/introduci...
June 26, 2025 at 10:54 PM
that leverages zero-shot inference by a local large language model (LLM) running entirely in-browser. Our system uses a compact LLM (e.g., 3B/8B parameters) via WebLLM to perform reasoning over rich context collected from the target webpage, including [3/8 of https://arxiv.org/abs/2506.03656v1]
June 5, 2025 at 5:55 AM
Shared some Jupyter Frontends demos last week at Jupyter Open Studio Day (hosted by Bloomberg)!

Covered:
✨ JupyterLab 4.4, Notebook 7.4
🧪 JupyterLite 0.6
🌐 In-browser Python/R
🖥️ Terminal w/ Vim
🧠 AI (WebLLM, on-device)
⚡ Hybrid kernels

Thanks Bloomberg & all who joined!

youtu.be/7kS_xfKEOmM
Jupyter Frontends & JupyterLite updates | Jupyter Open Studio Day 2025
YouTube video by Jeremy Tuloup
youtu.be
May 28, 2025 at 2:50 PM
That's pretty wild😳. I just tried the new #Qwen3 0.6B model with #WebLLM for structured-output tool-calling. And it works surprisingly well. With and without "thinking" enabled.
The model is just 335MB🤯
May 5, 2025 at 5:29 AM
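The structured-output tool-calling in these posts follows the OpenAI chat-completions convention that WebLLM's `engine.chat.completions.create` exposes. A minimal sketch of the request and response shapes in plain JavaScript; the weather tool, its schema, and the sample response are made up for illustration:

```javascript
// Tool definitions are passed alongside the messages in the request.
const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get the current weather for a city",
      parameters: {
        type: "object",
        properties: { city: { type: "string" } },
        required: ["city"],
      },
    },
  },
];

// A model that decides to call a tool answers with a tool_calls array
// instead of plain text; the arguments arrive as a JSON string.
function parseToolCall(message) {
  const call = message.tool_calls?.[0];
  if (!call) return null;
  return { name: call.function.name, args: JSON.parse(call.function.arguments) };
}

// Example response message, shaped like choices[0].message in the
// chat-completions reply:
const assistantMessage = {
  role: "assistant",
  tool_calls: [
    {
      id: "call_0",
      type: "function",
      function: { name: "get_weather", arguments: '{"city":"Munich"}' },
    },
  ],
};

console.log(parseToolCall(assistantMessage));
// → { name: 'get_weather', args: { city: 'Munich' } }
```

The app then runs the named function locally and feeds the result back as a `role: "tool"` message, so the whole loop stays in the browser.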
Trying out Function calling with a fully in-browser local LLM using WebLLM
https://zenn.dev/tesla/articles/df51e31d54f834
zenn.dev
April 25, 2025 at 3:22 AM
Imagine deploying an LLM with zero server costs. Browser-based AI is here with WebLLM, ONNX.js, and Gemini Nano. Learn the tradeoffs and techniques for bringing sophisticated AI directly to your web applications.

From @ryanseddon.com at Code 25 bit.ly/3DpL8I1
April 14, 2025 at 10:52 PM
LLMs are truly amazing technology
March 24, 2025 at 2:56 PM
Imagine deploying an LLM with zero server costs. Browser-based AI is here with WebLLM, ONNX.js, and Gemini Nano. Learn the tradeoffs and techniques for bringing sophisticated AI directly to your web applications. At Code 25 we'll hear from @ryanseddon.com
Web Directions Code 2024
With models like Gemini Nano running entirely on-device, the web is entering a new era of AI-powered applications that don’t require servers.
bit.ly
March 20, 2025 at 12:48 AM
Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM
A new way to run Retrieval-Augmented Generation (RAG) models in the browser, leveraging WebAssembly and LLMs. This could make AI-powered apps faster and more efficient.
🔗 blog.kuzudb.com/post/kuzu-wa...
#WebAssembly #AI #LLM
Blog - Kuzu
Kuzu is a highly scalable, extremely fast, easy-to-use embeddable graph database
blog.kuzudb.com
March 13, 2025 at 5:10 PM
READ PS 📖 new docs about AI, LLM and NLP: Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM https://blog.kuzudb.com/post/kuzu-wasm-rag/ #technews #NLP #AI #TechInnovation #techdocs #Alonso #autoScrap
March 11, 2025 at 1:43 PM
https://blog.kuzudb.com/post/kuzu-wasm-rag/
It presents a case study of building Graph RAG inside the browser using Kuzu-Wasm and WebLLM.
A chatbot that answers questions over LinkedIn data is built by combining a graph database with an LLM.
It suggests that advances in WebAssembly have made sophisticated AI applications feasible entirely in the browser.
Fully In-Browser Graph RAG with Kuzu-Wasm
We demonstrate a fully in-browser Graph RAG-based chatbot that uses Kuzu-Wasm and WebLLM. The chatbot answers natural language questions over your LinkedIn data. This post highlights the potential of fully-local knowledge graph-powered AI applications.
blog.kuzudb.com
March 11, 2025 at 12:12 PM
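The core of such a fully in-browser Graph RAG flow is the prompt-assembly step: rows retrieved from the local graph database are serialized into the context of a question for the local model. A sketch in plain JavaScript; the hard-coded rows stand in for the results of a real Kuzu-Wasm Cypher query, and the field names and helper are hypothetical:

```javascript
// Turn retrieved graph rows into a grounded chat request for the
// in-browser model (e.g. via WebLLM's chat.completions API).
function buildRagPrompt(question, rows) {
  const context = rows
    .map((r) => `- ${r.person} worked at ${r.company} as ${r.title}`)
    .join("\n");
  return [
    { role: "system", content: "Answer using only the facts below.\n" + context },
    { role: "user", content: question },
  ];
}

// Stand-in for rows returned by a local graph query.
const rows = [
  { person: "Alice", company: "Acme", title: "Engineer" },
  { person: "Bob", company: "Globex", title: "Analyst" },
];

const messages = buildRagPrompt("Where did Alice work?", rows);
console.log(messages[0].content);
```

Because retrieval, the graph store, and the model all live in the browser, the LinkedIn data in the demo never leaves the user's machine.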
Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM https://blog.kuzudb.com/post/kuzu-wasm-rag/

kuzu-wasm… honestly, I don't think that's a great name
March 11, 2025 at 2:42 AM