#webllm
WebLLM could do it for free... but I'm not sure about the latency 😋 results could vary, but still.
November 30, 2024 at 1:17 PM
As always, I had a great time at DWX! Here are my slides from my talk about local, offline-capable #GenerativeAI models via #WebLLM, #PromptAPI, #WebGPU, and #WebNN: www.thinktecture.com/contribution... #pwa #dwx2024
Generative AI Power on the Web: Making Progressive Web Apps Smarter
www.thinktecture.com
July 4, 2024 at 11:21 AM
Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM
Link: https://blog.kuzudb.com/post/kuzu-wasm-rag/
Comments: https://news.ycombinator.com/item?id=43321523
Posted on 2025-03-10 at 11:12:57
March 10, 2025 at 3:48 PM
READ PS 📖 new docs about AI, LLM and NLP: Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM https://blog.kuzudb.com/post/kuzu-wasm-rag/ #technews #NLP #AI #TechInnovation #techdocs #Alonso #autoScrap
March 11, 2025 at 1:43 PM
This is insane! Structured generation in the browser with the new @hf.co SmolLM2-1.7B model

• Tiny 1.7B LLM running at 88 tokens / second ⚡
• Powered by MLC/WebLLM on WebGPU 🔥
• JSON Structured Generation entirely in the browser 🤏
November 29, 2024 at 11:18 AM
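A rough sketch of what the structured-generation setup in the post above might look like, using WebLLM's OpenAI-compatible chat API. The model ID and the exact `response_format` shape are assumptions; check the WebLLM docs for your version.

```javascript
// Pure helper: builds a chat request that constrains output to a JSON schema.
function buildStructuredRequest(prompt, schema) {
  return {
    messages: [{ role: "user", content: prompt }],
    response_format: { type: "json_object", schema: JSON.stringify(schema) },
  };
}

async function main() {
  // Dynamic import so this file can also be loaded outside the browser.
  const webllm = await import("@mlc-ai/web-llm");
  const engine = await webllm.CreateMLCEngine("SmolLM2-1.7B-Instruct-q4f16_1-MLC");

  const schema = {
    type: "object",
    properties: { city: { type: "string" }, country: { type: "string" } },
    required: ["city", "country"],
  };
  const reply = await engine.chat.completions.create(
    buildStructuredRequest("Name a European capital as JSON.", schema)
  );
  console.log(JSON.parse(reply.choices[0].message.content));
}

// Only run where WebGPU is available (i.e. in a compatible browser).
if (typeof navigator !== "undefined" && navigator.gpu) {
  main();
}
```

Because generation is constrained by the schema, the reply parses as JSON directly instead of needing regex cleanup.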
Check out my three-part article series about the benefits of on-device large language models and learn how to add AI capabilities to your web apps: web.dev/articles/ai-... #GenAI #WebLLM #PromptAPI
Benefits and limits of large language models | Articles | web.dev
web.dev
January 13, 2025 at 4:38 PM
⚡ Hackernews Top story: Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM
Fully In-Browser Graph RAG with Kuzu-Wasm
We demonstrate a fully in-browser Graph RAG-based chatbot that uses Kuzu-Wasm and WebLLM. The chatbot answers natural language questions over your LinkedIn data. This post highlights the potential of fully-local knowledge graph-powered AI applications.
blog.kuzudb.com
March 10, 2025 at 11:36 PM
Running a Large Language Model fully offline in the browser? Yes, it’s real! No cloud, no API keys. Just WebLLM. 🤯
GitHub - mlc-ai/web-llm: High-performance In-browser LLM Inference Engine
High-performance in-browser LLM inference engine.
github.com
February 3, 2025 at 5:04 PM
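A minimal sketch of the fully offline setup described above: load a model with WebLLM and chat entirely in-browser, no server and no API key. The model ID below is an assumption; pick any ID from WebLLM's model list.

```javascript
// Pure helper: OpenAI-style request body, as accepted by WebLLM's chat API.
function buildChatRequest(prompt) {
  return { messages: [{ role: "user", content: prompt }] };
}

async function main() {
  // Dynamic import keeps this file loadable outside the browser.
  const webllm = await import("@mlc-ai/web-llm");
  // Weights are downloaded once, then served from the browser cache.
  const engine = await webllm.CreateMLCEngine(
    "Llama-3.1-8B-Instruct-q4f32_1-MLC",
    { initProgressCallback: (p) => console.log(p.text) } // download progress
  );
  const reply = await engine.chat.completions.create(
    buildChatRequest("Explain WebGPU in one sentence.")
  );
  console.log(reply.choices[0].message.content);
}

// Only run where WebGPU is available (i.e. in a compatible browser).
if (typeof navigator !== "undefined" && navigator.gpu) {
  main();
}
```

After the first download, subsequent loads come from cache, so the app keeps working offline.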
🚀 Excited about the potential of Generative AI in the browser? I recently experimented with 𝗪𝗲𝗯𝗟𝗟𝗠, a library that lets you run #AI models right in your browser using WebGPU. It's impressive and hasn't crashed yet, even on my old MacBook Pro! 💻

#aiTech #buildInPublic #webDev - 1/3
January 14, 2025 at 12:19 PM
LLMs are truly amazing technology
March 24, 2025 at 2:56 PM
Join Christian Liebel in a workshop:
✨ Bring Generative AI into your #Angular apps with WebLLM & Prompt #API
✨ Build local, offline-capable #AI features like image generation & #chatbots
✨ Code along and leave with real, working prototypes!

👉 https://f.mtr.cool/chooavxuas
September 16, 2025 at 8:30 AM
demo webpage, use the npm package, and follow the documentation to build their own web applications. The project is a companion to MLC LLM, which runs LLMs natively on various platforms. WebLLM offers various functionalities like streaming, JSON mode, function calling, and more. It supports (2/3)
May 5, 2024 at 10:51 PM
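The streaming mode mentioned above mirrors the OpenAI streaming API: pass `stream: true` and iterate the returned chunks. A hedged sketch (model ID is an assumption):

```javascript
// Pure helper: a chat request asking for incremental deltas instead of
// one final message.
function buildStreamingRequest(prompt) {
  return {
    messages: [{ role: "user", content: prompt }],
    stream: true,
  };
}

async function main() {
  const webllm = await import("@mlc-ai/web-llm");
  const engine = await webllm.CreateMLCEngine("Llama-3.1-8B-Instruct-q4f32_1-MLC");
  const chunks = await engine.chat.completions.create(
    buildStreamingRequest("Write a haiku about WebGPU.")
  );
  let text = "";
  for await (const chunk of chunks) {
    // Each chunk carries a partial delta of the assistant message.
    text += chunk.choices[0]?.delta?.content ?? "";
  }
  console.log(text);
}

// Only run where WebGPU is available (i.e. in a compatible browser).
if (typeof navigator !== "undefined" && navigator.gpu) {
  main();
}
```

Streaming matters more in-browser than on a server: tokens appear as they are generated, which hides most of the perceived latency of local inference.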
ICYMI: Local chatbot capabilities arrive for web applications, reducing data privacy concerns #Chatbot #DataPrivacy #WebApplications #WebLLM #BrowserAPIs
Local chatbot capabilities arrive for web applications, reducing data privacy concerns
WebLLM and new browser APIs enable offline-capable chatbots that keep sensitive data on users' devices while maintaining functionality.
ppc.land
January 16, 2025 at 9:56 PM
📝 Summary:

Secret Llama is a fully private chatbot that runs in the browser, supporting Llama 3, Mistral, and other open source models. It does not require a server or installation, works offline, and has an easy-to-use interface. It utilizes the webllm inference engine and requires a modern (1/2)
May 6, 2024 at 12:51 PM
READ HN 📖 news about AI, LLM and NLP: Show HN: In-Browser Graph RAG with Kuzu-WASM and WebLLM https://blog.kuzudb.com/post/kuzu-wasm-rag/ #technews #NLP #genAI #notebookLM
March 10, 2025 at 6:17 PM
WebGPU LLM: Build a No-Server, No-Keys Offline AI Chat in Your Browser (WebLLM Tutorial, GGUF…). No Server, No API Keys, No Backend.

#ai-on-device #technology #webllm #ai #programming

WebGPU LLM: Build a No-Server, No-Keys Offline AI Chat in Your Browser (WebLLM Tutorial, GGUF…
No Server, No API Keys, No Backend
medium.com
September 7, 2025 at 5:04 PM
Local chatbot capabilities arrive for web applications, reducing data privacy concerns #Chatbots #DataPrivacy #WebApplications #WebLLM #BrowserAPIs
Local chatbot capabilities arrive for web applications, reducing data privacy concerns
WebLLM and new browser APIs enable offline-capable chatbots that keep sensitive data on users' devices while maintaining functionality.
ppc.land
January 13, 2025 at 9:53 PM
Are there any applications out there using #WebLLM?
March 15, 2024 at 8:48 AM
📝 Summary:

WebLLM is a customizable JavaScript package that brings language model chats to web browsers with hardware acceleration, running without server support and compatible with the OpenAI API. It can be used to build AI assistants, offers privacy with GPU acceleration, and can be (1/3)
May 21, 2024 at 11:51 PM
That's pretty wild 😳. I just tried the new #Qwen3 0.6B model with #WebLLM for structured-output tool calling, and it works surprisingly well. With and without "thinking" enabled.
The model is just 335 MB 🤯
May 5, 2025 at 5:29 AM
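A hypothetical sketch of the tool-calling experiment above, via WebLLM's OpenAI-compatible `tools` field. The tool name, its schema, and the model ID are made up for illustration; whether a given model supports tool calling depends on the WebLLM version and model.

```javascript
// Pure helper: a chat request offering the model one hypothetical tool.
function buildToolRequest(prompt) {
  return {
    messages: [{ role: "user", content: prompt }],
    tools: [
      {
        type: "function",
        function: {
          name: "get_weather", // hypothetical tool for illustration
          description: "Get the current weather for a city",
          parameters: {
            type: "object",
            properties: { city: { type: "string" } },
            required: ["city"],
          },
        },
      },
    ],
  };
}

async function main() {
  const webllm = await import("@mlc-ai/web-llm");
  const engine = await webllm.CreateMLCEngine("Qwen3-0.6B-q4f16_1-MLC");
  const reply = await engine.chat.completions.create(
    buildToolRequest("What's the weather in Berlin?")
  );
  // If the model decided to call the tool, the structured call appears here.
  console.log(reply.choices[0].message.tool_calls);
}

// Only run where WebGPU is available (i.e. in a compatible browser).
if (typeof navigator !== "undefined" && navigator.gpu) {
  main();
}
```

The app then executes the returned tool call itself and feeds the result back as a `tool` message, exactly as with the server-side OpenAI API.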