Wietse Venema
wietsevenema.eu
@wietsevenema.eu
Engineer at Google. O'Reilly author.
Deploying and managing open models with Hugging Face TGI on GKE is great, but how do you autoscale them? In this video with @boredabdel.bsky.social, we show you how to use TGI queue size as a scaling signal.

Video 👉 www.youtube.com/watch?v=QjLZ...

Playlist 👉 www.youtube.com/playlist?lis...
November 26, 2024 at 8:03 PM
I'm really looking forward to DevFest Berlin this weekend. Run open LLMs like Gemma & Llama on Google Cloud Run with Hugging Face TGI & TRL for cost-effective inference. Still a few tickets available! gdg.community.dev/events/detai...
November 19, 2024 at 2:28 PM