Alvaro Bartolome
@alvarobartt.bsky.social
machine learning @hf.co
For anyone interested in Zig, I wrote a small post titled "How to read and parse JSON with Zig 0.13" that explains how to read JSON from a file whose keys hold values of different types, and how to access those values.
February 10, 2025 at 4:15 PM
🤗 Here's a simple script that calculates the required VRAM for serving DeepSeek R1 from the @huggingface Hub safetensors metadata!
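
A minimal sketch of what such a script could look like, using huggingface_hub's get_safetensors_metadata; the bytes-per-dtype mapping and the ~18% serving-overhead multiplier are illustrative assumptions on my part, not necessarily what the original script used:

```python
from huggingface_hub import get_safetensors_metadata

# Approximate bytes per parameter for common safetensors dtypes
DTYPE_BYTES = {"F64": 8, "F32": 4, "F16": 2, "BF16": 2, "F8_E4M3": 1, "F8_E5M2": 1}

model_id = "deepseek-ai/DeepSeek-R1"

# Fetches the safetensors header metadata from the Hub without downloading the weights
metadata = get_safetensors_metadata(model_id)

# Raw weight size in GB, summed over every dtype present in the checkpoint
weights_gb = sum(
    count * DTYPE_BYTES.get(dtype, 0) for dtype, count in metadata.parameter_count.items()
) / 1024**3

# Assumed ~18% overhead on top of the raw weights (CUDA context, activations, etc.)
memory = weights_gb * 1.18
print(f"{model_id=} requires {memory=:.3f}GB")
```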

P.S. The result of the script above is: "model_id='deepseek-ai/DeepSeek-R1' requires memory=756.716GB"
January 31, 2025 at 4:04 PM
🐐 DeepSeek is not on the @hf.co Hub to take part, they are there to take over!

Amazing stuff from the DeepSeek team! ICYMI, they recently released their reasoning models, DeepSeek-R1 and DeepSeek-R1-Zero: fully open source, MIT licensed, and with performance on par with OpenAI-o1!
January 23, 2025 at 1:45 PM
Read more about the Serverless Inference API in the documentation!

https://huggingface.co/docs/api-inference
November 19, 2024 at 4:15 PM
🔥 Finally, if you want to get started quickly and experiment with LLMs, feel free to give the recently released Inference Playground a try!

https://huggingface.co/playground
November 19, 2024 at 4:15 PM
👨‍💻 Alternatively, you can use the Serverless Inference API programmatically via cURL, the huggingface_hub Python SDK, the openai SDK for chat completions, and much more!

Find all the alternatives at https://huggingface.co/docs/api-inference
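
For instance, here's a minimal sketch using the huggingface_hub SDK's InferenceClient; the model id and the HF_TOKEN environment variable are just illustrative placeholders:

```python
import os

from huggingface_hub import InferenceClient

# Any "Warm" chat model works here; the id below is just an example
client = InferenceClient(
    model="meta-llama/Llama-3.1-8B-Instruct",
    token=os.environ["HF_TOKEN"],  # the fine-grained token mentioned below
)

# chat_completion follows the familiar OpenAI-style message format
response = client.chat_completion(
    messages=[{"role": "user", "content": "What is the Serverless Inference API?"}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```

If you prefer the openai SDK, the same chat completion works by pointing its base_url at the Serverless API's OpenAI-compatible endpoint.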
November 19, 2024 at 4:15 PM
🔎 Now let's explore some of the different alternatives to run inference via the Serverless API!

The most straightforward one is the Hugging Face Hub itself: you can try models directly from the widget on the model card of any Serverless API supported model!
November 19, 2024 at 4:15 PM
🔒 Before going on, you will first need to generate a Hugging Face fine-grained token with access to the Serverless API, as the requests need to be authenticated. Keep the token safe and avoid exposing it!
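
Once generated, here's a sketch of how the token is used to authenticate a raw request against the Serverless API; storing it in the HF_TOKEN environment variable is just a hypothetical setup:

```python
import os

import requests

# Read the fine-grained token from the environment instead of hard-coding it
headers = {"Authorization": f"Bearer {os.environ['HF_TOKEN']}"}

# Illustrative model id; any Serverless API supported model works
API_URL = "https://api-inference.huggingface.co/models/openai-community/gpt2"

response = requests.post(API_URL, headers=headers, json={"inputs": "Hello, world!"})
print(response.json())
```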
November 19, 2024 at 4:15 PM
❄️ Additionally, there are a bunch of models (around 1,000) tagged "Cold", meaning they are not currently loaded in the Serverless API but will be loaded on demand when a request is sent to them; this also means the first request may take a while until the model is loaded!
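
A quick way to check whether a given model is currently loaded before sending requests, assuming huggingface_hub's InferenceClient.get_model_status:

```python
from huggingface_hub import InferenceClient

client = InferenceClient()

# Illustrative model id; a "Cold" model reports loaded=False until it's spun up
status = client.get_model_status("mistralai/Mistral-7B-Instruct-v0.3")
print(f"loaded={status.loaded}, state={status.state}")
```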
November 19, 2024 at 4:15 PM
You may be wondering: out of the over one million models publicly available on the Hub, how do you know which ones can be used via the Serverless API?

Well, we have the "Warm" tag, which indicates that a model is loaded in the Serverless API and ready to be used 🔥
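
Programmatically, here's a sketch of how those "Warm" models could be listed, assuming the inference filter of huggingface_hub's HfApi.list_models:

```python
from huggingface_hub import HfApi

api = HfApi()

# `inference="warm"` keeps only models currently loaded in the Serverless API;
# the pipeline_tag and limit are just to keep the example small
for model in api.list_models(inference="warm", pipeline_tag="text-generation", limit=10):
    print(model.id)
```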
November 19, 2024 at 4:15 PM
💡 Did you know that you can use over 13700 public open models and adapters on the @huggingface Hub for FREE?

You just need a free account on the Hugging Face Hub (you can also subscribe to PRO to increase your requests-per-hour limit)

More details on the thread 🧵
November 19, 2024 at 4:15 PM