Erik
erikkaum.bsky.social
SWE @hf.co
There’s some deep wisdom in that as well!
December 8, 2024 at 2:38 PM
Exactly.

Suppose we have an algorithm that is guaranteed to give output according to a structure, with the caveat that it might run out of tokens.

Should this still be classified as structured generation?
December 6, 2024 at 9:42 AM
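To make the question above concrete, here is a minimal sketch of constrained decoding with a token budget. The toy grammar (a bracketed list of digits), the function names, and the `pick` callback standing in for a model are all assumptions made for illustration; none of this comes from a real structured-generation library.

```python
from typing import Callable

DIGITS = list("0123456789")


def allowed_next(prefix: str) -> list[str]:
    """Tokens the toy grammar permits after `prefix` (e.g. "[1,2]")."""
    if prefix == "":
        return ["["]
    if prefix.endswith("]"):
        return []  # structure is complete, nothing may follow
    last = prefix[-1]
    if last == "[":
        return DIGITS + ["]"]
    if last.isdigit():
        return [",", "]"]
    if last == ",":
        return DIGITS
    raise ValueError(f"prefix is not valid under the grammar: {prefix!r}")


def generate(max_tokens: int, pick: Callable[[list[str]], str]) -> tuple[str, bool]:
    """Sample only grammar-allowed tokens, stopping at the token budget.

    Returns the text and whether it is a complete structure.
    """
    out = ""
    for _ in range(max_tokens):
        options = allowed_next(out)
        if not options:
            break
        out += pick(options)
    return out, out.endswith("]")


# A "model" that never chooses the closing bracket runs out of budget:
print(generate(6, lambda opts: opts[0]))   # ('[0,0,0', False)
# A "model" that closes as soon as it can finishes well within budget:
print(generate(6, lambda opts: opts[-1]))  # ('[]', True)
```

In the first call every emitted token was allowed by the grammar, yet the result is only a valid prefix and not a complete structure, which is exactly the edge case being asked about.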
🤔
December 5, 2024 at 8:22 PM
CUDA libraries..? So they have access to GPUs as well? 👀
December 5, 2024 at 7:33 PM
A video series on how to develop, profile and compare CUDA kernels would be such a banger.

And allow a lot more tinkerers to enter the field.
December 5, 2024 at 7:32 PM
Hell yeah 🔥

How would you classify the edge case when running out of tokens?

E.g., if it goes into a "\n" loop and runs out of tokens.
December 5, 2024 at 7:29 PM
Hah, fair!
December 1, 2024 at 11:01 AM
Interesting, for me it's snappy as hell, maybe things aren't cached as well in Costa Rica? 🤔
December 1, 2024 at 10:15 AM
pro tip for the borrow-checker, using .clone() everywhere is okay 🙌
December 1, 2024 at 10:13 AM
Or you can let the model run free in a constrained environment.

I’m tinkering on this: bsky.app/profile/erik...
A while ago I started experimenting with compiling the Python interpreter to WASM.

The goal: a secure, fast, and lightweight sandbox for code execution, ideal for running LLM-generated Python code.

- Send code simply as a POST request
- 1-2ms startup times
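
Sending code as a POST request could look roughly like the sketch below. The URL, the port, the `/run` path, and the plain-text request body are all hypothetical placeholders, not the runner project's documented API; check the repository for the real endpoint and payload format.

```python
import urllib.request

# Hypothetical endpoint: host, port, and path are assumptions for illustration.
RUNNER_URL = "http://localhost:3000/run"


def build_run_request(code: str, url: str = RUNNER_URL) -> urllib.request.Request:
    """Build a POST request carrying the Python source as the raw body.

    Sending the code as plain text is an assumption; the real service may
    expect JSON or another format.
    """
    return urllib.request.Request(
        url,
        data=code.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def run(code: str, url: str = RUNNER_URL, timeout: float = 5.0) -> str:
    """Send the code to the sandbox and return the response body as text."""
    with urllib.request.urlopen(build_run_request(code, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `run("print(1 + 1)")` would only work against a server that is actually listening at the assumed address.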

github.com/ErikKaum/run...
GitHub - ErikKaum/runner: Experimental wasm32-unknown-wasi runtime for Python code execution
November 27, 2024 at 3:53 PM
Nice! This is so neat 🙌🏽
November 26, 2024 at 10:23 PM