saganite.bsky.social
@saganite.bsky.social
llm tinkerer. entropy cowboy. iconoclast.
I would like to share some work we've been doing at cascadetech.ai: Predicted Outputs (PO) in vLLM. If you aren't familiar with PO, it lets you dramatically speed up generation when you already know something about the contents of the output (think: code modification).
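For anyone who wants the intuition in code, here is a toy sketch of the general idea. It is not the vLLM implementation and not its API. It assumes greedy decoding, a hypothetical `next_token` callable standing in for the model, and a prediction that is aligned with the output by position. Each loop iteration stands in for one forward pass that checks a chunk of predicted tokens at once, keeps the matching prefix, and emits one model token on top.

```python
from typing import Callable, List, Tuple


def decode_with_prediction(
    next_token: Callable[[List[str]], str],  # hypothetical stand-in for a greedy model
    prompt: List[str],
    prediction: List[str],  # caller's guess at the output, aligned by position
    max_new: int,
    chunk: int = 4,  # how many predicted tokens one simulated pass verifies
) -> Tuple[List[str], int]:
    """Return (generated tokens, number of simulated forward passes)."""
    out: List[str] = []
    passes = 0
    while len(out) < max_new:
        passes += 1
        draft = prediction[len(out):len(out) + chunk]
        for guess in draft:
            if len(out) >= max_new:
                break
            actual = next_token(prompt + out)
            out.append(actual)
            if actual != guess:
                # Mismatch: the model's token replaces the guess and the
                # rest of this draft is discarded.
                break
        else:
            # Whole draft accepted (or no draft left): the same pass still
            # yields one extra token from the model.
            if len(out) < max_new:
                out.append(next_token(prompt + out))
    return out, passes


# Toy "model" that always emits a fixed target sequence.
target = list("abcdefghij")
prompt = ["<p>"]
model = lambda ctx: target[len(ctx) - len(prompt)]

# Prediction that is right everywhere except one token.
guess = list("abcXefghij")
with_po, po_passes = decode_with_prediction(model, prompt, guess, max_new=10)
baseline, base_passes = decode_with_prediction(model, prompt, [], max_new=10)
print(po_passes, base_passes)
```

In this toy run the output is identical either way, but the predicted version needs 3 simulated passes instead of 10. The closer the prediction is to the real output, the fewer passes are needed, which is why lightly edited code is a good fit.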
October 10, 2025 at 6:11 PM
November 24, 2024 at 10:54 PM