@jkantarek.bsky.social
Ruby engineer and Chicago native
lmstudio.ai on commodity hardware (e.g. a Ryzen 3 2200G from 2018) can serve a very passable 10-13 tokens per second. It seems like Vulkan-accelerated llama.cpp just _works_ out of the box! If you have an old box lying around, upgrade its RAM and play around!
December 22, 2024 at 3:25 PM
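If you want to check your own throughput, LM Studio can run a local OpenAI-compatible server, and a short script can time a completion and report tokens per second. The sketch below assumes LM Studio's default port (1234) and uses a placeholder model name; adjust both to match whatever model you have loaded.

```python
# Rough sketch: measure tokens/sec against LM Studio's local server.
# Assumes the server is running on its default port (1234) and a model is loaded.
import time
import requests

URL = "http://localhost:1234/v1/chat/completions"  # assumed LM Studio default endpoint

payload = {
    "model": "local-model",  # placeholder; LM Studio serves whichever model is loaded
    "messages": [{"role": "user", "content": "Explain Vulkan in one paragraph."}],
    "max_tokens": 256,
    "stream": False,
}

start = time.time()
resp = requests.post(URL, json=payload, timeout=300)
elapsed = time.time() - start

data = resp.json()
# OpenAI-compatible responses report token counts under "usage".
completion_tokens = data["usage"]["completion_tokens"]
print(f"{completion_tokens} tokens in {elapsed:.1f}s "
      f"~ {completion_tokens / elapsed:.1f} tokens/sec")
```

Numbers will vary with quantization, context length, and how many layers fit on the GPU, so treat the 10-13 tokens/sec figure as a ballpark for this class of hardware.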