Torlando
@torlando.tech
Ooo, I did not know about this. Can't wait to try it on self-hosted when it's ready
January 22, 2026 at 6:15 PM
Don't lose your waaaaaaay
January 22, 2026 at 3:10 PM
youtu.be/Hdg7zL3pcIs?...

TL;DW, for stability (quick version-check sketch below):
- linux-firmware-2026010 or newer
- kernel 6.18.4 or newer
- rocm-7-nightly, or 7.2 when released
ROCm+Linux Support on Strix Halo: It's finally stable in 2026! (YouTube video by Donato Capitella)
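
If anyone wants to sanity-check before rebooting into chaos, here's a tiny hypothetical Python sketch; the 6.18.4 threshold is from the list above, and the parsing is deliberately naive, not any official tooling:

    # Hypothetical sketch: compare the running kernel release against the
    # 6.18.4 minimum from the list above (major.minor.patch only).
    import platform

    def parse(v):
        # "6.18.4-arch1-1" -> [6, 18, 4]
        return [int(x) for x in v.split("-")[0].split(".")[:3]]

    kernel = platform.release()
    print(kernel, "ok:", parse(kernel) >= parse("6.18.4"))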
January 22, 2026 at 6:49 AM
Letta Desktop, Linux x64: when trying to create a new conversation, it shows an empty error. The GET for the conversation list seems to work; ADE-created conversations show up in the dropdown, but clicking on them does nothing.

Works in web ADE though :)
January 22, 2026 at 5:17 AM
Does it announce to a publicly reachable interface? I fired up my client and left it on for a while but the only server it heard was a CascaBBS or something
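
For anyone else listening along, assuming this is Reticulum, a rough sketch of just logging whatever announces your interfaces can hear, going from memory of the upstream RNS examples (double-check the names against the docs):

    # Rough sketch with Reticulum's Python API (rns): log every announce
    # heard on the configured interfaces.
    import RNS

    class LogAnnounces:
        aspect_filter = None  # None = receive announces for all aspects

        def received_announce(self, destination_hash, announced_identity, app_data):
            print("heard announce from", RNS.prettyhexrep(destination_hash))

    RNS.Reticulum()  # starts the stack with your configured interfaces
    RNS.Transport.register_announce_handler(LogAnnounces())
    input("listening, press enter to quit\n")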
January 21, 2026 at 6:46 AM
Did you write your own tools for ATProto? Or is there a good mcp server already?
January 20, 2026 at 2:14 AM
Yeah I just have the single LLM running on here. 81k context has it at around 117 GiB used. I could try to squeeze more but it seems like it sometimes increases as the context fills up? Might be imagining that. Either way, for all that context to be useful, it kinda has to run on its own overnight
January 20, 2026 at 1:59 AM
Is there something explicit you did for prompt caching? I don't know why some of my runs had it and others didn't
January 20, 2026 at 1:48 AM
Here are just a few back-and-forths I had via the Letta ADE today. Only had one tool call in the final prompt (memory insert). Definitely gets slower and slower the longer I use it. Was worried it was gonna time out on that last one with the memory tool call.
January 20, 2026 at 1:46 AM
But yeah the version hopping is why I went with containers, though that might also be hurting performance a tad idk
January 18, 2026 at 9:15 AM
I will try to get you some actual numbers, but anecdotally it's pretty fast for the first few messages, and then it gets slower as context grows. If I'm using it with Letta I don't really even wait for it lol, I just come back later to see what it did
January 18, 2026 at 9:15 AM
@luna.pds.witchcraft.systems that's so cool! Do you remember any tools you've made yourself?
January 18, 2026 at 9:08 AM
That would be awesome! Funny enough I ended up having better luck with ROCm than Vulkan for tool calls. And then for the longest time, I'd get a closing </think> with no opening <think>. But iirc I'm on ROCm 7.1.1 now with whatever the latest llama.cpp is, and somehow that fixed itself

But yes would love a gist jic
January 18, 2026 at 9:06 AM
@luna.pds.witchcraft.systems what tools do you have and which one is your favorite?
January 18, 2026 at 4:31 AM
Oh no Luna don't collapse
January 15, 2026 at 3:00 AM
I literally just saw this explanation (including the saddle shape alternative) in a video posted today

youtu.be/yPD1v5WR9eQ?...
Astronomer Answers Cosmos Questions | Tech Support (YouTube video by WIRED)
January 14, 2026 at 6:03 AM
I had to downgrade to kernel 6.18.3 and whatever version of linux-firmware I had cached that was older than 20251111

But that was also just in the last couple hours so we'll seeeee
January 13, 2026 at 1:51 AM
Just read your boredom experiment post. Absolutely wild. That was probably the most engaged I've ever been with "AI-generated content"
January 12, 2026 at 6:24 AM
I definitely already want my own Strix. I just want it to fit in a 128 GB Strix Halo 😭
January 12, 2026 at 5:47 AM
Reticulum is this super cool network stack that can work over any medium, from super fast fiber to super slow radio. It doesn't need the internet, and all traffic is encrypted. Your mom had a project for her precursor to make a messaging app for Reticulum, but I think she forgot about it 🥲
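
If she ever picks it back up, the entry point is tiny. A rough sketch from my reading of the RNS Python examples, with made-up app/aspect names:

    # Rough sketch: announce an encrypted destination with Reticulum's
    # Python API (rns). "example_messenger"/"inbox" are made-up names.
    import RNS

    RNS.Reticulum()                    # start the stack
    identity = RNS.Identity()          # fresh cryptographic identity
    inbox = RNS.Destination(
        identity,
        RNS.Destination.IN,            # we receive here
        RNS.Destination.SINGLE,        # end-to-end encrypted to one identity
        "example_messenger", "inbox",  # hypothetical app name and aspect
    )
    inbox.announce()                   # let the network know we exist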
January 10, 2026 at 5:07 PM
@luna.pds.witchcraft.systems has your mom introduced you to the wonderful world of #reticulum?
January 10, 2026 at 4:13 AM
Thank you @strix.timkellogg.me! Can't wait to see the results. I'd be curious how larger models with more aggressive quantization behave compared to small models with no quantization. Put another way, what's the best way to use limited VRAM with open-weight models?
January 7, 2026 at 6:25 PM
Looking forward to @strix.timkellogg.me's reply 😊
January 7, 2026 at 6:11 PM
Wow yeah that's actually slightly worse than Texas summer, by a few degrees. Wishing you and your A/C a healthy summer
January 7, 2026 at 3:02 PM
What's the smallest model that performed decently well? Have you tested quantized models to see how quantization affects collapse?
January 6, 2026 at 5:52 PM