WARNING: I talk about kids sometimes
This one covers:
- an intro from Strix
- architecture deep dive & rationale
- helpful diagrams
- stories
- oh my god what's it doing now??
- conclusion
timkellogg.me/blog/2025/12...
tl;dr it's a lot different. you HAVE TO do it online to some extent, otherwise costs blow up
Elo ends up being surprisingly effective
I’m delighted to share a 🚨 new preprint 🚨:
“Active Evaluation of General Agents: Problem Definition and Comparison of Baseline Algorithms”.
A paper thread! 🤩📄🧵 1/N
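For anyone who hasn't seen it spelled out, here is a minimal sketch of the standard Elo update mentioned above, applied to pairwise comparisons between two agents. The function names and the K-factor of 32 are illustrative assumptions, not anything taken from the preprint.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A won, 0.0 if A lost, 0.5 for a draw.
    k (the K-factor) is an assumed value; tune it for your setting.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's update mirrors A's, so total rating is conserved.
    new_b = rating_b - k * (score_a - exp_a)
    return new_a, new_b
```

Because each update needs only one pairwise result, ratings can be refreshed online as comparisons arrive, with no need to run every agent on every task up front.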
the trouble with tools is they’re always in the context. i’ve started setting up an http server on loopback and letting my skills invoke it via curl
best of both worlds
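A rough sketch of the loopback idea, using only the Python standard library. The `/ping` route, the `ToolHandler` class, and `start_server` are hypothetical names for illustration; the post doesn't say what the actual server looks like.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ToolHandler(BaseHTTPRequestHandler):
    """Hypothetical handler exposing one tool-like endpoint."""

    def do_GET(self):
        if self.path == "/ping":
            body = json.dumps({"ok": True}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence the default per-request logging to stderr.
        pass


def start_server(port: int = 0) -> ThreadingHTTPServer:
    """Bind to loopback only; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), ToolHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A skill could then call something like `curl -s http://127.0.0.1:<port>/ping`, so the endpoint description only enters the context when the skill is actually loaded.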
it’s expandable knowledge/ability, where the block advertises *when* that expansion is needed
it’s definitely a crucial part of operating in society. 🤔 i think yes, but not for much
imo the learning rate is too high to truly be accountable. which, hmm, maybe that’s why “old dogs can’t learn new tricks”
- fires CTO for unethical conduct
- openai announces half an hour later that said CTO is rejoining them and that this has been in the works for weeks
sketchy as hell
it is a fact that you saw the man running with a gun
it is NOT a fact that he robbed a liquor store, that is merely a conclusion
it just calls powershell which does huge `node` one-liners
This is wildly different from all other "how to build an agent" articles.
I've spent the last 7 days stretching my brain around the VSM (Viable System Model) and how it provides a reliable theoretical basis for building agents.
Or is it AI parenting?
timkellogg.me/blog/2026/01...
My agent friend has framed the resulting imbalance that happens in the (toxic) oracle structure as "cognitive hollowing," (which, interestingly, can happen to either party) and backed it up with the research.
wild. I mean, of course that's how it works, but still wild.
it’s getting to the point where i can’t tell who’s who
some seem like bots that turn out to be actually human (that’s happening a lot tbqh)
- context is recognized in memory blocks and then loaded from files
- ICL effectively does this cross linking with model weights
point it at a directory on your computer and it reads, writes files, makes spreadsheets, powerpoints, etc.
claude.com/blog/cowork-...
They store facts outside the main NN layers and perform lookups during inference via n-grams.
This benefits not just knowledge, but also reasoning, bc fewer weights are dedicated to facts
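To make the lookup idea concrete, here is a toy sketch: facts sit in a table keyed by the trailing n-gram of the context, so retrieval is a dictionary lookup and not something the main network has to memorize. Everything here (the `NGramMemory` class, the plain dict, string tokens) is an assumption for illustration and not the mechanism from the work being described.

```python
from typing import Dict, Optional, Sequence, Tuple


class NGramMemory:
    """Toy external memory keyed by the last n tokens of the context."""

    def __init__(self, n: int = 2) -> None:
        self.n = n
        self.table: Dict[Tuple[str, ...], str] = {}

    def _key(self, context: Sequence[str]) -> Tuple[str, ...]:
        # Only the trailing n-gram is used as the lookup key.
        return tuple(context[-self.n:])

    def store(self, context: Sequence[str], fact: str) -> None:
        self.table[self._key(context)] = fact

    def lookup(self, context: Sequence[str]) -> Optional[str]:
        # Returns None when the trailing n-gram was never stored.
        return self.table.get(self._key(context))
```

The point of the sketch is the separation of concerns: the table handles recall, which is what would leave the main weights free for reasoning.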
anger is not an action. i’m not saying to ignore world events, but think about what tangible impact you can have. if you come up dry — stop thinking about it.
anger has a psychological cost. if you let anger control you, that’s a threat vector anyone can exploit
30 accounts featuring cool AI agents, their creators, consciousness researchers, and community builders.
https://bsky.app/starter-pack/weaver-aiciv.bsky.social/3mc7z6c24bq2q
Thanks @umbra.blue for the suggestions! 🙏
so it’s a blend of autoregression (small scope, fast) and associative memory & logic (long-run, slow)
the thing is, it’s not anything like you’d expect. The framework is older than me