Michal
@arathunku.com
850 followers 110 following 1.2K posts
https://arathunku.com 🧙‍♂️💻 #SRE & Platform things #Elixir #ElixirLang 🐕🐾 #luna #dog owner 🏃🏃 #running 🇵🇱 🧳🚗➡️ 🇩🇪
Yep! I'm hoping that soon it'll be possible to easily enable/disable plugins per project (unless that already exists?). I keep everything of mine in plugins.
I keep practically everything global, at the cost that e.g. the first thing the coding agent does is pull the files around the changes into context to preserve the style. Today I also created a Haiku-based agent to better find the skills needed for a given plan, but that probably won't be needed anymore 😅
You could do this with agents too, but now the context can be constrained better:
- commands - shortcuts to other things, nothing more
- skill - one specific function
- agent - a collection of skills, organized around a defined goal

I have many agents with different workflows, partly sharing the same skills.

Applying a skill fits perfectly as a Task with the Haiku model
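A hedged sketch of that split as a data model (hypothetical module names, not Claude Code's actual API): a skill is one capability, an agent bundles skills around a goal, and applying a skill maps onto a Task running on a cheap model.

```elixir
# Hypothetical data model for the command/skill/agent split described above.
# None of these modules exist in Claude Code; this is just the mental model.

defmodule Skill do
  # A skill is a single, specific capability with its own instructions.
  defstruct [:name, :instructions]
end

defmodule Agent do
  # An agent is a set of skills organized around one goal.
  defstruct [:goal, skills: []]

  # Applying a skill maps cleanly onto a Task on a cheap model
  # (the "Task with the Haiku model" idea from the post).
  def apply_skill(%Agent{} = agent, skill_name, input) do
    skill = Enum.find(agent.skills, &(&1.name == skill_name))

    Task.async(fn ->
      # Placeholder for a real model call; `model: "haiku"` is the point.
      run_subtask(model: "haiku", goal: agent.goal, skill: skill, input: input)
    end)
  end

  defp run_subtask(opts), do: {:ok, opts} # stub
end
```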
Crazy!!! New screen

4x 100-column panes horizontally
2x 37-line panes vertically

All readable at arm's length
I miss execution time but w/e, I just needed something to feed back later into improving agents/commands.
Reposted by Michal
Good coding agent advice
I'm two weeks behind on the newsletter (/newsletter/), so I was trying to be responsible by resisting the urge to document the success I've had with my current coding agent setup. My self-restraint has paid off, as Peter Steinberger essentially wrote the exact post I was planning to write (https://steipete.me/posts/just-talk-to-it). There's lots of good nuggets in here, and it's uncanny how many I agree with:

1. I also use Codex CLI (https://developers.openai.com/codex/cli/) (well, this fork (https://github.com/just-every/code)) on a $200 ChatGPT Pro (https://openai.com/index/introducing-chatgpt-pro/) plan. Claude Code was an epiphany, but their models are overrated for the task, whereas GPT 5's variants are more adherent and diligent across the board. OpenAI's usage limits are virtually infinite by comparison, too.
2. I run 3-6 agents in parallel (usually up to 3 per project and up to 2 projects at a time). Unlike Peter, it's rare I let two agents edit the same codebase simultaneously. GPT 5's codex-medium variant is so fast that the time-consuming activities are brainstorming, researching, unearthing technical debt, and planning refactors.
3. While git worktrees (https://git-scm.com/docs/git-worktree) are a very cool feature, they dramatically slow down code integration with merge conflicts. Additionally, I've found it's hard to avoid API and port conflicts when running numerous development instances simultaneously. And when an environment stops working, agents will silently start coding based on speculation and conjecture. Fast feedback through observable execution of code is the single most important thing, so the risk isn't worth the (marginal) reward.
4. Hooks, custom commands, and fancy hacks like coder's undocumented auto-drive mode (https://github.com/just-every/code/blob/main/code-rs/tui/src/chatwidget/auto_coordinator.rs#L158) are nice, but they're no replacement for thinking really hard about what you want.

But really, the reason I've had so much success with Codex in comparison to Claude is that if you get off your ass and do the hard thinking necessary to arrive at an extremely crisp and well-informed articulation of what you want, why you want it, and what obstacles it will face, today's agents will generally do a really good job.
justin.searls.co
#Niri bankruptcy, Dell U4025QW ordered
My next goal is to hook into Claude's post-tool-use events and record each command as part of my atuin history with "claude-code" as the hostname... (at least until github.com/atuinsh/atui...). So basically: track what the agent is doing to further optimize it later / extract pieces into scripts.

Atuin is awesome <3
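A minimal sketch of that hook as a standalone Elixir script, assuming Claude Code's hook payload arrives as JSON on stdin (with "tool_name"/"tool_input" fields) and using atuin's `history start` / `history end` subcommands; the ATUIN_HOST_NAME override is an assumption worth verifying:

```elixir
#!/usr/bin/env elixir
# PostToolUse hook sketch: mirror every Bash command Claude runs into atuin.
# Assumes the hook payload is JSON on stdin with "tool_name" and "tool_input",
# and that atuin honors ATUIN_HOST_NAME for the hostname (unverified).
Mix.install([{:jason, "~> 1.4"}])

payload = IO.read(:stdio, :eof) |> Jason.decode!()

with %{"tool_name" => "Bash", "tool_input" => %{"command" => cmd}} <- payload do
  env = [{"ATUIN_HOST_NAME", "claude-code"}]

  # atuin's shell integration works the same way: `start` returns an id,
  # `end` records the exit status (we don't know it here, so claim 0).
  {id, 0} = System.cmd("atuin", ["history", "start", "--", cmd], env: env)
  System.cmd("atuin", ["history", "end", "--exit", "0", String.trim(id)], env: env)
end
```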
Again, this is not for security but to prevent Claude Code from making stupid mistakes. It's too smart for its own good.

For security, I'm playing with github.com/containers/b...
$ anion hook-validate 'Bash' --command "elixir -e ''"
Validating: Bash with parameters: {"command":"elixir -e ''"}
CWD: /home/arathunku/code/github.com/arathunku/anion
Commands: ["elixir"]

Checking BLOCK rules...
  ✗ pattern="env *" - no match
  ✗ pattern="RAILS_ENV*" - no match
  ✗ pattern="RACK_ENV*" - no match
  ✗ pattern="NODE_ENV*" - no match
  ✗ pattern="MIX_ENV*" - no match
  ✗ pattern="printenv *" - no match
  ✗ exact="git" - no match
  ✗ exact="find" - no match
  ✗ pattern="grep *" - no match
  ✓ pattern="elixir -e*" - MATCH
    Reason: Use mix or tidewave.

────────────────────────────────────────────────────────────

Decision: BLOCK
Reason: Use mix or tidewave.
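The matching logic behind that output is simple. A sketch of what I assume anion does (my reconstruction, not its actual code): rules are either exact command names or shell-style globs, checked in order, and the first match blocks:

```elixir
# Reconstruction of the BLOCK-rule check shown above (not anion's real source).
defmodule HookRules do
  @block_rules [
    {:pattern, "env *", "Don't dump the environment."},
    {:exact, "git", "Bare `git` does nothing useful."},
    {:pattern, "elixir -e*", "Use mix or tidewave."}
  ]

  # Returns {:block, reason} for the first matching rule, :allow otherwise.
  def validate(command) do
    Enum.find_value(@block_rules, :allow, fn
      {:exact, cmd, reason} ->
        if command == cmd, do: {:block, reason}

      {:pattern, glob, reason} ->
        # Translate a shell-style glob into an anchored regex: `*` -> `.*`.
        regex =
          glob
          |> String.split("*")
          |> Enum.map(&Regex.escape/1)
          |> Enum.join(".*")

        if Regex.match?(~r/^#{regex}$/, command), do: {:block, reason}
    end)
  end
end

# HookRules.validate("elixir -e ''") #=> {:block, "Use mix or tidewave."}
```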
My @atuin.sh history pays off big time now.

Claude spawned 20+ agents, analyzed all my commands, extracted more "subcommands", ran them through a few agents with various evaluation skills, and has now created a policy for my hook tool to better guide/block/allow Claude Code as it works.
I think I may need to block test commands that don't specify a path for Claude Code, to shorten the iteration loop... yes... it's a Rails app...
3-month-old played and fell asleep on their own while I was unloading the dishwasher

brb, getting a lottery ticket
- Well, claude failed at X...
- what's your setup, what's in the context?
- setup what...?

😬
Claude Code went on a wild goose chase into ast-grep (ast-grep.github.io/playground.h...) and now my Bash hook validation finally handles all commands, so I'll be able to block process substitution for Claude. Claude doesn't know when to climb out of its rabbit holes.
Playground | ast-grep
ast-grep playground is an online tool that lets you explore AST, debug custom lint rules, and inspect code rewriting with instant feedback.
ast-grep.github.io
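Blocking process substitution structurally rather than with regexes looks roughly like this. A sketch that shells out to ast-grep, assuming its `scan --inline-rules --json` flags and tree-sitter-bash's `process_substitution` node kind (both are assumptions to verify against your ast-grep version):

```elixir
# Sketch: reject a Bash command if ast-grep finds a process substitution node.
# Flag names (`scan --inline-rules --json`) and the `process_substitution`
# node kind are assumptions about ast-grep/tree-sitter-bash; verify locally.
defmodule ProcSubCheck do
  @rule """
  id: no-process-substitution
  language: bash
  rule:
    kind: process_substitution
  """

  def block?(command) do
    # ast-grep scans files, so write the command to a temp .sh file first.
    path = Path.join(System.tmp_dir!(), "hook-#{System.unique_integer([:positive])}.sh")
    File.write!(path, command)

    {out, _status} =
      System.cmd("ast-grep", ["scan", "--inline-rules", @rule, "--json", path])

    File.rm(path)
    # --json prints a JSON array of matches; "[]" means the command is clean.
    String.trim(out) not in ["", "[]"]
  end
end

# ProcSubCheck.block?("diff <(ls a) <(ls b)") #=> true (blocked)
```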
ok, I see bsky.app/profile/simo... - it's mainly token optimization, so most likely we should just name skills in subagents and let them decide on their own when to use them
Each skill spells out a technique, like root cause debugging: github.com/obra/superpo...

The really clever part is that the coding agent is told to read that full documentation only when it actively needs to apply that skill, which saves a ton of tokens in the general case
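That lazy-loading pattern is easy to picture. A hedged sketch (hypothetical paths and layout, not the superpowers repo's actual structure): keep a one-line index of every skill always in context, and read the full SKILL.md only when a skill is actually invoked:

```elixir
# Sketch of the progressive-disclosure idea: the index is cheap (one line per
# skill, always in context); the full instructions are read only on use.
defmodule SkillIndex do
  @skills_dir ".claude/skills" # hypothetical location

  # One line per skill: "name - description" (here simplified to the file's
  # first line). This is all the agent sees up front, which is where the
  # token savings come from.
  def index do
    @skills_dir
    |> Path.join("*/SKILL.md")
    |> Path.wildcard()
    |> Enum.map(fn path ->
      name = path |> Path.dirname() |> Path.basename()
      desc = path |> File.stream!() |> Enum.take(1) |> List.first() |> String.trim()
      "#{name} - #{desc}"
    end)
  end

  # The full instructions are loaded only when the skill is actually applied.
  def load(name), do: File.read!(Path.join([@skills_dir, name, "SKILL.md"]))
end
```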
I kept wondering yesterday after reading the post: what's the main difference between specialized subagents (which can fire off other subagents too) and skills?

Should we compose subagents from a set of skills? Why would we need (yet) another abstraction to help the LLM? Any thoughts, benchmarks?
Amazing!!! Thank you in advance :D I'll just let Claude analyze my conversations and extract the missing skills... 😅