Dan Saunders
@dan-saunders.bsky.social
ML eng making open source LM training tools. LMs, RL, CV
for the same reason, I don't really use image / video gen. I find they're only rarely good for a cheap laugh
November 26, 2025 at 2:40 PM
I'm not big on using LMs for creative activities (writing, actually steering programming project designs) because (1) they aren't very good at it, and (2) I've developed my own tastes and prefer to exercise them in the things I create
November 26, 2025 at 2:40 PM
I started The Wire and it's fantastic so far. Probably won't be able to finish it during my recovery period 😆
November 26, 2025 at 2:37 PM
I think a critical efficiency breakthrough needs to happen for pretraining, which is very compute hungry. folks have already shown that fewer, higher-quality pretraining tokens can reach similar levels of LM loss to chewing through the entire internet's text (which should be obvious in retrospect)
November 26, 2025 at 2:30 PM
you can *already* train reasonably large models on your desktop with consumer-grade GPUs thanks to a lot of effort in the open source space in increasing training speed and reducing VRAM usage. I have a 5090 (admittedly an enthusiast-level card) and I can do a *lot*
November 26, 2025 at 2:30 PM
my hope is that we can train ever smaller models on fully open data and eliminate a lot of concerns around electricity usage + stealing data. the Olmo family of models is a great step in the open data direction, and Pleias' models / HF's SmolLMs are strong examples of tiny yet capable LMs
November 26, 2025 at 2:30 PM
I find it saves a lot of keystrokes, which is helpful since I sense arthritis coming on...
November 26, 2025 at 2:18 PM
coding with them is (1) a great rubber ducking opportunity, and (2) useful for codegen, but with the caveat that you should know the domain and have your hand on the wheel. personally I like a ~80/20 codegen/manual editing approach, and I need to have a clear plan of where I'm steering the codegen
November 26, 2025 at 2:18 PM
I don't agree that we should boil the ocean in pursuit of them, or that everyone should be racing to create them. but they're definitely worth a good amount of compute
November 26, 2025 at 2:05 PM
TV show: Detroiters (A)

Can't believe I'm just watching this now. Tim Robinson is incredible, and so is Sam Richardson, who I haven't seen much of (I should finally watch Veep...). It's really funny and easy to binge watch!
November 18, 2025 at 1:47 PM
Game: Donkey Kong Bananza (A-)

Insanely polished game with a great redesign of DK. Feels really great to play, just like Mario Odyssey, which presumably uses the same engine.

Loads of stuff to collect, which is a blessing and a curse -- this can become a chore.

Easy, but I'm not the target demo.
November 18, 2025 at 1:47 PM
TV show: Pantheon (B-)

Started out strong with a cool premise (uploading minds into machines) and a great hook, but the animation is not that strong, the writing is a bit weak, and the pace is meandering at times.

I lost interest early in season 2.
November 18, 2025 at 1:47 PM
Story was kind of lame and told mostly through dialogue.

EMMI enemies were a cool idea and provided genuine tension, but were sometimes just a little frustrating.

Power / ability scaling was really solid; many of the abilities felt great and came at the right times.
November 18, 2025 at 1:47 PM
Game: Metroid Dread (B+)

Short yet polished metroidvania. My first 2D Metroid game somehow, despite previously playing lots of Nintendo first party games.

Sound design was mixed; e.g., it was never obvious when I took damage, and many of the enemies felt lifeless.

Had a pretty arcade-y feel.
November 18, 2025 at 1:47 PM
AI/ML doesn't need to boil the ocean or steal data or replace artists. it's just the capitalism machine turning the crank on a technology that is doing something new and profitable

it's a major double-edged sword
November 15, 2025 at 5:48 PM