Author | Lightnews

archtoad.bsky.social @archtoad.bsky.social · 5h

I connected my laptop to my piano and typed into the terminal “connect to my piano and play a few notes with midi” and it worked first try. This is some Star Trek shit. If you told me 5 years ago this would be possible today I would not have believed you.
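
For illustration, here is a minimal sketch of what "play a few notes with midi" looks like at the byte level. The post does not show what the agent actually ran, so all of this is an assumption: the helper names (`note_on`, `note_off`, `melody_messages`) are hypothetical, and sending the bytes to the piano would need a MIDI I/O library, which is left out here.

```python
from typing import Iterable, List

# Status bytes for MIDI channel voice messages (low nibble = channel).
NOTE_ON = 0x90
NOTE_OFF = 0x80


def _check(note: int, velocity: int, channel: int) -> None:
    """Reject values outside the ranges MIDI allows."""
    if not 0 <= note <= 127:
        raise ValueError("note must be in 0..127")
    if not 0 <= velocity <= 127:
        raise ValueError("velocity must be in 0..127")
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")


def note_on(note: int, velocity: int = 64, channel: int = 0) -> bytes:
    """Hypothetical helper: build a 3-byte note-on message."""
    _check(note, velocity, channel)
    return bytes([NOTE_ON | channel, note, velocity])


def note_off(note: int, channel: int = 0) -> bytes:
    """Hypothetical helper: build a 3-byte note-off message."""
    _check(note, 0, channel)
    return bytes([NOTE_OFF | channel, note, 0])


def melody_messages(notes: Iterable[int], velocity: int = 64,
                    channel: int = 0) -> List[bytes]:
    """Return note-on/note-off pairs for each note, in playing order."""
    messages: List[bytes] = []
    for n in notes:
        messages.append(note_on(n, velocity, channel))
        messages.append(note_off(n, channel))
    return messages


if __name__ == "__main__":
    # C major triad as single notes: middle C (60), E (64), G (67).
    for msg in melody_messages([60, 64, 67]):
        print(msg.hex(" "))
```

The sketch only builds the messages. Timing between note-on and note-off, and the actual connection to the instrument, are deliberately out of scope.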

archtoad.bsky.social @archtoad.bsky.social · 2d

“I don’t want to hear from Mitchell because I don’t think I would enjoy her content” - sure whatever (you’re misrepresenting her work but that’s your choice). “I don’t want to hear from Mitchell because she doesn’t know how NNs work” makes you sound like an uninformed asshole.

archtoad.bsky.social @archtoad.bsky.social · 2d

The paper as a whole holds up! It’s about the risks/limitations of scaling language models - all very relevant today! How many NLP papers from 2020-2021 can you say that about?

archtoad.bsky.social @archtoad.bsky.social · 2d

So to recap, you don’t want to ever hear from Mitchell because of one sentence in a paper that summarizes her co-author’s position re: a linguistic theory about form vs meaning, which disqualifies her from ever knowing how these things work “in a relevant sense”?

archtoad.bsky.social @archtoad.bsky.social · 2d

The premise of the paper is “there are risks/downsides to larger models.” Nowhere in the paper does it claim anything like “language models can’t generalize to unseen prompts.” You’re just projecting a strawman thesis onto the paper based on the phrase “Stochastic Parrots.”

archtoad.bsky.social @archtoad.bsky.social · 2d

I don’t think this is bad faith. Margaret Mitchell has a long CV with plenty of papers that go beyond the scope of the Stochastic Parrots paper that clearly demonstrate she knows how NNs work?

archtoad.bsky.social @archtoad.bsky.social · 4d

I just put in my global AGENTS.md that every Python project uses uv and briefly explained how to use “uv run” - haven’t had to remind it since

AGENTS.md

AGENTS.md is a simple, open format for guiding coding agents. Think of it as a README for agents.

archtoad.bsky.social @archtoad.bsky.social · 6d

“Traditional NLP models like BERT…”

archtoad.bsky.social @archtoad.bsky.social · 7d

My takeaway is that the DeBERTa baseline is the winner here? Way easier to train/deploy. Also, what if you scaled the encoder-classifier up to a comparable size?

archtoad.bsky.social @archtoad.bsky.social · 12d

Right, but we have users who are like “I can’t find the [Microsoft] Copilot button” - getting them to install/figure out Claude Code is just not practical.

archtoad.bsky.social @archtoad.bsky.social · 13d

Good stuff. Does this thinking extend to more general things like Microsoft Copilot and ChatGPT? Or are you saying normies should start using coding agents?

archtoad.bsky.social @archtoad.bsky.social · 22d

There was an interesting paper earlier this year about a “recurrent depth” technique that allowed the model to reuse layers … is this what you mean? arxiv.org/abs/2502.05171

Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach

We study a novel language model architecture that is capable of scaling test-time computation by implicitly reasoning in latent space. Our model works by iterating a recurrent block, thereby unrolling...

arxiv.org

archtoad.bsky.social @archtoad.bsky.social · 23d

Yeah plenty of examples of code golf / people trying to put like 5 lines of code in a single line to show that They Can and it just makes unreadable garbage

archtoad.bsky.social @archtoad.bsky.social · Oct 7

Not sure what you mean by “traditional UX” but I’d agree that having creative UX people who can think outside the box is more important than ever

archtoad.bsky.social @archtoad.bsky.social · Sep 21

I’ve had many meetings where people are arguing over how the prototype should be built and by the end of the meeting I’m like “here it is”

archtoad.bsky.social @archtoad.bsky.social · Aug 25

I just heard about rapids.ai, which is a concrete effort to run data science workloads on GPUs

RAPIDS | GPU Accelerated Data Science

Open source GPU accelerated data science libraries

rapids.ai

archtoad.bsky.social @archtoad.bsky.social · Aug 14

a green witch singing into a microphone with the words in the year 2000

media.tenor.com

archtoad.bsky.social @archtoad.bsky.social · Jul 18

Something like github.com/AnswerDotAI/... ?

GitHub - AnswerDotAI/llms-txt: The /llms.txt file, helping language models use your website

github.com

archtoad.bsky.social @archtoad.bsky.social · Jul 15

Was thinking about this re: “wow I should really get better at writing clear and consistent documentation for my repos so my agents know how to use them”

archtoad.bsky.social @archtoad.bsky.social · Jul 15

Check out lucumr.pocoo.org/2025/7/3/too... from @mitsuhiko.at if you haven’t… basically saying that CLIs >>> MCP (e.g., gh vs GitHub MCP)

Tools: Code Is All You Need

The solution to agentic flows was code all along.

lucumr.pocoo.org

archtoad.bsky.social @archtoad.bsky.social · Jul 10

I love your concept about building bespoke dev tools (like ways to search logs) for the agents - would love to hear about more of these and how you approach building them!

archtoad.bsky.social @archtoad.bsky.social · Jul 9

moondream.ai is another

Moondream

Moondream AI - Vision language model for everyone

moondream.ai

archtoad.bsky.social @archtoad.bsky.social · Jul 9

Does Hugging Face / SmolLM3 count? huggingface.co/blog/smollm3

SmolLM3: smol, multilingual, long-context reasoner

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

Reposted

Nathan Lambert @natolambert.bsky.social · Jul 4

My latest post: The American DeepSeek Project

Build fully open models in the US in the next two years to enable a flourishing, global scientific AI ecosystem, to balance China's surge in open-source, and to offer an alternative to building products on top of leading closed models.
buff.ly/kvJQE3I

archtoad.bsky.social @archtoad.bsky.social · Jun 22

The octopus thing was just a thought experiment about form and meaning in language - I think it holds up aclanthology.org/2020.acl-mai...

Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data

Emily M. Bender, Alexander Koller. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. 2020.

aclanthology.org