The Flaky Wanderer
flakywanderer.bsky.social
An upcoming 'tiel 'tuber exploring some fresh air. Highly sussy, 18+ only
I wonder if a kernel MMD, as an efficiently estimated probability metric, would work in this regard
November 27, 2025 at 3:20 AM
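For anyone curious what "efficiently estimated" means here, below is a minimal sketch of the standard unbiased estimator of squared MMD with a Gaussian (RBF) kernel. It only needs samples from the two distributions and pairwise kernel evaluations, with no optimal-transport solve. The function names, the fixed bandwidth of 1.0, and the toy 2-D Gaussian samples are all assumptions for illustration. For LLMs the samples would presumably be embeddings or hidden states of generated text, which this sketch does not attempt.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of squared MMD between samples xs and ys.

    Uses O((m + n)^2) kernel evaluations. Because the estimator is
    unbiased, it can come out slightly negative when the two samples
    are drawn from the same distribution.
    """
    m, n = len(xs), len(ys)
    # Within-sample sums skip the diagonal (i == j) to keep the estimate unbiased.
    k_xx = sum(rbf_kernel(xs[i], xs[j], bandwidth)
               for i in range(m) for j in range(m) if i != j)
    k_yy = sum(rbf_kernel(ys[i], ys[j], bandwidth)
               for i in range(n) for j in range(n) if i != j)
    # The cross term uses every (x, y) pair.
    k_xy = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


def sample_gaussian(rng, n, dim, mean):
    """Toy data (hypothetical stand-in for embeddings): n points from N(mean, I)."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
same_a = sample_gaussian(rng, 200, 2, 0.0)
same_b = sample_gaussian(rng, 200, 2, 0.0)
shifted = sample_gaussian(rng, 200, 2, 1.0)

mmd_same = mmd2_unbiased(same_a, same_b)      # same distribution: near zero
mmd_shifted = mmd2_unbiased(same_a, shifted)  # shifted mean: clearly positive
print(mmd_same, mmd_shifted)
```

The bandwidth matters a lot in practice. A common heuristic is to set it to the median pairwise distance between samples, which is left out here to keep the sketch short.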
Unfortunately the Wasserstein metric is rather computationally tough to just apply to LLMs
November 27, 2025 at 3:17 AM
Don't you already have impostor syndrome?
November 26, 2025 at 9:20 PM
Is there a demand for a slice of Apple?
November 26, 2025 at 9:10 PM
Better conclusion: Our brains keep growing throughout our lives; it's just that when we're old, other processes kick in and get in the way
November 26, 2025 at 8:11 PM
Newsom has the chance to do something awesome in this regard
November 26, 2025 at 8:09 PM
Those nuclear blocklists have got to be kept in sync somehow
November 26, 2025 at 7:37 PM
Who can I ping to ask for technical details about bsky?
November 26, 2025 at 7:37 PM
I wonder if it becomes more feasible if you incorporate yourself as part of the network so you can watch for things as they pass through your node
November 26, 2025 at 7:34 PM
Your other posts are fine. You've made up for your mistake, don't let it stop you permanently.
November 26, 2025 at 7:31 PM
The Chinese response to needing more energy:

www.reuters.com/business/ene...
November 26, 2025 at 7:08 PM
Reposted by The Flaky Wanderer
The reason he posed it that way is currently on display:

There is a significant group of people on this site who imagine that brigading replies and writing abusive quote-posts will do something to change policy.

Or if they don’t really imagine that, at least it’s how they get off.
November 26, 2025 at 2:21 PM
Transformers are still unbeaten at finding associations between tokens over long distances
November 26, 2025 at 1:24 AM
They're gradually working their way up to a full state-space model, pending research

Kimi uses a model that's 3/4 state space right now (the other 1/4 is the good ol' transformers)
November 26, 2025 at 1:22 AM
Relevant research

bsky.app/profile/timk...
on #3, this paper uses a method where they can directly attribute specific documents from the pretraining dataset

they used it to show that LLMs do in fact learn procedures, not just autocomplete. But you could take this so much further with Olmo3

arxiv.org/abs/2411.12580
November 25, 2025 at 9:21 PM
Also select them randomly from a pool, and give them some compensation for their judgement
November 25, 2025 at 8:15 PM
You know, what if we gathered 12 of them together to decide cases?
November 25, 2025 at 8:14 PM