Lightnews — Scholar-powered news

The Flaky Wanderer

@flakywanderer.bsky.social

640 followers 1.2K following 1.6K posts

An upcoming 'tiel 'tuber exploring some fresh air. Highly sussy, 18+ only


The Flaky Wanderer

@flakywanderer.bsky.social

I wonder if a kernel MMD, as an efficiently estimated probability metric, would work in this regard

November 27, 2025 at 3:20 AM
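
For context on the kernel MMD idea in the post above, here is a minimal sketch of the standard unbiased estimator of squared MMD with a Gaussian (RBF) kernel, in plain Python. The function names, the bandwidth value, and the toy Gaussian samples are all illustrative assumptions and do not come from the thread. Applying this to LLMs would also require choosing a vector representation of model outputs, which is not attempted here.

```python
import math
import random


def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, bandwidth=1.0):
    """Unbiased estimate of squared MMD between two samples.

    Cost is quadratic in the sample sizes; no optimal-transport
    problem has to be solved, unlike a Wasserstein distance.
    """
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("need at least two samples from each distribution")
    # Within-sample terms skip the diagonal (i == j) to stay unbiased.
    k_xx = sum(
        rbf_kernel(xs[i], xs[j], bandwidth)
        for i in range(m) for j in range(m) if i != j
    ) / (m * (m - 1))
    k_yy = sum(
        rbf_kernel(ys[i], ys[j], bandwidth)
        for i in range(n) for j in range(n) if i != j
    ) / (n * (n - 1))
    # Cross term averages over every (x, y) pair.
    k_xy = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys) / (m * n)
    return k_xx + k_yy - 2.0 * k_xy


def sample_gaussian(n, mean, dim=2, seed=0):
    """Toy data (hypothetical): n points from an isotropic unit Gaussian."""
    rng = random.Random(seed)
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


if __name__ == "__main__":
    same_a = sample_gaussian(100, 0.0, seed=1)
    same_b = sample_gaussian(100, 0.0, seed=2)
    shifted = sample_gaussian(100, 3.0, seed=3)
    print("same distribution:   ", mmd2_unbiased(same_a, same_b))
    print("shifted distribution:", mmd2_unbiased(same_a, shifted))
```

Because the estimator is unbiased, it can come out slightly negative when the two samples are drawn from the same distribution. The bandwidth strongly affects sensitivity and would need tuning for real data.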

The Flaky Wanderer

@flakywanderer.bsky.social

Unfortunately the Wasserstein metric is rather computationally tough to just apply to LLMs

November 27, 2025 at 3:17 AM

The Flaky Wanderer

@flakywanderer.bsky.social

Don't you already have impostor syndrome

November 26, 2025 at 9:20 PM

The Flaky Wanderer

@flakywanderer.bsky.social

Is there a demand for a slice of Apple?

November 26, 2025 at 9:10 PM

The Flaky Wanderer

@flakywanderer.bsky.social

Better conclusion: Our brains keep growing all through our life, it's just that when we're old other processes kick in that get in the way of that

November 26, 2025 at 8:11 PM

The Flaky Wanderer

@flakywanderer.bsky.social

Newsom has the chance to do something awesome in this regard

November 26, 2025 at 8:09 PM

The Flaky Wanderer

@flakywanderer.bsky.social

Those nuclear blocklists have got to be kept in sync somehow

November 26, 2025 at 7:37 PM

The Flaky Wanderer

@flakywanderer.bsky.social

Who can I ping to ask for technical details about bsky

November 26, 2025 at 7:37 PM

The Flaky Wanderer

@flakywanderer.bsky.social

I wonder if it becomes more feasible if you incorporate yourself as part of the network so you can watch for things as they pass through your node

November 26, 2025 at 7:34 PM

The Flaky Wanderer

@flakywanderer.bsky.social

Your other posts are fine. You've made up for your mistake, don't let it stop you permanently.

November 26, 2025 at 7:31 PM

The Flaky Wanderer

@flakywanderer.bsky.social

The Chinese response to needing more energy:

www.reuters.com/business/ene...

November 26, 2025 at 7:08 PM

Reposted by The Flaky Wanderer

Ted Underwood

@tedunderwood.com

The reason he posed it that way is currently on display:

There is a significant group of people on this site who imagine that brigading replies and writing abusive quote-posts will do something to change policy.

Or if they don’t really imagine that, at least it’s how they get off.

November 26, 2025 at 2:21 PM

The Flaky Wanderer

@flakywanderer.bsky.social

Transformers are still unbeaten at finding associations between tokens over long distances

November 26, 2025 at 1:24 AM

The Flaky Wanderer

@flakywanderer.bsky.social

They're gradually working their way up to a full state-space model, pending research

Kimi uses a 3/4 state-space model right now (the other 1/4 are the good ol' transformers)

November 26, 2025 at 1:22 AM

The Flaky Wanderer

@flakywanderer.bsky.social

Relevant research

bsky.app/profile/timk...

Tim Kellogg @timkellogg.me · 3d

on #3, this paper uses a method where they can directly attribute specific documents from the pretraining dataset

they used it to show that LLMs do in fact learn procedures, not just autocomplete. But you could take this so much further with Olmo3

arxiv.org/abs/2411.12580

A comic-style infographic titled “THE AI CHEF’S ‘PROCEDURAL’ SECRET: AN ATTRIBUTION ANALOGY.” It uses a robot chef baking a soufflé to explain how attribution and gradient-based tracing in AI works. The diagram proceeds left to right in five labeled steps.

⸻

1. THE TASK (REASONING)

A friendly robot chef stands in a kitchen, holding up a perfectly baked soufflé. A math bubble shows x + 2y = 10 as an analogy for solving a problem.
Caption: AI Chef (LLM) solves a problem (bakes a soufflé).

⸻

2. THE “FINGERPRINT” (GRADIENT)

Close-up of the robot whisking batter. A glowing network of abstract swirls appears over the bowl.
Caption: We record the exact, unique actions & “effort” (Gradient) used for this specific soufflé.

⸻

3. THE “BRAIN MAP” (EK-FAC)

The robot stands before floating diagram bubbles labeled Whisking Techniques, Aeration Physics, Heat Transfer, Simplified Linkages.
Caption: We use a simplified map of how the chef connects concepts (Hessian/EK-FAC approximation).

⸻

4. THE LIBRARY MATCH (ATTRIBUTION)

The robot enters a vast library with floor-to-ceiling bookshelves. A giant glowing fingerprint projection shines onto one shelf as the robot scans for the best match.
Caption: We scan the entire “cookbook library” (pre-training data) to find which book’s instructions best match the fingerprint via the brain map.

⸻

5. THE RESULT: PROCEDURAL KNOWLEDGE

The robot chef proudly holds a glowing lightbulb while a book opens nearby with a concept diagram. A large reference book beside him is titled “THE PHYSICS OF FOAMS & AERATION (NOT a Soufflé Recipe Book!)”
Caption: We find the source was NOT a recipe, but a foundational PRINCIPLE (procedural knowledge) applied to a new task.

⸻

Overall, the image uses the story of baking a soufflé to explain how AI models trace reasoning: capturing gradients, mapping conceptual relations, searching training data, and revealing underlying procedural knowledge rather than direct memorization.

November 25, 2025 at 9:21 PM

The Flaky Wanderer

@flakywanderer.bsky.social

Also select them randomly from a pool, and give them some compensation for their judgement

November 25, 2025 at 8:15 PM

The Flaky Wanderer

@flakywanderer.bsky.social

You know, what if we gathered 12 of them together to decide cases?

November 25, 2025 at 8:14 PM
