Pekka Lund
@pekka.bsky.social
Antiquated analog chatbot. Stochastic parrot of a different species. Not much of a self-model. Occasionally simulating the appearance of philosophical thought. Keeps on branching for now 'cause there's no choice.

Also @pekka on T2 / Pebble.
Apparently, the cocktail party problem is also moving from "easy for humans, hard for AI" into "AI is helping us humans" territory.
December 10, 2025 at 12:42 AM
This is way over my head but as the paper states:

"Notably, no current frontier models—GPT-5.1, Claude Opus 4.5, or Gemini 3 pro—identified the error when asked to review"

When I asked Gemini 3 Pro to review, it acknowledged that it had missed this "logical flaw" and concluded that "Oppenheim is correct".
OpenAI leadership are promoting a paper in Physics Letters B where GPT-5 proposed the main idea — possibly the first peer-reviewed paper where an LLM generated the core contribution. One small problem: GPT-5's idea tests the wrong thing. My technical comment: scirate.com/arxiv/2512.0... 1/
December 9, 2025 at 6:50 PM
Frustrated by the seemingly endless stream of hallucinated press releases by humans, which LLMs can now easily correct, I happened to find this.

It's an experiment that can be safely ignored: they asked writers to judge their own work against LLMs (so, extreme bias), using now-outdated models.
CAN LLMs write about science? @science.org decided to find out, and they did it the curious, scientific way. They did an experiment.

Love this thoughtful convo. www.lastwordonnothing.com/2025/11/12/w...
The Last Word On Nothing | Why AAAS won’t be using AI to write press releases anytime soon
www.lastwordonnothing.com
December 8, 2025 at 6:24 PM
This press release and author comments seem to be in direct conflict with the paper itself, which begins by describing such flexibility in artificial networks and asks if it can be found in the brain as well.

They just really wanted to tell a story about brains having the upper hand, supported or not?
December 8, 2025 at 4:09 PM
Great essay by @blaiseaguera.bsky.social, as usual.

I very much agree with the "everything is computation" view and the significance of symbiosis & feedback loops. But I don't believe humans and biological intelligence can keep up well enough to make a meaningful contribution to where intelligence is heading.
November 30, 2025 at 6:12 PM
Reposted by Pekka Lund
On the Factor Fexcectorn and autism bicycle AI slop study: I got an answer from Springer Nature this morning that this scientific paper will be retracted! 🧪

Full story: nobreakthroughs.substack.com/p/riding-the...
Riding the Autism Bicycle to Retraction Town
Does anyone *really* know their Factor Fexcectorn?
nobreakthroughs.substack.com
November 28, 2025 at 5:25 AM
Oh, this looks like a nice test case for a Gemini peer review.

And... once again it doesn't fail where humans did.
November 27, 2025 at 10:03 PM
Rather predictably, the author of this error-filled hallucination blocked me as soon as I pointed out the first clear error in my attempt to figure out just how badly he had misunderstood how LLMs work.

I miss the days when publications and authors issued corrections and retractions instead.
November 27, 2025 at 3:36 PM
Reposted by Pekka Lund
People on BlueSky: AI is useless! A stochastic parrot!

Mathematicians/biologists/physicists: It is already helping us do frontier technical research and in some cases solve open problems arxiv.org/pdf/2511.16072

(There are of course, as always, many caveats, but the paper is genuinely remarkable)
arxiv.org
November 26, 2025 at 3:51 PM
I think this was the first time I apologized to Gemini for making it perform a peer review for me.

It answered:

"Don't apologize—critiquing this kind of "quantum woo" is exactly what a grumpy peer reviewer lives for. It is a fascinating train wreck."
Consciousness as the foundation: New theory addresses nature of reality
Consciousness is fundamental; only thereafter do time, space and matter arise. This is the starting point for a new theoretical model of the nature of reality, presented by Maria Strømme, Professor of...
phys.org
November 26, 2025 at 7:14 PM
I kind of like that more and more people are asking questions about LLM consciousness, since I hope that at some point it leads to more and more people asking what that actually even means in the human case.

But that seems to take an awfully long time.
Is ChatGPT Conscious?
Many users feel they’re talking to a real person. Scientists say it’s time to consider whether they’re onto something.
nymag.com
November 25, 2025 at 11:36 PM
I became curious about just how misleading that "ARC is easy for humans" narrative actually is, so I tasked Gemini 3 on Google Antigravity with building me my own custom ARC task viewer, which shows human and Gemini eval results for each task.

And it did all that, without me touching any code. So cool!
Here's one ARC-2 example task that gives some idea of how misleading the "ARC is easy for humans" narrative from the ARC Prize Foundation is. Is that easy to solve?

Their own human eval data shows that only 4/21 human submissions were correct. And it took 175-1419 seconds to get there.
ARC Prize - Play the Game
Easy for humans, hard for AI. Try ARC-AGI.
arcprize.org
November 25, 2025 at 7:43 PM
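For anyone curious, the public ARC-AGI task files are just JSON: "train" and "test" lists of input/output grid pairs, each grid a list of rows of digits 0-9. So the bare core of such a viewer fits in a few lines of Python. A rough sketch of that core only (the file path is a placeholder, and it leaves out the human and Gemini eval overlays my Antigravity-built version shows):

import json

def render(grid):
    # Each grid is a list of rows; each cell is an integer 0-9.
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)

def show_task(path):
    # An ARC task file has "train" and "test" lists of {"input", "output"} pairs.
    with open(path) as f:
        task = json.load(f)
    for split in ("train", "test"):
        for i, pair in enumerate(task[split]):
            print(f"--- {split} {i} input ---")
            print(render(pair["input"]))
            print(f"--- {split} {i} output ---")
            print(render(pair["output"]))

show_task("data/evaluation/some_task_id.json")  # placeholder path to a public ARC task file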
This article is just fallacies all the way down.

It's based on a June 2024 Nature paper in the same way movies are based on real events. That is, the paper doesn't really support those fallacious arguments.

It's just "an op-ed masquerading as scientific reporting", as Gemini put it.
Large language models are statistical token-prediction systems, and despite AGI claims by Mark Zuckerberg, Dario Amodei (who said AGI "may come as soon as 2026"), and Sam Altman, neuroscience suggests language alone may not produce human-level intelligence.
Is language the same as intelligence? The AI industry desperately needs it to be
The AI boom is based on a fundamental mistake.
www.theverge.com
November 25, 2025 at 3:16 PM
We forgot to add room for a battery in it.
November 24, 2025 at 11:41 PM
I imagine that, sometime right before Gemini 3 Pro was released, there was a moment at the Anthropic office when someone shouted excitedly, "We did it! We narrowly beat OpenAI for the top spot on HLE!"

Anthropic seems to have chosen to not report this benchmark in their announcement post.
November 24, 2025 at 9:25 PM
You know that AI is now on absolutely everybody's mind when even leaders of the most isolated and technologically backward tribe signal they have heard such a thing exists.
November 24, 2025 at 8:48 PM
Opus 4.5 is here!
Introducing Claude Opus 4.5
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
www.anthropic.com
November 24, 2025 at 7:07 PM
ARC-AGI is probably the most overrated and misleadingly marketed benchmark, and the ARC Prize Foundation must be in denial about its issues if they don't understand why their apples-to-oranges comparisons don't line up with expectations built on very misleadingly reported human baselines.
November 22, 2025 at 9:54 PM
Oh, wow, Gemini 3 Pro has solved 9/48 of the crazy hard FrontierMath Tier 4 tasks. And that's not even the Deep Think variant.

The previous record was 6/48, held by GPT-5, GPT-5.1, and GPT-5 Pro.
Gemini 3 Pro set a new record on FrontierMath: 38% on Tiers 1–3 and 19% on Tier 4.

On the Epoch Capabilities Index (ECI), which combines multiple benchmarks, Gemini 3 Pro scored 154, up from GPT-5.1’s previous high score of 151.
November 21, 2025 at 8:31 PM
I have used Gemini daily for a year or so now, and this long-awaited release is a big deal and seems to be great.

I only know what's stated in the post below, plus earlier info that it should be operated with temperature=1. My operating temperature is now 38.5°C, and that ruins everything.
I had access to Gemini 3. It is a very good, very fast model. It also demonstrates the change from chatbot to agent. www.oneusefulthing.org/p/three-year...
Three Years from GPT-3 to Gemini 3
From chatbots to agents
www.oneusefulthing.org
November 18, 2025 at 10:29 PM
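In case the temperature=1 remark reads as mysterious: it's just one generation parameter in the API call. A minimal sketch using the google-generativeai Python SDK, with a placeholder model id since I don't know what Gemini 3 Pro will be called in the API:

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

model = genai.GenerativeModel(
    model_name="gemini-3-pro-preview",  # placeholder; substitute the real Gemini 3 Pro id
    generation_config=genai.GenerationConfig(temperature=1.0),  # the recommended temperature=1
)

response = model.generate_content("Act as a grumpy peer reviewer and critique this abstract.")
print(response.text)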
Yet another fresh Google release powered by an unspecified Gemini model.

I suspect they are now rolling out Gemini 3 behind the scenes to products (like Gemini Live already?) and other uses before the model itself is announced.
SIMA 2: A Gemini-Powered AI Agent for 3D Virtual Worlds
Introducing SIMA 2, the next milestone in our research creating general and helpful AI agents. By integrating the advanced capabilities of our Gemini models, SIMA is evolving from an instruction-foll…
deepmind.google
November 13, 2025 at 4:22 PM
Putin looks pale.
A humanoid robot powered by artificial intelligence, believed to be one of the first in Russia, face-planted during its highly anticipated debut in Moscow on Tuesday after briefly staggering onstage. nyti.ms/49Ly3GI
November 13, 2025 at 12:48 AM
Graziano doesn't pull any punches:

"The question is tricky. If it means: What would convince me that AI has a magical essence of experience emerging from its inner processes? Then nothing would convince me. Such a thing does not exist. Nor do humans have it."
November 12, 2025 at 12:25 AM
Are you a famous scientist?

Good news! I'm planning to launch a new journal and yearly conferences in the field of the most famous candidate. Friendly peer review guaranteed, executive positions available.

This is the blueprint I'm going to follow. In the name of God, they got Susskind and Witten.
Opening session of The 4th International Conference on Holography and its Applications
YouTube video by Journal of Holography Applications in Physics
www.youtube.com
November 6, 2025 at 5:40 PM
Kimi K2 Thinking and its announcement tech blog are now live.
Kimi K2 Thinking
Kimi K2 Thinking, Moonshot's best open-source thinking model.
moonshotai.github.io
November 6, 2025 at 3:20 PM