Lightnews — Scholar-powered news

sakanaai.bsky.social

@sakanaai.bsky.social

The benchmark continues to reveal gaps between AI computation and human-like reasoning.

🔗 Blogpost: pub.sakana.ai/sudoku-gpt5/
📊 Leaderboard: pub.sakana.ai/sudoku/
📄 Report: arxiv.org/abs/2505.16135
💻 GitHub: github.com/SakanaAI/Sudoku-Bench

From GRPO to GPT-5: Sudoku Variants

November 11, 2025 at 8:07 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Our GRPO and "Thought Cloning" experiments (learning from expert solvers) show that current methods struggle with the spatial reasoning and creative insights that humans use naturally.

November 11, 2025 at 8:07 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Unlike Chess or Go, these puzzles require understanding novel rules through meta-reasoning, then maintaining consistency over long reasoning chains.

November 11, 2025 at 8:06 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Learn more about our approach.

GitHub: github.com/SakanaAI/pet...
Online Technical Report: pub.sakana.ai/pdnca

GitHub - SakanaAI/petri-dish-nca

November 5, 2025 at 12:28 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Petri Dish Neural Cellular Automata (PD-NCA) is a new ALife substrate that consists of a differentiable world where multiple NCA learn to self-replicate and grow via ongoing gradient descent. Every individual is constantly trying to grow, all the while learning to adapt and out-compete its neighbors.

November 5, 2025 at 12:28 AM

sakanaai.bsky.social

@sakanaai.bsky.social

How the ‘Attention is all you need’ paper was born from freedom, not pressure:

October 24, 2025 at 1:34 PM

sakanaai.bsky.social

@sakanaai.bsky.social

There’s a fairly wide gulf in capabilities, both among different LLMs and across different linguistic specifications. Systems find it notably easier to deal with settings that are more common cross-linguistically than with those that are rarer.

PDF arxiv.org/abs/2510.07591
Code github.com/SakanaAI/IASC

IASC: Interactive Agentic System for ConLangs

We present a system that uses LLMs as a tool in the development of Constructed Languages. The system is modular in that one first creates a target phonology for the language using an agentic approach ...

October 10, 2025 at 4:58 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Our goals with IASC:

1/ We hope that these tools will be fun to use for creating artificially constructed languages.

2/ We are interested in exploring what LLMs ‘know’ about language—not what they know about any particular language, but how much they know about and understand linguistic concepts.

October 10, 2025 at 4:56 AM

sakanaai.bsky.social

@sakanaai.bsky.social

We are happy to announce the release of IASC, an Interactive Agentic System for ConLangs (Constructed Languages).

GitHub: github.com/SakanaAI/IASC

GitHub - SakanaAI/IASC: LLMs for Constructed Languages

October 10, 2025 at 4:55 AM

sakanaai.bsky.social

@sakanaai.bsky.social

By making ShinkaEvolve open-source, our goal is to democratize access to advanced discovery tools. We envision it as a companion to help scientists and engineers, building efficient, nature-inspired systems to unlock the future of AI research.

GitHub Project: github.com/SakanaAI/Shi...

GitHub - SakanaAI/ShinkaEvolve

September 25, 2025 at 5:59 AM

sakanaai.bsky.social

@sakanaai.bsky.social

ShinkaEvolve's efficiency comes from three key innovations:

1) Adaptive parent sampling to balance exploration and exploitation.

2) Novelty-based rejection filtering to avoid redundant work.

3) A bandit-based LLM ensemble that dynamically picks the best model for the job.

September 25, 2025 at 5:59 AM
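
To make the three mechanisms named in the post above more concrete, here is a minimal Python sketch of what each idea could look like in isolation. It is not ShinkaEvolve's code: the function and class names (`sample_parent`, `is_novel`, `UCB1ModelPicker`), the softmax weighting, the token-overlap similarity, the 0.9 threshold, and the choice of UCB1 as the bandit rule are all assumptions made for illustration.

```python
import math
import random


def sample_parent(archive, temperature=1.0, rng=random):
    """Pick a parent with probability proportional to softmax(fitness / temperature).

    archive is a list of (program_text, fitness) pairs (hypothetical layout).
    A low temperature exploits the best programs; a high one explores more.
    """
    best = max(fitness for _, fitness in archive)
    weights = [math.exp((fitness - best) / temperature) for _, fitness in archive]
    return rng.choices([prog for prog, _ in archive], weights=weights, k=1)[0]


def jaccard(a, b):
    """Token-set overlap in [0, 1]; a simple stand-in for a learned similarity."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 1.0


def is_novel(candidate, archive, threshold=0.9):
    """Reject a candidate that is too similar to any program already archived."""
    return all(jaccard(candidate, prog) < threshold for prog, _ in archive)


class UCB1ModelPicker:
    """UCB1 bandit over model names: favors high mean reward, but keeps exploring."""

    def __init__(self, models):
        self.models = list(models)
        self.counts = {m: 0 for m in self.models}
        self.totals = {m: 0.0 for m in self.models}

    def pick(self):
        # Try every model once before trusting the confidence bound.
        for m in self.models:
            if self.counts[m] == 0:
                return m
        n = sum(self.counts.values())
        return max(
            self.models,
            key=lambda m: self.totals[m] / self.counts[m]
            + math.sqrt(2 * math.log(n) / self.counts[m]),
        )

    def update(self, model, reward):
        """Record the reward (e.g. fitness gain) obtained with this model."""
        self.counts[model] += 1
        self.totals[model] += reward
```

In a loop, one would call `sample_parent`, ask the model returned by `pick()` for a mutation, drop the result if `is_novel` returns `False`, and otherwise evaluate it and pass the improvement to `update`.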

sakanaai.bsky.social

@sakanaai.bsky.social

3/ LLM Training: It discovered a novel load balancing loss for MoE models, improving model performance and perplexity.

September 25, 2025 at 5:58 AM
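
For context on what a load balancing loss does, the sketch below implements the conventional auxiliary loss popularized by the Switch Transformer, `alpha * N * sum_i f_i * P_i`. This is the standard baseline, not the novel loss that the post says ShinkaEvolve discovered; the function name, the list-based input format, and the `alpha=0.01` default are assumptions for illustration.

```python
def load_balancing_loss(router_probs, alpha=0.01):
    """Switch-Transformer-style auxiliary loss: alpha * N * sum_i f_i * P_i.

    router_probs: one list per token, holding a probability for each of N experts.
    f_i is the fraction of tokens whose top-1 expert is i.
    P_i is the mean router probability assigned to expert i.
    """
    num_tokens = len(router_probs)
    num_experts = len(router_probs[0])
    f = [0.0] * num_experts
    p = [0.0] * num_experts
    for probs in router_probs:
        top = max(range(num_experts), key=probs.__getitem__)
        f[top] += 1.0 / num_tokens
        for i, q in enumerate(probs):
            p[i] += q / num_tokens
    return alpha * num_experts * sum(fi * pi for fi, pi in zip(f, p))
```

The value is smallest when tokens are spread evenly across experts and grows as routing collapses onto a few of them, which is why it is added to the training objective as a penalty.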

sakanaai.bsky.social

@sakanaai.bsky.social

2/ Competitive Programming: On ALE-Bench, it improved an existing agent's solution, turning a 5th place result into a 2nd place leaderboard rank for one task.

September 25, 2025 at 5:58 AM

sakanaai.bsky.social

@sakanaai.bsky.social

We applied ShinkaEvolve to a diverse set of hard problems:

1/ AIME Math Reasoning: It evolved sophisticated agentic scaffolds that significantly outperform strong baselines, discovering a Pareto frontier of solutions trading performance for efficiency.

September 25, 2025 at 5:58 AM
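
As a small illustration of the Pareto frontier mentioned in the post above, the sketch below keeps only the candidates that no other candidate beats on both axes at once. The `(name, accuracy, cost)` tuple layout and the function name are hypothetical, chosen only for this example; they do not come from the ShinkaEvolve report.

```python
def pareto_front(candidates):
    """Return the candidates that are not dominated by any other candidate.

    Each candidate is (name, accuracy, cost): higher accuracy and lower cost
    are better. One candidate dominates another if it is at least as good on
    both axes and strictly better on at least one.
    """
    front = []
    for name, acc, cost in candidates:
        dominated = any(
            (a >= acc and c <= cost) and (a > acc or c < cost)
            for _, a, c in candidates
        )
        if not dominated:
            front.append((name, acc, cost))
    return front
```

With made-up inputs such as `[("A", 0.5, 1), ("B", 0.7, 3), ("C", 0.6, 4)]`, candidate `C` is dropped because `B` is both more accurate and cheaper, while `A` and `B` remain as two different trade-offs.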

sakanaai.bsky.social

@sakanaai.bsky.social

On the classic circle packing optimization problem, ShinkaEvolve discovered a new state-of-the-art solution using only 150 samples. This is a massive leap in efficiency compared to previous methods that required thousands of evaluations.

September 25, 2025 at 5:57 AM

sakanaai.bsky.social

@sakanaai.bsky.social

Many evolutionary AI systems are powerful but act like brute-force engines, burning thousands of samples to find good solutions. This makes discovery slow and expensive. We took inspiration from the efficiency of nature. ‘Shinka’ (進化) is Japanese for evolution.

September 25, 2025 at 5:57 AM
