Geoffrey Irving
@girving.bsky.social
Chief Scientist at the UK AI Security Institute (AISI). Previously DeepMind, OpenAI, Google Brain, etc.
Pinned
Do you want to fund AI alignment research?

The AISI Alignment Team and I have reviewed >800 Alignment Project applications from 42 countries, and ~100 of them are very promising. Unfortunately, this means we have a £13-17M funding gap! Thread with details! 🧵
I am very excited that AISI is announcing over £15M in funding for AI alignment and control, in partnership with other governments, industry, VCs, and philanthropists!

Here is a 🧵 about why it is important to bring more independent ideas and expertise into this space.

alignmentproject.aisi.gov.uk
The Alignment Project by AISI — The AI Security Institute
The Alignment Project funds groundbreaking AI alignment research to address one of AI’s most urgent challenges: ensuring advanced systems act predictably, safely, and for society’s benefit.
alignmentproject.aisi.gov.uk
A perk of being an American living in London who is from Alaska is that frequently when talking about temperatures I can refer to just "40 below" with no qualifiers.
November 28, 2025 at 10:32 AM
The UK AI Security Institute ran an Alignment Conference from 29-31 November in London! The goal was to gather a mix of people experienced in and new to alignment, and get into the details of novel approaches to alignment and related problems. Hopefully we helped create some new research bets! 🧵
November 13, 2025 at 5:00 PM
Reposted by Geoffrey Irving
🚨New paper🚨

From a technical perspective, ensuring the safety of open-weight models is AI safety in hard mode. But there's still a lot of progress to be made. Our new paper covers 16 open problems.

🧵🧵🧵
November 12, 2025 at 2:04 PM
There is a real chance that my most important positive contribution to the world will have been to say something wrong on the internet.
November 10, 2025 at 10:24 AM
The UK AISI Cyber Autonomous Systems Team is hiring propensity researchers to grow the science around whether models *are likely* to attempt dangerous behaviour, as opposed to whether they are capable of doing so. 🧵

job-boards.eu.greenhouse.io/aisi/jobs/47...
Research Scientist - CAST Propensity
London, UK
job-boards.eu.greenhouse.io
November 7, 2025 at 9:14 AM
Spooky:

import Batteries.Data.UInt

-- `UInt64.ofNat UInt64.size` wraps around to 0, so subtracting 1 underflows
-- to the maximum `UInt64` value.
def danger : UInt64 := UInt64.ofNat UInt64.size - 1
-- The kernel evaluates `danger` to the correct value...
theorem danger_eq_large : danger = 18446744073709551615 := by decide +kernel
-- ...but compiled evaluation disagrees, so `native_decide` proves a different value.
theorem danger_eq_one : danger = 1 := by native_decide
-- The two theorems together prove `False`: a soundness bug via `native_decide`.
theorem bad : False := by simpa using danger_eq_large.symm.trans danger_eq_one
October 31, 2025 at 10:04 PM
Reposted by Geoffrey Irving
the time it would have taken me would probably have been of order of magnitude an hour (an estimate that comes with quite wide error bars). So it looks as though we have entered the brief but enjoyable era where our research is greatly sped up by AI but AI still needs us. 3/3
October 31, 2025 at 7:25 PM
Reposted by Geoffrey Irving
I published a new post on my rarely updated personal blog! It's a sequel of sorts to my Quanta coverage of the Busy Beaver game, focusing on a particularly fearsome Turing machine known by the awesome name Antihydra.
Why Busy Beaver Hunters Fear the Antihydra
In which I explore the biggest barrier in the busy beaver game. What is Antihydra, what is the Collatz conjecture, how are they connected, and what makes them so daunting?
benbrubaker.com
October 27, 2025 at 4:04 PM
Another strong transition from @matt-levine.bsky.social.
October 23, 2025 at 7:59 PM
New AISI report mapping cruxes for whether AI progress towards systems near or beyond human level on most cognitive tasks might be fast or slow. The goal is not to resolve uncertainties but to reflect them: we don't know how AI will go, and we should plan accordingly!

www.aisi.gov.uk/research/und...
Understanding AI Trajectories: Mapping the Limitations of Current AI Systems
www.aisi.gov.uk
October 23, 2025 at 3:17 PM
New open source library from the UK AI Security Institute! ControlArena lowers the barrier to secure and reproducible AI control research, to boost work on blocking and detecting malicious actions in case AI models are misaligned. In use by researchers at GDM, Anthropic, Redwood, and MATS! 🧵
October 22, 2025 at 6:04 PM
There's a nice recent post by @tobyord.bsky.social on the efficiency of pretraining vs. RL, arguing that RL can learn at most 1 bit per episode given binary reward. It's right that RL is less efficient, but 1 bit is not actually a limit in practice. 🧵 on why:

www.tobyord.com/writing/inef...
The Extreme Inefficiency of RL for Frontier Models — Toby Ord
The new scaling paradigm for AI reduces the amount of information a model could learn per hour of training by a factor of 1,000 to 1,000,000. I explore what this means and its implications for scaling...
www.tobyord.com
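The 1-bit figure is just the Shannon entropy of a binary reward signal; here is a quick numerical illustration of that cap (my own sketch, not code from either thread):

```python
import math

def reward_entropy_bits(p: float) -> float:
    """Shannon entropy (in bits) of a Bernoulli(p) binary reward.

    This is the most information a single binary observation can carry,
    maximised at p = 0.5 and vanishing for deterministic rewards.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a reward that is always 0 or always 1 carries no information
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

# A fair 0/1 reward carries exactly 1 bit; a lopsided one carries less.
print(reward_entropy_bits(0.5))             # → 1.0
print(round(reward_entropy_bits(0.9), 3))   # → 0.469
```

The 🧵 above is about why, despite this cap on the reward signal itself, 1 bit per episode is not the practical limit on what RL teaches the model.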
October 16, 2025 at 8:53 AM
Is there Matt Levine but for pure mathematics?
October 1, 2025 at 5:30 PM
Ominous start to a Wikipedia page about a formula...

en.wikipedia.org/wiki/Fa%C3%A...
September 29, 2025 at 9:02 PM
Reposted by Geoffrey Irving
Amongst the projects funded is my project www.renaissancephilanthropy.org/a-dataset-of... to create what in 2025 is a super-hard dataset of pairs (informal hard proof, formal statement) of recent results from top journals. The challenge for the machine is to formalise the rest of the paper.
www.renaissancephilanthropy.org
September 18, 2025 at 8:25 AM
Cas is very good and you should hire him as faculty!
📌📌📌
I'm excited to be on the faculty job market this fall. I just updated my website with my CV.
stephencasper.com
Stephen Casper
Visit the post for more.
stephencasper.com
September 4, 2025 at 12:38 PM
From near the end of Sleepwalkers, by Christopher Clark, as World War I starts.
August 23, 2025 at 3:40 PM
Reposted by Geoffrey Irving
I'm honored to serve as Expert Advisor for "The Alignment Project", an international initiative dedicated to ensuring AI systems are safe and beneficial. They are providing significant funding, compute, and collaboration opportunities for researchers---including those in cogsci/neuro. Please apply!
I am very excited that AISI is announcing over £15M in funding for AI alignment and control, in partnership with other governments, industry, VCs, and philanthropists!

Here is a 🧵 about why it is important to bring more independent ideas and expertise into this space.

alignmentproject.aisi.gov.uk
The Alignment Project by AISI — The AI Security Institute
The Alignment Project funds groundbreaking AI alignment research to address one of AI’s most urgent challenges: ensuring advanced systems act predictably, safely, and for society’s benefit.
alignmentproject.aisi.gov.uk
August 20, 2025 at 5:54 PM
The correct mathematical definition is the one that makes the most intermediate lemmas happen to be true, along the way to the result you care about.
August 20, 2025 at 5:53 PM
Reposted by Geoffrey Irving
🧵 New paper from UK AISI x @eleutherai.bsky.social that I led with @kyletokens.bsky.social:

Open-weight LLM safety is both important & neglected. But filtering dual-use knowledge from pre-training data improves tamper resistance *>10x* over post-training baselines.
August 12, 2025 at 11:45 AM
In Lie group theory, the Killing form tells you which elements correspond to functionals that annihilate a given subspace. It is named after Wilhelm Killing.

en.wikipedia.org/wiki/Killing...
Killing form - Wikipedia
en.wikipedia.org
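For reference, the standard definition (a textbook fact, not from the post): for a Lie algebra $\mathfrak{g}$, the Killing form is

\[
B(X, Y) = \operatorname{tr}(\operatorname{ad} X \circ \operatorname{ad} Y),
\]

and $X \mapsto B(X, \cdot)$ sends each element to a linear functional on $\mathfrak{g}$, so the $B$-orthogonal complement of a subspace is exactly the set of elements whose functionals annihilate it.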
August 10, 2025 at 5:55 PM
Automatic routing to models of varying levels of capability may be bad for the quality of AI discourse, even if it is a great idea in other respects.
August 9, 2025 at 7:00 PM
My hobby is finding compiler bugs in Lean, apparently.
August 3, 2025 at 9:36 PM
Reposted by Geoffrey Irving
This newsletter has been an embarrassment internally, but its failure isn’t a big concern. www.bloomberg.com/opinion/news...
You Can Insider Trade NFTs Now
Not legal advice. Also Builder.ai, Boring Co., Harvard and AI.
www.bloomberg.com
July 31, 2025 at 6:00 PM