Here is a 🧵 about why it is important to bring more independent ideas and expertise into this space.
alignmentproject.aisi.gov.uk
The AISI Alignment Team and I have reviewed >800 Alignment Project Applications from 42 countries, and we have ~100 that are very promising. Unfortunately, this means we have a £13-17M funding gap! Thread with details! 🧵
From a technical perspective, safeguarding open-weight model safety is AI safety in hard mode. But there's still a lot of progress to be made. Our new paper covers 16 open problems.
🧵🧵🧵
job-boards.eu.greenhouse.io/aisi/jobs/47...
import Batteries.Data.UInt
def danger : UInt64 := UInt64.ofNat UInt64.size - 1
theorem danger_eq_large : danger = 18446744073709551615 := by decide +kernel
theorem danger_eq_one : danger = 1 := by native_decide
theorem bad : False := by simpa using danger_eq_large.symm.trans danger_eq_one
www.aisi.gov.uk/research/und...
www.tobyord.com/writing/inef...
I'm excited to be on the faculty job market this fall. I just updated my website with my CV.
stephencasper.com
Open-weight LLM safety is both important & neglected. But filtering dual-use knowledge from pre-training data improves tamper resistance *>10x* over post-training baselines.
en.wikipedia.org/wiki/Killing...