Lightnews — Scholar-powered news

Steve Byrnes

@stevebyrnes.bsky.social

New blog post! “Social drives 2: ‘Approval Reward’, from norm-enforcement to status-seeking”. I try to explain the path from an RL reward function in the brain, to deep truths about the human psyche… www.lesswrong.com/posts/fPxgFH... (1/6)

Social drives 2: “Approval Reward”, from norm-enforcement to status-seeking — LessWrong

…Approval Reward is a brain signal that leads to: • Pleasure (positive reward) when my friends and idols seem to have positive feelings about me, or about something related to me, or about what I’m do...

www.lesswrong.com

November 12, 2025 at 9:01 PM

Steve Byrnes

@stevebyrnes.bsky.social

New blog post! “Social drives 1: ‘Sympathy Reward’, from compassion to dehumanization”. This is the 1st of 2 posts building an ever-better bridge that connects from neuroscience & algorithms on one shore, to everyday human experience on the other… www.lesswrong.com/posts/KuBiv9... (1/5)

November 10, 2025 at 3:27 PM

Steve Byrnes

@stevebyrnes.bsky.social

Blog post: “Excerpts from my neuroscience to-do list” www.lesswrong.com/posts/c6Job6...

www.lesswrong.com

October 7, 2025 at 12:26 AM

Steve Byrnes

@stevebyrnes.bsky.social

“The human niche” includes living on every continent, walking on the moon, inventing computers & nuclear weapons, and unraveling the secrets of the universe. We need a term for AI systems that can occupy this “niche”. (1/6)

Sam Gershman @gershbrain.bsky.social · Sep 28

I had always assumed that "general intelligence" refers to intelligent systems that show efficiency on essentially all tasks (that's the "general" part). This is why I also always assumed that (by the NFL theorems) general intelligence is impossible.

September 28, 2025 at 11:33 AM

Steve Byrnes

@stevebyrnes.bsky.social

Just read the new book “If Anyone Builds It, Everyone Dies”. Upshot: Recommended! I ~90% agree with it. Thread: ifanyonebuildsit.com

If Anyone Builds It, Everyone Dies

The race to superhuman AI risks extinction, but it's not too late to change course.

ifanyonebuildsit.com

September 18, 2025 at 7:40 PM

Steve Byrnes

@stevebyrnes.bsky.social

Blog post: Optical rectennas are not a promising clean energy technology www.lesswrong.com/posts/gKCavz...

Optical rectennas are not a promising clean energy technology — LessWrong

“Optical rectennas” (or sometimes “nantennas”) are a technology that is sometimes advertised as a path towards converting solar energy to electricity with higher efficiency than normal solar cells...

www.lesswrong.com

September 11, 2025 at 11:35 PM

Steve Byrnes

@stevebyrnes.bsky.social

Clarification: When I shared this meme 2 years ago, I was referring specifically to traditional task-based fMRI studies.

“Functional Connectomics” fMRI studies, by contrast, would be flying overhead in a helicopter, strafing the water with a machine gun

August 31, 2025 at 8:46 PM

Steve Byrnes

@stevebyrnes.bsky.social

Blog post: “Neuroscience of human sexual attraction triggers (3 hypotheses)” www.lesswrong.com/posts/ktydLo...

August 25, 2025 at 7:42 PM

Steve Byrnes

@stevebyrnes.bsky.social

Blog post: “Four ways learning Econ makes people dumber re: future AI” www.alignmentforum.org/posts/xJWBof...

Four ways learning Econ makes people dumber re: future AI — AI Alignment Forum

There’s a funny thing where economics education paradoxically makes people DUMBER at thinking about future AI. Econ textbooks teach concepts & frames that are great for most things, but counterproduct...

www.alignmentforum.org

August 21, 2025 at 7:03 PM

Steve Byrnes

@stevebyrnes.bsky.social

Uploaded a new PDF version of ↓, with various minor changes accumulated over the last 5 months—a few new paragraphs, new references, typo fixes, etc. See the alignment forum (blog) version for detailed changelogs at the bottom of each post.

Steve Byrnes @stevebyrnes.bsky.social · Mar 22

By popular demand, “Intro to brain-like AGI safety” is now also available as an easily citable & printable 200-page PDF preprint! Link & highlights in thread 🧵 1/13

August 12, 2025 at 2:02 AM

Steve Byrnes

@stevebyrnes.bsky.social

If you too would like to be falsely accused of AI ghostwriting from how effortlessly and fluently you can touch-type em dashes and other unicode glyphs… then check out my handy guide!
[It’s from a decade ago, but I keep it updated.] sjbyrnes.com/unicode.html

Touch-Typing Unicode: How and Why

Let’s say I want to type the character μ. I look up the shortcut on the cheat sheet, and see that it’s “[compose key] * m”. So if the compose key is Right-Alt (for example), I would press and release Right-Alt, then *, then m. And μ appears!

sjbyrnes.com

August 8, 2025 at 4:52 PM

Steve Byrnes

@stevebyrnes.bsky.social

I went on a podcast! lironshapira.substack.com/p/the-man-wh...

The Man Who Might SOLVE AI Alignment — Dr. Steven Byrnes, AGI Safety Researcher @ Astera Institute

Dr. Steven Byrnes, UC Berkeley physics PhD and Harvard physics postdoc, is an AI safety researcher at the Astera Institute, focused on solving the technical AI alignment problem.

lironshapira.substack.com

August 6, 2025 at 1:22 PM

Steve Byrnes

@stevebyrnes.bsky.social

New blog post: “The perils of under- vs over-sculpting AGI desires”. (1/5) www.alignmentforum.org/posts/grgb2i...

August 5, 2025 at 6:28 PM

Steve Byrnes

@stevebyrnes.bsky.social

New blog post: “Behaviorist” RL Reward Functions Lead To Scheming. I argue that, if RL is used to push AI capabilities towards AGI, it will eventually lead to AI that “schemes” (feigns niceness, while looking for a chance for escape, world takeover, etc.) (1/3) www.alignmentforum.org/posts/FNJF3S...

“Behaviorist” RL reward functions lead to scheming — AI Alignment Forum

I will argue that a large class of reward functions, which I call “behaviorist”, and which includes almost every reward function in the RL and LLM literature, are all doomed to eventually lead to AI t...

www.alignmentforum.org

July 23, 2025 at 5:48 PM

Steve Byrnes

@stevebyrnes.bsky.social

New 2-post series on “foom & doom” scenarios, where radical superintelligence arises seemingly out of nowhere and wipes out humanity. These were often discussed a decade ago, but are now widely dismissed due to LLMs. …Well call me old fashioned, but I’m still expecting foom & doom 🧵 (1/10)

June 23, 2025 at 6:46 PM

Steve Byrnes

@stevebyrnes.bsky.social

I started an announcements mailing list on substack. It will basically just be links to new blog posts when I publish them, very similar to following me on bluesky, but in the comfort of your own email inbox. stevebyrnes1.substack.com

Steve Byrnes’s Substack | Substack

Mailing list for announcements of new blog posts and other works. Click to read Steve Byrnes’s Substack, a Substack publication. Launched 33 minutes ago.

stevebyrnes1.substack.com

May 22, 2025 at 8:00 PM

Steve Byrnes

@stevebyrnes.bsky.social

Blog post: “Reward button alignment” www.alignmentforum.org/posts/JrTk2p...

For RL agents (incl “brain-like AGI”), there’s a reward function, with huge effect on what the AI winds up wanting to do.

One option is: hook reward to a physical button. And then the AI wants you to press the button. 1/2

Reward button alignment — AI Alignment Forum

In the context of model-based RL agents in general, and brain-like AGI in particular, part of the source code is a reward function. The programmers get to put whatever code they want into the reward f...

www.alignmentforum.org

May 22, 2025 at 7:18 PM

Steve Byrnes

@stevebyrnes.bsky.social

Blog post: “Re @slimemoldtimemold.bsky.social : negative feedback on negative feedback” www.lesswrong.com/posts/LfqFZv...

Re SMTM: negative feedback on negative feedback — LessWrong

SlimeMoldTimeMold (SMTM) recently finished a 13-part series on psychology, “The Mind in the Wheel”, centering on feedback loops (“cybernetics”) as ki…

www.lesswrong.com

May 14, 2025 at 8:37 PM

Steve Byrnes

@stevebyrnes.bsky.social

New blog post: “‘The Era of Experience’ has an Unsolved Technical Alignment Problem” (1/4) 🧵 www.alignmentforum.org/posts/TCGgiJ...

“The Era of Experience” has an unsolved technical alignment problem — AI Alignment Forum

Every now and then, some AI luminaries (1) propose that the future of powerful AI will be reinforcement learning agents—an algorithm class that in many ways has more in common with MuZero (2019) than ...

www.alignmentforum.org

April 24, 2025 at 2:12 PM

Steve Byrnes

@stevebyrnes.bsky.social

that must have been a fun experiment

Box 3.2 Brainless conditioning Pavlovian conditioning occurs in a naturally brainless species, sea anemones, but it is also possible to study protostomes that have had their brains removed. An experiment by Horridge[130] demonstrated response–outcome conditioning in decapitated cockroaches and locusts. Subsequent studies showed that either the ventral nerve cord[131,132] or an isolated peripheral ganglion[133] suffices to acquire and retain these memories.

April 16, 2025 at 1:44 AM

Reposted by Steve Byrnes

Convergent Research

@convergentresearch.bsky.social

we made a map!

gap-map.org is a tool we built to help you explore the landscape of R&D gaps holding back science - and the bridge-scale fundamental development efforts that might allow humanity to solve them, across almost two dozen fields

The Gap Map

Explore R&D Gaps and their related Foundational Capabilities.

gap-map.org

April 15, 2025 at 1:17 PM

Steve Byrnes

@stevebyrnes.bsky.social

Couple updates to my old post summarizing the technical alignment problem for brain-like AGI (or more generally, actor-critic model-based reinforcement learning AGI): 🧵(1/3) www.alignmentforum.org/posts/wucncP...

[Intro to brain-like-AGI safety] 10. The alignment problem — AI Alignment Forum

In this post, I discuss the alignment problem for brain-like AGIs—i.e., the problem of making an AGI that’s trying to do some particular thing that the AGI designers had intended for it to be trying t...

www.alignmentforum.org

April 13, 2025 at 2:22 AM

Reposted by Steve Byrnes

Future of Life Institute

@futureoflife.org

📺 📻 New on the FLI Podcast: @asterainstitute.bsky.social artificial general intelligence (AGI) safety researcher @stevebyrnes.bsky.social joins for a discussion diving into the hot topic of AGI, including different paths to it - and why brain-like AGI would be dangerous. 🧵👇

April 4, 2025 at 8:36 PM

Steve Byrnes

@stevebyrnes.bsky.social

By popular demand, “Intro to brain-like AGI safety” is now also available as an easily citable & printable 200-page PDF preprint! Link & highlights in thread 🧵 1/13

March 22, 2025 at 3:31 PM

Steve Byrnes

@stevebyrnes.bsky.social

I have a revised and improved talk introducing my research to a general audience: “Challenges for Safe & Beneficial Brain-Like Artificial General Intelligence”. Thanks to Mila AI Safety Reading Group for the invitation! youtu.be/IXi96sRMKUI

"Challenges for Safe & Beneficial Brain-Like Artificial General Intelligence" talk by Steven Byrnes

YouTube video by Steve Byrnes

youtu.be

March 20, 2025 at 7:42 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news