Rich Harang
banner
rich.harang.org
Rich Harang
@rich.harang.org
Using bad guys to catch math since 2010.
Principal Security Architect (AI/ML) and AI Red Team at NVIDIA.
He/him. Personal account etc; `from std_disclaimers import *`
Safe AI starts with Secure AI.
Choose your warrior.
July 11, 2025 at 11:28 PM
Meanwhile, on Twitter (not "X"; their words not mine)....

(From quick inspection: mostly crypto + telegram scams -- this is about a week's worth)
May 27, 2025 at 12:57 PM
Tapping the "Models give you what you ask for, not what you want" sign yet again.
April 30, 2025 at 4:50 PM
February 26, 2025 at 5:20 PM
I am begging AI Red Teams to stop killing themselves trying to prevent attacks that can be just as easily accomplished by editing client-side HTML.

For example:
February 26, 2025 at 2:10 PM
January 24, 2025 at 6:34 PM
Alternately:
January 23, 2025 at 2:09 PM
Without downloading new pictures/videos where are you mentally?
January 23, 2025 at 2:05 PM
Today's mood.
January 10, 2025 at 3:28 PM
An arcane tome filled with occult knowledge about the true workings of the world, that causes madness and despair in all who pursue its dark secrets? Yeah we've got one in the back.
January 8, 2025 at 6:02 PM
Apropos of the "models pretending to escape from their server" thing:

(from transformer-circuits.pub/2024/scaling...)
December 7, 2024 at 12:48 PM
November 23, 2024 at 7:09 PM
November 18, 2024 at 3:53 PM
Never change, "Answer with AI" features.
November 11, 2024 at 3:22 PM
Never change, Microsoft. You're doing great.
November 9, 2024 at 3:04 PM
November 7, 2024 at 8:16 PM
The team's response to _every_ LLM security finding this week:
October 29, 2024 at 5:56 PM
The good Confluence (at Harper's Ferry).
October 21, 2024 at 12:04 PM
Bucket list item complete: finally saw the northern lights in person. Had about ten minutes when you could see the green ribbons with the naked eye, no long exposure photo needed.
November 27, 2023 at 10:42 PM
October 24, 2023 at 5:29 PM
October 9, 2023 at 8:37 PM
So remember the "mango pudding" LLM backdooring attack? How safe do you feel using these models now?
July 3, 2023 at 1:40 PM
PS I had to see this so now you do
May 3, 2023 at 5:07 PM
So, uh, that langchain vuln is pretty bad.

(https://nvd.nist.gov/vuln/detail/CVE-2023-29374)
May 2, 2023 at 11:10 PM
permanently linked in my brain to
April 28, 2023 at 4:48 PM