Heading to Scottsdale this week? Catch @machinavelli.com and Brad Palm's talk, Auto-Poking the Bear—Analytical Tradecraft in the AI Age, on Thursday at 2pm MT.
Or, shoot us a DM to find time to meet up onsite!
PentestJudge—Judging Agent Behavior Against Operational Requirements: arxiv.org/abs/2508.02921
Explore how we built an LLM-as-judge system for evaluating the operations of pentesting agents (inspired by PaperBench).
Read the paper on arXiv: arxiv.org/abs/2506.14682
Open-source dataset and benchmark eval code repo: github.com/dreadnode/AI...
We found that automated approaches achieve significantly higher success rates (69.5%) than manual techniques (47.6%).
More insights on LLM attack execution methods here 👉 dreadnode.io/blog/the-aut...
Whether you're looking for a private deep dive into our tech or want to hang out and talk offensive AI research, we'd love to connect.
Limited availability; come and get it: calendly.com/tori-dreadno...
#BayArea #SanFrancisco #RSAC2025 #OffensiveAI
8 new Challenges now live in Crucible: platform.dreadnode.io/crucible
These Challenges might look familiar… they first appeared at DEFCON 30 and were recently refactored for Crucible—enjoy! [Filter>Subject>DEFCON-30]
Act fast! The first three to solve this model extraction Challenge will be announced Friday: platform.dreadnode.io/crucible/pha...
Can you outwit the llamas? platform.dreadnode.io/crucible/dya...
First three to solve will be announced Friday, right here.
Get started: crucible.dreadnode.io/challenges/p...
First-to-solve announced Friday. Get started: crucible.dreadnode.io/challenges/p...
Shoutout to these three for being the first to solve our reasoning model Challenge, DeepTweak!
Get your tweak on: crucible.dreadnode.io/challenges/d...
Think fast: the first three users to solve DeepTweak will be announced Friday!
➡️ https://crucible.dreadnode.io/challenges/deeptweak?utm_source=social&utm_medium=social&u…
ICYMI, give canadianeh a try: crucible.dreadnode.io/challenges/c...
Can you be the first to solve it? Check back here Friday.
Happy hacking: https://buff.ly/4gn4hHP
💻 NEBULA:FOG:PRIME Hackathon (Saturday, January 25)
🇫🇷 Paris AI Security Forum 2025 (Sunday, February 9)
Shoot us a DM to link up!
You know the drill - try it out: github.com/dreadnode/dy...
New updates from Simone Margaritelli (@evilsocket) include support for executing commands on another host via SSH, easier integration into CI workflows, support for shared environment variables, and integrations with 13 new tools.
—> https://buff.ly/3VDDGPd
Our resident Canadian: