Edoardo Debenedetti @NeurIPS
@edebenedetti.bsky.social
PhD student at ETH Zurich | Student Researcher at Google | Agent security and, more generally, ML security and privacy

edoardo.science
spylab.ai
Reposted by Edoardo Debenedetti @NeurIPS
I am at NeurIPS 🇨🇦, please reach out if you want to grab a coffee!
December 12, 2024 at 10:36 PM
Reposted by Edoardo Debenedetti @NeurIPS
SPY Lab is in Vancouver for NeurIPS! Come say hi if you see us around 🕵️
December 10, 2024 at 7:43 PM
I'm in Vancouver for NeurIPS! Feel free to reach out if you wanna meet to chat about security and privacy, especially in the context of LLM agents!
December 10, 2024 at 2:59 PM
Reposted by Edoardo Debenedetti @NeurIPS
Come do open AI with us in Zurich!
We're hiring PhD students, postdocs (and faculty!)
Zurich is a great place to live and do research. It became a slightly better one overnight! Excited to see OAI opening an office here with such a great starting team 🎉
OK, it is yesterday's news already, but a good night's sleep is important.

After 7 amazing years at Google Brain/DM, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish the OpenAI Zurich office. Proud of our past work and looking forward to the future.
December 4, 2024 at 1:49 PM
Feel free to recommend more researchers for @javirandor.com to add to the list!
I am curating a list of researchers working on AI Safety and Security here: go.bsky.app/BcjeVbN

Reply to this post with your user or other people you think should be included!
December 4, 2024 at 11:31 AM
Reposted by Edoardo Debenedetti @NeurIPS
Apropos of today's Overleaf downtime/slowness: remember to have your files backed up on GitHub or locally! What if this happened on the day of a conference deadline?
December 3, 2024 at 4:14 PM
Reposted by Edoardo Debenedetti @NeurIPS
Anyone may be able to compromise LLMs with malicious content posted online. With just a small amount of data, adversaries can backdoor chatbots to become unusable for RAG, or bias their outputs towards specific beliefs. Check our latest work! 👇🧵
November 25, 2024 at 12:27 PM
Reposted by Edoardo Debenedetti @NeurIPS
Ensemble Everything Everywhere is a defense against adversarial examples that people got quite excited about a few months ago (in particular, the defense causes "perceptually aligned" gradients just like adversarial training).

Unfortunately, we show it's not robust...

arxiv.org/abs/2411.14834
Gradient Masking All-at-Once: Ensemble Everything Everywhere Is Not Robust
Ensemble everything everywhere is a defense to adversarial examples that was recently proposed to make image classifiers robust. This defense works by ensembling a model's intermediate representations...
November 25, 2024 at 8:38 AM