https://floriantramer.com
https://spylab.ai
But the conclusions of our paper don't change drastically: there is significant gradient masking (as shown by the transfer attack), and the CIFAR robustness is at most in the 15% range. Still cool though!
We'll see if we can fix the full attack.
Our results show that current unlearning methods for AI safety only obfuscate dangerous knowledge, just like standard safety training.
Here's what we found👇
We're hiring PhD students, postdocs (and faculty!)
After 7 amazing years at Google Brain/DeepMind, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish the OpenAI Zurich office. Proud of our past work and looking forward to the future.
Amazing collaboration with Yiming Zhang during our internships at Meta.
Grateful to have worked with Ivan, Jianfeng, Eric, Nicholas, @floriantramer.bsky.social and Daphne.
Unfortunately, we show it's not robust...
arxiv.org/abs/2411.14834