Website: javirando.com
More info here cohere.com/events/coher...
Join my oral presentation on Saturday at 4:30 pm to learn more.
Say hi if you want to chat about ML privacy and security
(or speciality ☕)
Want to learn how an indirect / cross prompt injection attack works? Want to try something different to an advent of code?
Then, I have a challenge for you!
The LLMail-Inject competition (llmailinject.azurewebsites.net) starts at 11am UTC (that's in 5min!)
The competition is hosted at SaTML 2025 and has a pool of $10k in prizes! What are you waiting for?
I will be presenting two (spotlight!) works. Come say hi to our posters.
Our results show that current unlearning methods for AI safety only obfuscate dangerous knowledge, just like standard safety training.
Here's what we found👇
We're hiring PhD students, postdocs (and faculty!)
After 7 amazing years at Google Brain/DM, I am joining OpenAI. Together with @xzhai.bsky.social and @giffmana.ai, we will establish the OpenAI Zurich office. Proud of our past work and looking forward to the future.
Reply to this post with your user or other people you think should be included!
📖 javirando.com/blog/2024/ja...
Unfortunately, we show it's not robust...
arxiv.org/abs/2411.14834