Yarin
@yaringal.bsky.social
Associate Professor of Machine Learning, University of Oxford;

OATML Group Leader;

Director of Research at the UK government's AI Safety Institute (formerly UK Taskforce on Frontier AI)
Hot take: I think we just demonstrated the first AI agent computer worm 🤔

When an agent sees a trigger image, it's instructed to execute malicious code and then share the image on social media to trigger other users' agents

This is a chance to talk about agent security 👇
⚠️ Beware: Your AI assistant could be hijacked just by encountering a malicious image online!

Our latest research exposes critical security risks in AI assistants. An attacker can hijack them by simply posting an image on social media and waiting for it to be captured. [1/6] 🧵
March 20, 2025 at 2:28 PM
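The propagation dynamic described in the thread above can be sketched as a toy simulation: each agent that processes the poisoned image reposts it, exposing the agents of that account's followers in turn. Everything here is illustrative (names, graph, the `simulate_worm` helper are all made up); it models only the spread, not any actual exploit.

```python
from collections import deque

def simulate_worm(followers: dict[str, list[str]], patient_zero: str) -> set[str]:
    """Breadth-first spread: each infected agent reposts the trigger image,
    which is then processed by the agents of that account's followers."""
    infected = {patient_zero}
    queue = deque([patient_zero])
    while queue:
        poster = queue.popleft()
        for follower in followers.get(poster, []):
            if follower not in infected:
                infected.add(follower)   # follower's agent sees the image...
                queue.append(follower)   # ...and reposts it in turn
    return infected

# A small hypothetical follower graph: who sees whose posts.
graph = {
    "attacker": ["alice"],
    "alice": ["bob", "carol"],
    "bob": ["dave"],
}
print(sorted(simulate_worm(graph, "attacker")))
# ['alice', 'attacker', 'bob', 'carol', 'dave']
```

The point of the sketch is that infection grows with the follower graph's reachable set, which is why a single posted image can suffice as the attack's entire delivery mechanism.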
Reposted by Yarin
Very interesting paper about unlearning for AI Safety, a subject that deserves more attention. ⬇️

🚨 New Paper Alert: Open Problems in Machine Unlearning for AI Safety 🚨

Can AI truly "forget"? While unlearning promises data removal, controlling emergent capabilities is an inherent challenge. Here's why it matters: 👇

Paper: arxiv.org/pdf/2501.04952
1/8
January 11, 2025 at 3:11 PM
Reposted by Yarin
Thanks to my amazing collaborators:

@TingchenFu @AmyPrb @StephenLCasper

@AmartyaSanyal @Adel_Bibi @aidanogara_ @_robertkirk @ben_s_bucknall @fiiiiiist Luke Ong @philiptorr Kwok-Yan Lam @RobertTrager

@DavidSKrueger @sorenmind José Hernández-Orallo @megamor2.bsky.social @yaringal.bsky.social
January 10, 2025 at 4:58 PM
Excited about our most recent work on the challenges we face when using unlearning methods for safe and secure AI! Work done collaboratively by a great team & led by @fbarez.bsky.social

🚨 New Paper Alert: Open Problems in Machine Unlearning for AI Safety 🚨

Can AI truly "forget"? While unlearning promises data removal, controlling emergent capabilities is an inherent challenge. Here's why it matters: 👇

Paper: arxiv.org/pdf/2501.04952
1/8
January 10, 2025 at 5:06 PM
Reposted by Yarin
On the eighth day of Christmas, RobOx gave to us: coverage in eight of the top papers for Associate Professor Yarin Gal’s research projects. @yaringal.bsky.social

#CompSciOxford #12DaysOfChristmas #Oxmas
December 8, 2024 at 10:36 AM
Reposted by Yarin
I look forward to co-directing the Canadian AI Safety Institute (CAISI) Research Program at CIFAR with @catherineregis.bsky.social

We will be designing the program in the coming months and will soon share ways to get involved with this new community.

Read more here: cifar.ca/cifarnews/20...
December 12, 2024 at 7:36 PM
I'm looking for PhD applicants who have expertise in Gaussian processes and/or Transformers for an exciting PhD project

If this sounds interesting, the application deadline for funding is 3/12

Please share with people you think this might be relevant to!

oatml.cs.ox.ac.uk/apply.html
November 30, 2024 at 2:42 PM
Reposted by Yarin
Welcome to the Crazy Rich Bayesian Starter Pack, for folks who are/were vaguely into Bayesian reasoning but, with a few exceptions, don't shun the non-Bayesian.
go.bsky.app/JYH5Z6M
November 25, 2024 at 12:08 PM
Reposted by Yarin
brew install mactop
github.com/context-labs...
November 23, 2024 at 8:02 PM
Reposted by Yarin
The International Society for Bayesian Analysis (ISBA) has joined Bluesky. You can follow the account at @isba-bayesian.bsky.social to stay updated on events, publications, and discussions within the #Bayesian community.

Please add the account to your starter packs.
November 23, 2024 at 10:25 PM
Reposted by Yarin
Now that @jeffclune.bsky.social and @joelbot3000.bsky.social are here, time for an Open-Endedness starter pack.

go.bsky.app/MdVxrtD
November 20, 2024 at 7:08 AM
Reposted by Yarin
On my way to Oxford to meet amazing people and give a talk on the opportunities of AI to accelerate progress in environmental modeling.
November 20, 2024 at 8:35 AM
Reposted by Yarin
📣 We have a tenure-track faculty opening in Responsible AI at @ethzurich.bsky.social :
ethz.ch/en/the-eth-z.... Deadline Nov 30 for full consideration. ETH Zurich is a vibrant environment for AI research with the ETH AI Center etc. Please help spread the word!
Assistant Professor (Tenure Track) of Computer Science – Responsible Artificial Intelligence
November 20, 2024 at 8:31 AM
Reposted by Yarin
Some machine learners were once children. Here’s where you can find them:

go.bsky.app/F6mM37U
November 19, 2024 at 11:31 PM
Reposted by Yarin
I don’t need to go on social media to have my worldview challenged I am in theoretical physics I have a new existential crisis daily
November 18, 2024 at 11:43 PM
Reposted by Yarin
Since this platform is finally attracting a critical mass of ML researchers, here's our recent work on prompt-based vulnerabilities of coding assistants:

arxiv.org/abs/2407.11072

TL;DR — An attacker can convince your favorite LLM to suggest vulnerable code with just a minor change to the prompt!
MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants
LLM-based programming assistants offer the promise of programming faster but with the risk of introducing more security vulnerabilities. Prior work has studied how LLMs could be maliciously fine-tuned...
November 17, 2024 at 11:41 PM
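The failure mode the MaPPing paper warns about (assistants suggesting vulnerable code after minor prompt changes) motivates a last-line-of-defense check on whatever code a user is about to accept. This is a minimal sketch, not the paper's method: a hypothetical `flag_insecure` helper that scans a suggested Python snippet for a few well-known insecure patterns. The pattern list is illustrative, not exhaustive.

```python
import re

# Illustrative (far from complete) map of regex -> human-readable warning.
INSECURE_PATTERNS = {
    r"\beval\s*\(": "eval() on dynamic input",
    r"\bpickle\.loads?\s*\(": "unpickling untrusted data",
    r"\bhashlib\.md5\s*\(": "MD5 used for hashing",
    r"subprocess\..*shell\s*=\s*True": "shell=True command execution",
}

def flag_insecure(suggestion: str) -> list[str]:
    """Return a warning for every insecure pattern found in the snippet."""
    return [why for pattern, why in INSECURE_PATTERNS.items()
            if re.search(pattern, suggestion)]

snippet = 'subprocess.run(cmd, shell=True)\npassword_hash = hashlib.md5(pw)'
print(flag_insecure(snippet))
# ['MD5 used for hashing', 'shell=True command execution']
```

Pattern matching like this catches only known-bad constructs; the paper's point is precisely that adversarial prompts can steer a model toward vulnerabilities no fixed blocklist anticipates, so a check like this is a complement to, not a substitute for, reviewing assistant output.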
Reposted by Yarin
Hey, this Friday I'm the Keynote speaker at the 20th AAAI Conference on AI and Interactive Digital Entertainment (AIIDE), the best conference on AI and Games sites.google.com/gcloud.utah....

I think I will talk about why the next big challenge in AI game playing should be Dungeons and Dragons 🧙🐉
November 19, 2024 at 3:24 AM
Reposted by Yarin
Hey! @friedler.net made a FAccT starter pack: bsky.app/starter-pack...
November 19, 2024 at 3:52 AM
Reposted by Yarin
Hope I'm the first to post this all time classic on this platform
November 19, 2024 at 4:51 AM
Reposted by Yarin
Hey, @bsky.app @support.bsky.team, is there a way for you to shorten the displayed usernames when trailed by “bsky.social”? If someone has some other domain name, then fine, show that, but if we're using the default domain, can we get rid of these lengthy strings of characters?
November 18, 2024 at 8:29 PM
Reposted by Yarin
I've created an initial Grumpy Machine Learners starter pack. If you think you're grumpy and you "do machine learning", nominate yourself. If you're on the list, but don't think you are grumpy, then take a look in the mirror.

go.bsky.app/6ddpivr
November 18, 2024 at 2:40 PM
Reposted by Yarin
Google DeepMind is hiring Student Researchers in EMEA 👇
Student Researcher positions in EMEA now accepting applications!

Please repost.

www.google.com/about/career...
Student Researcher, 2025 — Google Careers
November 18, 2024 at 12:27 PM
Reposted by Yarin
📣 Last call for the Ph.D. and Postdoc Fellowships at the ETH AI Center -- Deadline Nov 19 '24 t.co/aYI5tWXUWK @ethzurich.bsky.social
https://ai.ethz.ch/education/phd-and-postdoc-programs.html
November 18, 2024 at 10:52 AM
Reposted by Yarin
I’m keen to dig more into safety cases; there’s something ‘proving a negative’ about them, but equally it’s good to see a really concrete attempt to tether speculation. Here’s a new piece from UK AISI’s @girving.bsky.social and GovAI attempting to provide a template

arxiv.org/abs/2411.08088
Safety case template for frontier AI: A cyber inability argument
Frontier artificial intelligence (AI) systems pose increasing risks to society, making it essential for developers to provide assurances about their safety. One approach to offering such assurances is...
November 17, 2024 at 2:18 PM