Maksym Andriushchenko
@maksym-andr.bsky.social
Faculty at the ELLIS Institute Tübingen and the Max Planck Institute for Intelligent Systems. Leading the AI Safety and Alignment group. PhD from EPFL, supported by Google & OpenPhil PhD fellowships.

More details: https://www.andriushchenko.me/
📣 Incredibly excited to participate in writing the International AI Safety Report, chaired by Yoshua Bengio, as chapter lead for the capabilities chapter!

⚖️ AI is progressing so rapidly that yearly updates are no longer sufficient.

1/3
AI is evolving too quickly for an annual report to suffice. To help policymakers keep pace, we're introducing the first Key Update to the International AI Safety Report. 🧵⬇️

(1/10)
October 15, 2025 at 12:24 PM
Reposted by Maksym Andriushchenko
A new recording of our FridayTalks@Tübingen series is online!

AI Safety and Alignment
by
@maksym-andr.bsky.social

Watch here: youtu.be/7WRW8MDQ8bk
September 11, 2025 at 1:18 PM
🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨

1/n
August 6, 2025 at 3:43 PM
🚨Excited to release OS-Harm! 🚨

The safety of computer use agents has been largely overlooked.

We created a new safety benchmark based on OSWorld for measuring 3 broad categories of harm:
1. deliberate user misuse,
2. prompt injections,
3. model misbehavior.
June 19, 2025 at 3:28 PM
🚨Excited to share our new work!

1. Not only GPT-4 but also other frontier LLMs have memorized the same set of NYT articles from the lawsuit.

2. Very large models, particularly with >100B parameters, have memorized significantly more.

🧵1/n
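The thread reports the findings but not the measurement itself. As a hedged illustration only (not necessarily the paper's exact protocol), one common style of memorization metric is the length of the longest substring of a model's output that appears verbatim in a source article; a minimal quadratic sketch:

```python
def longest_verbatim_match(generated: str, reference: str) -> int:
    """Length of the longest substring of `generated` that also
    appears verbatim in `reference`.

    Simple quadratic sketch: for each start position, try to extend
    a match one character past the current best. Fine for
    article-length texts; an illustrative stand-in, not the
    paper's actual metric.
    """
    best = 0
    for i in range(len(generated)):
        length = best  # only candidates longer than the current best matter
        while (i + length < len(generated)
               and generated[i:i + length + 1] in reference):
            length += 1
        best = max(best, length)
    return best
```

A higher value for a model's continuation against an NYT article would indicate more verbatim memorization.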
December 9, 2024 at 10:01 PM
📢 I'll be at NeurIPS 🇨🇦 from Tuesday to Sunday!

Let me know if you're also coming and want to meet. Would love to discuss anything related to AI safety/generalization.

Also, I'm on the academic job market, so would be happy to discuss that as well! My application package: andriushchenko.me.

🧵1/4
December 7, 2024 at 7:26 PM
Reposted by Maksym Andriushchenko
Mindblowing: EPFL PhD student @maksym-andr.bsky.social, winner of the best CS thesis award, showed that leading #AI models are not robust to simple adaptive jailbreaking attacks. Indeed, he managed to jailbreak all models with a 100% success rate 🤯

Jailbreaking paper: arxiv.org/abs/2404.02151
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
We show that even the most recent safety-aligned LLMs are not robust to simple adaptive jailbreaking attacks. First, we demonstrate how to successfully leverage access to logprobs for jailbreaking: we...
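The abstract mentions leveraging logprob access for jailbreaking. As a toy illustration only (not the paper's actual attack), the general shape of such an approach is a random-search loop that mutates an adversarial suffix and keeps mutations that raise a scalar score; `toy_score` below is a hypothetical stand-in for the target model's logprob of an affirmative response prefix:

```python
import random

def random_search_suffix(score_fn, suffix_len=10, iters=200, seed=0):
    """Toy random-search loop: mutate one suffix character at a time,
    keeping mutations that do not decrease the score. In the logprob
    setting, score_fn would query the target model; here it is any
    user-supplied scalar function (illustrative sketch only)."""
    rng = random.Random(seed)
    alphabet = "abcdefghijklmnopqrstuvwxyz !"
    suffix = [rng.choice(alphabet) for _ in range(suffix_len)]
    best = score_fn("".join(suffix))
    for _ in range(iters):
        i = rng.randrange(suffix_len)
        old = suffix[i]
        suffix[i] = rng.choice(alphabet)
        score = score_fn("".join(suffix))
        if score >= best:
            best = score
        else:
            suffix[i] = old  # revert a harmful mutation
    return "".join(suffix), best

# Hypothetical scorer: counts occurrences of "ok" as a stand-in
# for a model-derived logprob.
def toy_score(s):
    return s.count("ok")
```

The key property the abstract highlights is adaptivity: the attack loop needs only a scalar signal from the target, which logprob access provides.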
December 6, 2024 at 7:00 AM
really feels like Twitter circa 2018. good old days... 😀
November 20, 2024 at 10:47 AM