Lightnews — Scholar-powered news

Maksym Andriushchenko

@maksym-andr.bsky.social

370 followers 270 following 26 posts

Faculty at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems. Leading the AI Safety and Alignment group. PhD from EPFL supported by Google & OpenPhil PhD fellowships.

More details: https://www.andriushchenko.me/

Maksym Andriushchenko

@maksym-andr.bsky.social

🚨 Incredibly excited to share that I'm starting my research group focusing on AI safety and alignment at the ELLIS Institute Tübingen and Max Planck Institute for Intelligent Systems in September 2025! 🚨

1/n

August 6, 2025 at 3:43 PM

Maksym Andriushchenko

@maksym-andr.bsky.social

Main findings based on frontier LLMs:
- They directly comply with _many_ deliberate misuse queries
- They are relatively vulnerable even to _static_ prompt injections
- They occasionally perform unsafe actions

June 19, 2025 at 3:28 PM

Maksym Andriushchenko

@maksym-andr.bsky.social

🚨Excited to release OS-Harm! 🚨

The safety of computer use agents has been largely overlooked.

We created a new safety benchmark based on OSWorld for measuring 3 broad categories of harm:
1. deliberate user misuse,
2. prompt injections,
3. model misbehavior.

June 19, 2025 at 3:28 PM

Maksym Andriushchenko

@maksym-andr.bsky.social

🚨Excited to share our new work!

1. Not only GPT-4 but also other frontier LLMs have memorized the same set of NYT articles from the lawsuit.

2. Very large models, particularly with >100B parameters, have memorized significantly more.

🧵1/n

December 9, 2024 at 10:01 PM
