Daniil Tiapkin
@dtiapkin.bsky.social
PhD student at École polytechnique and Université Paris-Saclay 🇫🇷

Reinforcement Learning enjoyer, sometimes even with human feedback

Ex. student-researcher at Google DeepMind Paris

🌐 https://d-tiapkin.github.io/
6/ Our paper is out: arxiv.org/abs/2502.02671. This work was the result of my internship at Google DeepMind—huge thanks to the team: Daniele Calandriello, Johan Ferret, Sarah Perrin, Nino Vieillard, @ramealexandre.bsky.social, @mblondel.bsky.social!
[Link card] On Teacher Hacking in Language Model Distillation (arxiv.org)
February 7, 2025 at 7:11 PM
5/ Our suggestions to mitigate teacher hacking:
- Use online generations during distillation;
- Train on more diverse prompt datasets;
- Expand the dataset with multiple completions per prompt.
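The three suggestions above are, at heart, choices about how the distillation dataset is built. Here is a minimal sketch with hypothetical stand-in functions (no real LMs; all names and outputs are made up for illustration):

```python
import random

random.seed(0)

# Hypothetical stand-ins for the two LMs (toy samplers, not real models):
def teacher_complete(prompt):
    return prompt + " ->" + random.choice([" yes", " no", " maybe"])

def student_complete(prompt):
    return prompt + " ->" + random.choice([" yes", " no", " maybe"])

# (ii) diverse prompts: a wider prompt set gives the student fewer chances
# to overfit the teacher's quirks on a narrow slice of inputs.
prompts = [f"question {i}" for i in range(100)]

# (iii) multiple completions per prompt, instead of a single fixed target:
offline_data = [(p, teacher_complete(p)) for p in prompts for _ in range(4)]

# (i) online generations: each step, re-sample completions from the CURRENT
# student and let the teacher supervise them, instead of replaying a fixed
# offline dataset of teacher completions.
def online_batch(batch_prompts):
    return [(p, student_complete(p)) for p in batch_prompts]

batch = online_batch(random.sample(prompts, 8))
print(len(offline_data), "offline pairs;", len(batch), "fresh online pairs")
```

The key design distinction is where the completions come from: in the offline setting the student only ever sees a frozen snapshot of teacher behavior, while the online setting keeps teacher feedback tied to what the student currently generates.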
4/ The results? Teacher hacking is real: better approximating the teacher does not always translate into a better approximation of the oracle. Fortunately, we found some strategies to mitigate it.
3/ The key intuition: because the teacher isn’t perfect, distillation optimizes a proxy objective, exactly like RLHF optimizes an imperfect reward. To study this, we built a controlled setup in which an oracle model stands in for the ground truth: the teacher is distilled from the oracle, and the student from the teacher.
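This proxy-vs-golden gap can be illustrated with a tiny toy sketch (categorical distributions with made-up numbers, not the paper's LM setup): distillation shrinks the divergence to the teacher, while the divergence to the oracle is what actually matters.

```python
import math

def kl(p, q):
    """KL(p || q) for categorical distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy distributions over 3 tokens (illustrative numbers only):
oracle  = [0.6, 0.3, 0.1]   # ground truth -- unobservable in practice
teacher = [0.3, 0.5, 0.2]   # imperfect proxy distilled from the oracle
student = [1/3, 1/3, 1/3]   # student starts uniform

# Distillation pulls the student toward the teacher. We track two metrics:
#   proxy  = KL(teacher || student)  -- what distillation optimizes
#   golden = KL(oracle  || student)  -- what we actually care about
# Teacher hacking = the proxy keeps shrinking while the golden metric stalls.
for t in (0.0, 0.5, 1.0):
    s = [(1 - t) * s0 + t * pT for s0, pT in zip(student, teacher)]
    print(f"t={t:.1f}  proxy={kl(teacher, s):.4f}  golden={kl(oracle, s):.4f}")
```

At t=1 the proxy metric is exactly zero (the student matches the teacher), yet the golden metric bottoms out at KL(oracle || teacher): the teacher's own error is the floor the student inherits.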
2/ In our new work from Google DeepMind, “On Teacher Hacking in Language Model Distillation,” we analyze this possible failure mode. It would be critical if real, since distillation is becoming central to the post-training of modern LLMs.
Hope I'm not too late 😅
November 21, 2024 at 8:13 PM