Martin Gubri
@mgubri.bsky.social
Research Lead @parameterlab.bsky.social working on Trustworthy AI
Speaking 🇫🇷, English and 🇨🇱 Spanish | Living in Tübingen 🇩🇪 | he/him

https://gubri.eu
BTW you might be interested in our TRAP paper (ACL Findings 2024), where we propose an intrinsic fingerprinting method based on prompt optimization to find unique input-output pairs: bsky.app/profile/mgub...
4/4
🌟 Pleased to join Bluesky! As a first post, allow me to share my latest first-author paper, TRAP 🪤, presented at #ACL24 (findings).

🦹💥 We explore how to detect if an LLM was stolen or leaked🤖💥
We showcase how to use adversarial prompts as a #fingerprint for #LLMs.
A thread 🧵
⬇️⬇️⬇️
October 31, 2025 at 6:31 PM
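For readers curious what "adversarial prompt as fingerprint" means in practice, here is a minimal, hypothetical sketch of the verification step only (the `query_model` helper and all names are placeholders, not the TRAP code): the verifier sends a pre-optimized prompt to the suspect model and checks whether it returns the target answer that the prompt was optimized to elicit on the reference model.

```python
# Hypothetical sketch of fingerprint verification (not the TRAP implementation).
# `query_model` is a placeholder for whatever API call reaches the suspect model.

def verify_fingerprint(query_model, fingerprint_prompt: str, target_answer: str) -> bool:
    """Return True if the suspect model gives the answer the adversarial
    prompt was optimized to elicit on the reference model."""
    answer = query_model(fingerprint_prompt).strip()
    return answer == target_answer

# Usage sketch: a match on several independent fingerprint prompts is strong
# evidence that the suspect model is (derived from) the reference model.
```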
I think that the main difference is that model fingerprinting lets the verifier pick the inputs, while an output fingerprint would make any generated output identifiable. Always happy to exchange thoughts if you're interested :)
3/
October 31, 2025 at 6:31 PM
To me, your method looks like a new category: output fingerprinting (something I've been thinking about for some time). Kind of like watermarking, where you have output (e.g., red/green) and model (e.g., instructional) watermarks.
2/
October 31, 2025 at 6:31 PM
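For context on the "red/green" output watermarks mentioned above, here is a rough, self-contained sketch of the detection side in the style of Kirchenbauer et al. (2023); the hashing scheme and parameters are illustrative assumptions, not the original implementation. Each token is pseudo-randomly assigned to a green list seeded by the previous token, and the detector tests whether the observed green fraction is higher than chance.

```python
import hashlib
import math

GAMMA = 0.25  # assumed fraction of the vocabulary marked "green" at each step

def is_green(prev_token_id: int, token_id: int) -> bool:
    # Pseudo-random green/red assignment seeded by the previous token (illustrative).
    digest = hashlib.sha256(f"{prev_token_id}:{token_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA

def green_z_score(token_ids: list[int]) -> float:
    # One-proportion z-test: how far the observed green count sits above the
    # GAMMA * n expected by chance for unwatermarked text.
    n = len(token_ids) - 1
    green = sum(is_green(p, t) for p, t in zip(token_ids, token_ids[1:]))
    return (green - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))

# A large positive z-score (e.g. > 4) suggests the text was generated with the watermark.
```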
Very nice paper, congrats! I really like it.

I have a few questions:
- Is the ellipse signature robust to noise added to the logits?
- Can we compute the signature if we only have access to the top-k logits?

1/
October 31, 2025 at 6:31 PM
They found the universal intro for all papers:
<insert name> should be correct. But in reality, that is rarely true.
September 11, 2025 at 3:35 PM
Thanks a lot, Guillaume :)
August 21, 2025 at 4:03 PM
I agree that there is a gap in the number of parameters that a high-end device and a cheap one can run. I guess that "common consumer device" means a mid-range one. But I totally agree that they should specify the type of device: a mobile phone is quite different from a desktop computer.
July 22, 2025 at 9:35 AM
My pleasure! Yes, I guess so. I agree that a moving definition can be quite annoying for research. At the same time, I think it is not specific to LMs: a large file, heavy software, etc. What required a lot of resources 15 years ago is probably quite small on today's hardware.
July 22, 2025 at 7:21 AM
There are more details in Appendix A.
July 21, 2025 at 10:27 PM
This NVIDIA position paper has a clear definition of an SLM: arxiv.org/abs/2506.02153
They consider <10B parameters.
Personally, I would not consider 13B models to be SLMs (not even 7B). They require quite a lot of resources unless you use aggressive efficient-inference techniques (like 4-bit quantization).
July 21, 2025 at 10:24 PM
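As a back-of-envelope illustration of why 7B-13B models are heavy without quantization (weights only; this rough sketch ignores KV cache, activations, and framework overhead):

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    # Weights only: ignores KV cache, activations, and framework overhead.
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

for n in (7, 13):
    print(f"{n}B: {weight_memory_gb(n, 16):.1f} GB at fp16, "
          f"{weight_memory_gb(n, 4):.1f} GB at 4-bit")
# 7B: 14.0 GB at fp16, 3.5 GB at 4-bit
# 13B: 26.0 GB at fp16, 6.5 GB at 4-bit
```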
This has been explored quite a lot for the task of jailbreaking an LLM (i.e., adversarial examples against LLM alignment). For example:
- arxiv.org/abs/2310.08419
- arxiv.org/abs/2312.02119
- arxiv.org/abs/2502.01633
July 16, 2025 at 7:12 PM
The authors show that LLMs often give opposite answers when forced to answer vs. when not (e.g., open-ended generation). Similarly, the conclusions are highly unstable across prompt variations.
April 24, 2025 at 3:35 PM
I agree with the need for transparency, especially because the results seem highly dependent on the evaluation details. There is a really nice ACL 2024 paper about this: aclanthology.org/2024.acl-lon...
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Kirk, Hinrich Schuetze, Dirk Hovy. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Vol...
April 24, 2025 at 3:35 PM
Congrats, Michael! 👏🎉
Will you stay in Paris?
November 22, 2024 at 7:54 PM