Caio
caiocorro.bsky.social
NLP researcher
Reposted by Caio
What is wild to me is the defense, BY THE NEURIPS BOARD, that fabricated citations do not mean "the content of the papers themselves [is] necessarily invalidated"

It does. It very much does. What do you think citing other work is for? What do you think writing a paper is for? What do you *think*?
January 21, 2026 at 9:34 PM
Reposted by Caio
Today's Journal Officiel: the INRIA researcher recruitment competitions are open, with 6 CRCN positions and 14 DR2 positions.
www.legifrance.gouv.fr/jorf/id/JORF...
www.legifrance.gouv.fr/jorf/id/JORF...
www.legifrance.gouv.fr
January 20, 2026 at 8:04 AM
New paper accepted for publication at EACL 2026: we introduce knapsack approximation deferral (KAD), a framework to build mixture distributions for proxy-based test-time alignment of large language models. Preprint and code are already online! :)
January 8, 2026 at 9:56 AM
I think the worst are the online bank payment systems where you have to type your bank account password (La Poste, for example) even though you are not on a URL tied to the bank in question.
The problem with phishing is that genuine organizations are sometimes so bad that their real emails look like phishing. Today the caisse des écoles "deactivated my account" for no apparent reason and asked me to click on a suspicious link
January 6, 2026 at 4:47 PM
Reposted by Caio
Just in case anyone thought buffer overflows and insecure C code were a recent phenomenon, there is a buffer overflow in the code of the “su” program from Unix v4 (circa 1973, probably written by Ken Thompson): sigma-star.at/blog/2025/12...
Fixing a Buffer Overflow in UNIX v4 Like It's 1973
This blog post shows how to fix a buffer overflow in the su program of UNIX v4
sigma-star.at
January 1, 2026 at 11:58 AM
Reposted by Caio
I am not much into prompt attacks for LLMs; however, this paper has a nice formalism to describe that
"Attacker LLM"

arxiv.org/abs/2512.20806
Safety Alignment of LMs via Non-cooperative Games
Ensuring the safety of language models (LMs) while maintaining their usefulness remains a critical challenge in AI alignment. Current approaches rely on sequential adversarial training: generating adv...
arxiv.org
December 30, 2025 at 12:16 AM
Reposted by Caio
That is pretty obvious to me. If a paper has hallucinations, fake citations or other LLM artifacts, it should be immediately desk rejected.
I'd like to propose the following norm for peer review of papers. If a paper shows clear signs of LLM-generated errors that were not detected by the author, the paper should be immediately rejected. My reasoning: 1/ #ResearchIntegrity
December 28, 2025 at 10:04 AM
Reposted by Caio
When AI pollutes math forums
Specialized sites are facing an influx of "contributions" fed by artificial intelligence.
www.lemonde.fr
December 25, 2025 at 7:06 AM
Reposted by Caio
COLM 2026 is just around the corner! Mark your calendars for:

💡 Abstract deadline: Thursday, March 26, 2026
📄 Full paper submission deadline: Tuesday, March 31, 2026

Call for papers (website coming soon):
docs.google.com/document/d/1...
December 16, 2025 at 3:31 PM
Reposted by Caio
It would be interesting to know how many papers are concerned (desk rejected).
December 4, 2025 at 12:37 AM
Reposted by Caio
📢 Statement from ACL and EACL 2026 Organizers

On Nov 27, OpenReview was notified of a software bug that allowed unauthorized access to authors, reviewers, and area chairs. We are grateful to the OpenReview team for fixing the issue quickly. (🧵 1/3)
openreview.net
November 29, 2025 at 9:29 AM
Reposted by Caio
Statement by OpenReview on X
November 28, 2025 at 12:04 AM
Reposted by Caio
[ #VeilleESR #Parcoursup ] Something extraordinary just happened: the government has achieved one of its objectives.

Namely, lowering the share of new baccalauréat holders who receive an offer on Parcoursup.

This is probably a historic date: the beginning of educational de-massification.
🧵
November 15, 2025 at 4:44 PM
Reposted by Caio
Cool seminar coming up at @inriaparisnlp.bsky.social. If you guys can't make it on site, a video link will be provided (check out the seminar webpage 30 min before the talk)
We are excited to announce our next seminar by Fabian Suchanek (Télécom Paris, Institut Polytechnique de Paris) "On Language Models and Knowledge Bases" on Friday 21st November, 11am CET. Details can be found here: almanach.inria.fr/seminars-en....
November 9, 2025 at 5:26 PM
One of the hardest PyTorch bugs I have had to debug comes from how logsumexp behaves with -inf masked inputs. Consider the following example: I build a vector of 3 logits, and each logit is the result of a logsumexp.
November 4, 2025 at 9:12 AM
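The example attached to the logsumexp post above is not part of this extract. As a rough illustration of the kind of hazard it describes, here is a minimal pure-Python sketch, assuming the usual max-shifted formulation of log-sum-exp. It does not reproduce PyTorch's implementation or its gradient behavior; the function names `naive_logsumexp` and `safe_logsumexp` and the `groups` data are hypothetical. The point is only that a fully masked group leads to `(-inf) - (-inf)`, which is NaN.

```python
import math

NEG_INF = float("-inf")


def naive_logsumexp(xs):
    """Max-shifted log-sum-exp with no guard for fully masked inputs."""
    m = max(xs)
    # If every entry is -inf, then x - m is (-inf) - (-inf), which is NaN.
    return m + math.log(sum(math.exp(x - m) for x in xs))


def safe_logsumexp(xs):
    """Same computation, but returns -inf when every entry is masked."""
    m = max(xs)
    if m == NEG_INF:
        # The sum of exp(-inf) terms is 0, so the log of that sum is -inf.
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


# Hypothetical data: three logits, each the log-sum-exp of one group.
# The second group is fully masked with -inf.
groups = [[0.0, 1.0], [NEG_INF, NEG_INF], [2.0, NEG_INF]]

naive_logits = [naive_logsumexp(g) for g in groups]
safe_logits = [safe_logsumexp(g) for g in groups]

# Outer log-sum-exp over the three guarded logits.
total = safe_logsumexp(safe_logits)

print("naive:", naive_logits)
print("safe: ", safe_logits)
print("total:", total)
```

In this sketch the unguarded version yields NaN for the fully masked group, while the guarded version yields -inf, which then contributes nothing to the outer log-sum-exp. Whether a given framework needs a similar guard, in the forward pass or in the gradient, is something to check against its own documentation and version.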
Reposted by Caio
Mind-boggling. This is unacceptable. I have an ERC grant which, in retrospect, I was lucky to write abroad, with a far better support scheme than those offered in France and ABOVE ALL with the means to produce preliminary results, in terms of both mentoring and resources #esr
When our supervising minister insults us before parliament. "Bunch of losers", "completely out of touch".
Blaming us for low success rates at Horizon Europe and the ERC, when we lack the means to carry out our public service missions. Above all, don't change a thing!👌
October 31, 2025 at 10:29 AM
Reposted by Caio
Universities across the world seeing this:

"It's only wrong 45% of the time!!
Let's buy free licenses for our students, staff and faculty!!
Let's lock into contracts with rapacious predatory AI companies with shitty technofascist politics, sucking up water and jacking up electricity prices!!"
October 23, 2025 at 1:19 PM
Reposted by Caio
Not all scaling laws are nice power laws. This month’s blog post: Zipf’s law in next-token prediction and why Adam (ok, sign descent) scales better to large vocab sizes than gradient descent: francisbach.com/scaling-laws...
September 27, 2025 at 2:57 PM
Reposted by Caio
~9 months ago I spent some time with Luca Soldaini making a list of models and resources for language models that were more than just open weights (data, code, logs, etc included). It's getting out of date, could use some community contributions :)
September 14, 2025 at 5:05 PM
Reposted by Caio
For updates on AI, I increasingly just advise people to pick a discord they like and stick with it. Twitter stopped having interesting science chat ages ago, it’s just companies announcing their products and getting a bunch of meme QTs.
August 23, 2025 at 2:29 PM
Reposted by Caio
This year, EMNLP ended up desk rejecting ~100 papers. For more insight into the process, and potential future changes, please see this blog post from the PCs: 2025.emnlp.org/desk-rejecti...

@christos-c.bsky.social @carolynrose.bsky.social @tanmoy-chak.bsky.social @violetpeng.bsky.social
New Desk Rejection Practice for EMNLP 2025
For some time there has been substantial concern within the community regarding many aspects of reviewing, from poor quality, to too few reviewers in the pool, to poor quality reviews, to reviewers no...
2025.emnlp.org
August 20, 2025 at 4:22 PM
Reposted by Caio
My paper "Tokenization as Finite-State Transduction" was accepted to Computational Linguistics.

This was my final PhD degree requirement :)

The goal was to unify the major tokenization algorithms under a finite-state automaton framework. For example, by encoding a BPE tokenizer as a transducer.
August 15, 2025 at 7:25 AM
Reposted by Caio
#ClubContexte The École Normale Supérieure de Paris-Saclay (not to be confused with the École Normale Supérieure, or "Ulm", which is in Paris and not in Ulm) is the former École Normale Supérieure de Cachan, which has moved: it is located neither in Paris nor in Saclay, but in Gif-sur-Yvette …
August 14, 2025 at 9:18 PM
Reposted by Caio
One like = one realistic sentence spoken by CHAT GPT 5 if it really talked like someone with a PhD.
August 9, 2025 at 8:52 AM
Reposted by Caio
What a fantastic accomplishment -- and what a fantastic story! www.quantamagazine.org/at-17-hannah...
At 17, Hannah Cairo Solved a Major Math Mystery | Quanta Magazine
After finding the homeschooling life confining, the teen petitioned her way into a graduate class at Berkeley, where she ended up disproving a 40-year-old conjecture.
www.quantamagazine.org
August 3, 2025 at 12:13 PM