James Michaelov
@jamichaelov.bsky.social
Postdoc at MIT. Research: language, the brain, NLP.

jmichaelov.com
Excited to announce that I’ll be presenting a paper at #NeurIPS this year! Reach out if you’re interested in chatting about LM training dynamics, architectural differences, shortcuts/heuristics, or anything at the CogSci/NLP/AI interface in general! #Neurips2025
November 25, 2025 at 2:27 PM
In the most extreme case, LMs assign sentences such as ‘the car was given a parking ticket by the explorer’ (an unlikely but possible event) a lower probability than ‘the car was given a parking ticket by the brake’ (an animacy-violating event with a semantically related final word) over half of the time. 2/3
June 12, 2025 at 5:54 PM
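The comparison above amounts to scoring whole sentences with a causal language model and checking which one gets the higher probability. A minimal sketch of that idea, using GPT-2 via Hugging Face transformers as a stand-in for the models and stimuli actually used in the paper:

```python
# Sketch only: score two sentences with a small causal LM and compare.
# GPT-2 and these sentences are placeholders; the paper's models and stimuli may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def sentence_log_prob(sentence: str) -> float:
    """Summed token log-probability (natural log) of a sentence under the model."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids, .loss is the mean negative log-likelihood over
        # the predicted tokens, so multiply by their count to recover the sum.
        loss = model(ids, labels=ids).loss
    return -loss.item() * (ids.shape[1] - 1)

possible = "The car was given a parking ticket by the explorer."
violating = "The car was given a parking ticket by the brake."
print(sentence_log_prob(possible) > sentence_log_prob(violating))  # not guaranteed to be True
```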
New paper accepted at ACL Findings! TL;DR: While language models generally assign higher probability to sentences describing possible events than to impossible (animacy-violating) ones, this is not robust for generally unlikely events and is affected by semantic relatedness. 1/3
June 12, 2025 at 5:54 PM
But we also predict words RELATED to the word “full”, such as “half”, more strongly than UNRELATED words like “mild”. Language models again show this effect:
April 2, 2024 at 7:22 PM
But we are also more likely to predict a word RELATED to the described event (mountain biking) like “dirt” than an UNRELATED word like “table”, even though neither makes sense in context. We see the same effect in language models (a lower surprisal indicates a stronger prediction):
April 2, 2024 at 7:22 PM
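Surprisal here is just the negative log-probability of a word given its context, so lower surprisal means a stronger prediction. A minimal sketch of that computation, again assuming GPT-2 and a made-up context sentence rather than the actual experimental items:

```python
# Sketch only: per-word surprisal (in bits) under a causal LM.
# GPT-2 and the context sentence are placeholders, not the experimental stimuli.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def word_surprisal(context: str, word: str) -> float:
    """-log2 P(word | context); assumes the word is a single token with a leading space."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    target_id = tokenizer(" " + word).input_ids[0]
    with torch.no_grad():
        logits = model(ctx_ids).logits[0, -1]          # next-token distribution
    log_probs = torch.log_softmax(logits, dim=-1)
    return -log_probs[target_id].item() / math.log(2)  # nats -> bits

# Neither word makes sense here, but "dirt" is related to the mountain-biking event.
context = "He went mountain biking all afternoon, and when he got home he ate some"
for word in ["dirt", "table"]:
    print(word, round(word_surprisal(context, word), 2))
```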
Looking forward to the final day of EMNLP! Let me know if you want to chat about our Findings paper: “Emergent inabilities? Inverse scaling over the course of pretraining” arxiv.org/abs/2305.14681 #EMNLP #EMNLP2023
December 10, 2023 at 1:07 AM