Robin Jia
@robinjia.bsky.social
Assistant Professor in Computer Science at USC | NLP, ML
Hubble is finally out! We used 200k GPU hours from NAIRR and NVIDIA to build a comprehensive resource for the scientific study of LLM memorization. Fully open-source models & data up to 8B params + 500B tokens with controlled data insertion to study memorization risks 🔭✨
Announcing 🔭Hubble, a suite of open-source LLMs to advance the study of memorization!

Pretrained 1B/8B param models, with controlled insertion of texts designed to emulate key memorization risks: copyright (e.g., book passages), privacy (e.g., synthetic biographies), and test set contamination
October 24, 2025 at 6:36 PM
Reposted by Robin Jia
I had a lot of fun contemplating memorization questions at the @l2m2workshop.bsky.social panel yesterday, together with Niloofar Mireshghallah and Reza Shokri, moderated by @pietrolesci.bsky.social, who did a fantastic job!
#ACL2025
August 2, 2025 at 3:04 PM
Automatic metrics for assessing factuality are easy to run and commonly used, but do they work? In < 1 hour, come find the answer at poster 349 in Hall X4, where I'll be presenting @ameyagodbole.bsky.social's work uncovering inconsistencies, errors, and biases of factuality metrics!
July 30, 2025 at 8:15 AM
I'll be at ACL 2025 next week, where my group has papers on evaluating evaluation metrics, watermarking training data, and mechanistic interpretability. I'll also be co-organizing the first Workshop on LLM Memorization @l2m2workshop.bsky.social on Friday. Hope to see lots of folks there!
July 25, 2025 at 4:36 PM
Reposted by Robin Jia
Come by @naaclmeeting.bsky.social Poster 6 in Hall 3 from 4-5:30pm today to see @billzhu.bsky.social's and Ishika Singh's work with me and @robinjia.bsky.social on PSALM: autonomously inducing symbolic pre- and post-conditions of actions with LLMs, symbolic planning, and text environment interaction!
May 1, 2025 at 5:39 PM
Check out @billzhu.bsky.social's excellent work on combining LLMs with symbolic planners at NAACL on Thursday! I will also be at NAACL Friday-Sunday, looking forward to chatting about LLM memorization, interpretability, evaluation, and more.
At @naaclmeeting.bsky.social this week! I’ll be presenting our work on LLM domain induction with @thomason.bsky.social on Thu (5/1) at 4pm in Hall 3, Section I.

Would love to connect and chat about LLM planning, reasoning, AI4Science, multimodal stuff, or anything else. Feel free to DM!
April 30, 2025 at 7:46 PM
Reposted by Robin Jia
Excited to share that my internship work at Meta GenAI was accepted to @iclr-conf.bsky.social #ICLR2025

Introducing TLDR: Token-Level Detective Reward Model For Large Vision Language Models.

TLDR provides fine-grained annotations for each text token.

🔗arXiv: arxiv.org/abs/2410.04734
February 8, 2025 at 5:29 AM
Our workshop on LLM Memorization is coming to ACL 2025! The call for papers is out; please submit both archival and non-archival (work-in-progress or already-published) papers.
📢 The First Workshop on Large Language Model Memorization (L2M2) will be co-located with
@aclmeeting.bsky.social in Vienna 🎉

💡 L2M2 brings together researchers to explore memorization from multiple angles. Whether it's text-only LLMs or vision-language models, we want to hear from you! 🌍
January 27, 2025 at 11:23 PM
I'll be at #NeurIPS2024! My group has papers analyzing how LLMs use Fourier features for arithmetic and how Transformers learn higher-order optimization for in-context learning (led by @deqing.bsky.social), plus workshop papers on backdoor detection and LLMs + PDDL (led by @billzhu.bsky.social)
December 9, 2024 at 10:21 PM
Reposted by Robin Jia
A starter pack for #NLP #NLProc researchers! 🎉

go.bsky.app/SngwGeS
November 4, 2024 at 10:01 AM
Reposted by Robin Jia
USC NLP folks are on Bluesky!
Follow my amazing colleagues here

go.bsky.app/KUwSZ6W
November 12, 2024 at 5:44 PM
Reposted by Robin Jia
Started a SoCal AI/ML/NLP researchers starter pack! It's a bit sparse right now, and perhaps more NLP-heavy, but hey, nominate yourself and others! go.bsky.app/6QckPj9
November 19, 2024 at 3:28 PM