Kyle Lo
@kylelo.bsky.social
language model pretraining @ai2.bsky.social, co-lead of data research w/ @soldaini.net, statistics @uw, open science, tabletop, seattle, he/him,🧋 kyleclo.com
why intern at Ai2?

🐟interns own major parts of our model development, sometimes even leading whole projects
🐡we're committed to open science & actively help our interns publish their work

reach out if u wanna build open language models together 🤝

links 👇
November 5, 2025 at 11:11 PM
woah guess VLMs for OCR are the hottest research topic this week 😆 since the first olmOCR, we've been...

🔥training our VLM using RLVR with binary unit test rewards🔥

it's incredibly effective & unit test creation is easy to scale w/ synthetic data pipelines

check it out at olmocr.allen.ai
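(a toy sketch of what a binary unit-test reward could look like; the `Check`/`reward` names and test format are mine for illustration, not the actual olmOCR code)

```python
# rough sketch of a binary unit-test reward for RLVR-style OCR training
# (illustrative only; not the olmOCR implementation)
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    """One verifiable property of the OCR output for a given page."""
    name: str
    passes: Callable[[str], bool]  # takes the model's text output, returns pass/fail

def reward(model_output: str, checks: list[Check]) -> float:
    """Binary reward: 1.0 only if every unit test passes, else 0.0."""
    return 1.0 if all(c.passes(model_output) for c in checks) else 0.0

# example checks for one (synthetic) page: strings that must / must not appear,
# plus a simple ordering constraint
checks = [
    Check("has_title", lambda out: "Results" in out),
    Check("no_hallucinated_header", lambda out: "Lorem ipsum" not in out),
    Check("order", lambda out: out.find("Table 1") < out.find("Table 2")),
]

print(reward("Results ... Table 1 ... Table 2", checks))  # 1.0
```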
October 22, 2025 at 6:02 PM
bye #colm2025 big fan of the montreal bagels 🥯 hot take I like them better than
October 11, 2025 at 6:16 PM
lol so much love for prepost-postpre training
October 9, 2025 at 5:13 PM
any other fans of pre-pretraining?
October 9, 2025 at 2:53 PM
come say hi at the OLMo 2 and fluid benchmarking posters this morning 👋 and don't miss @valentinhofmann.bsky.social's talk in the morning #colm2025 @ai2.bsky.social vry proud of my gifs
October 9, 2025 at 1:14 PM
@josephc.bsky.social @mariaa.bsky.social and I are at poster #21

findings from a large-scale survey of 800 researchers on how they use LMs in their research #colm2025
October 8, 2025 at 8:12 PM
flyin to #colm2025 along w a bunch of the @ai2.bsky.social team

come chat w me about pretraining horror stories, data & evals, what we're cookin for next olmo, etc

made a 🔥 poster for thursday sess, come say hi
October 6, 2025 at 3:20 PM
5 am at the airport for the only direct flight from seattle to montreal #colm2025
October 6, 2025 at 11:56 AM
LM benchmark design requires 3 decisions, how to:
🐟 select test cases
🐠 score LM on each test
🦈 aggregate scores to estimate perf

fluid benchmarking is simple:
🍣 find max informative test cases
🍥 estimate 'ability', not simple avg perf

why care? turn ur grey noisy benchmarks to red ones!
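(for the curious, a toy sketch of the two ingredients, assuming a 2PL item-response-theory model; this is my reconstruction for illustration, not the paper's code)

```python
# toy sketch of fluid benchmarking's two ideas under a 2PL IRT model
# (hedged reconstruction; parameter values and helpers are illustrative)
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL IRT: probability an LM with ability theta answers an item
    with discrimination a and difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    """Fisher information of an item at ability theta; higher = more informative."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def pick_next_item(theta: float, items: list[tuple[float, float]], asked: set[int]) -> int:
    """🍣 pick the not-yet-asked test case that is most informative at the current theta."""
    return max(
        (i for i in range(len(items)) if i not in asked),
        key=lambda i: item_information(theta, *items[i]),
    )

def update_ability(theta: float, responses: list[tuple[float, float, int]],
                   lr: float = 0.1, steps: int = 100) -> float:
    """🍥 estimate 'ability' by gradient ascent on the 2PL log-likelihood
    over (a, b, correct?) responses, instead of averaging raw accuracy."""
    for _ in range(steps):
        grad = sum(a * (y - p_correct(theta, a, b)) for a, b, y in responses)
        theta += lr * grad
    return theta
```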
September 17, 2025 at 6:17 PM
looks like the preprint has been updated to include a disclaimer that this was a class project & intentionally provocatively written 😐
August 20, 2025 at 5:30 PM
only took a few days to descend into madness
July 1, 2025 at 8:12 PM
back from copenhagen & berkeley travels, now moving into new @ai2.bsky.social office!
June 26, 2025 at 3:45 PM
thx for organizing! great to meet NLP folks & consume fancy bread 🥖🍞🥐
June 21, 2025 at 2:32 PM
the benchmark works by running thousands of "unit tests"

so instead of fuzzy matching a model-generated table against a gold reference table,

we define Pass/Fail tests like "the cell to the left of the cell containing 0.001 should contain 1.96"
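(a minimal sketch of how one such check could be implemented, assuming the model output has already been parsed into rows of cells; the helper name is hypothetical, not the benchmark's code)

```python
# hedged sketch of one Pass/Fail table test: the cell to the left of the
# cell containing "0.001" should contain "1.96"
def left_neighbor_test(table: list[list[str]], anchor: str = "0.001", expected: str = "1.96") -> bool:
    for row in table:
        for j, cell in enumerate(row):
            if cell.strip() == anchor:
                return j > 0 and row[j - 1].strip() == expected
    return False  # anchor cell not found -> fail

print(left_neighbor_test([["coef", "1.96", "0.001"]]))  # True (passes)
```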
June 19, 2025 at 1:25 PM
we won honorable mention for Best Paper at #CVPR2025 🏆 for Molmo & Pixmo, showing the value of high-quality data for VLMs!

recalling when we released at the same time as Llama 3.2 😆

huge kudos to Matt Deitke, Chris Clark & Ani Kembhavi for their leadership on this project!

@cvprconference.bsky.social
June 13, 2025 at 5:46 PM
google down, guess ill go smell flowers or sthn 🤷‍♂️
June 12, 2025 at 7:32 PM
excited to see this release of 1M public domain & CC0 books, digitized and OCR'd! 👏 big win for open data, congrats to the authors!

arxiv.org/abs/2506.08300
June 12, 2025 at 12:21 AM
looks like same group got an AI generated paper accepted to ACL 😅 www.intology.ai/blog/zochi-acl
May 29, 2025 at 12:18 AM
hilarious deep research UX w/ the agent trace

it's like "i found relevant content in <journal|conf|arxiv> paper" but the links provided all go to the publisher homepage instead of the actual paper lolol whyy 🤦‍♂️
May 23, 2025 at 7:03 AM
nice article thx for sharing! enjoyed fig about surveying bad baselines
May 23, 2025 at 6:12 AM
esp because of this
May 14, 2025 at 11:37 PM
unfortunately not 😮‍💨 it's disabled for me too; i am wondering if the call was incorrect - the real deadline for making sure all openreview profiles exist was actually the abstract deadline, not the full submission deadline. ill try emailing PCs
May 14, 2025 at 11:36 PM
@neuripsconf.bsky.social

it seems the call for papers neurips.cc/Conferences/... says the author list should be finalized by May 15th, but on OpenReview itself, the author list needs to be finalized by May 11th

can pls clarify, thx! 🙏
May 12, 2025 at 5:08 AM