Lightnews — Scholar-powered news

Kyle Lo

@kylelo.bsky.social

6.6K followers 590 following 510 posts

language model pretraining @ai2.bsky.social, co-lead of data research w/ @soldaini.net, statistics @uw, open science, tabletop, seattle, he/him,🧋 kyleclo.com

Kyle Lo

@kylelo.bsky.social

why intern at Ai2?

🐟interns own major parts of our model development, sometimes even leading whole projects
🐡we're committed to open science & actively help our interns publish their work

reach out if u wanna build open language models together 🤝

links 👇

November 5, 2025 at 11:11 PM

Kyle Lo

@kylelo.bsky.social

woah guess VLMs for OCR the hottest research topic this week😆 since the first olmOCR, we've been..

🔥training our VLM using RLVR with binary unit test rewards🔥

it's incredibly effective & unit test creation easy to scale w synthetic data pipelines

check it out at olmocr.allen.ai

October 22, 2025 at 6:02 PM

Kyle Lo

@kylelo.bsky.social

bye #colm2025 big fan of the montreal bagels 🥯 hot take I like them better than

October 11, 2025 at 6:16 PM

Kyle Lo

@kylelo.bsky.social

lol so much love for prepost-postpre training

October 9, 2025 at 5:13 PM

Kyle Lo

@kylelo.bsky.social

any other fans of pre-pretraining?

October 9, 2025 at 2:53 PM

Kyle Lo

@kylelo.bsky.social

come say hi at posters this morning for OLMo 2 and fluid benchmarking posters 👋 and dont miss @valentinhofmann.bsky.social's talk in morning #colm2025 @ai2.bsky.social vry proud of my gifs

October 9, 2025 at 1:14 PM

Kyle Lo

@kylelo.bsky.social

@josephc.bsky.social @mariaa.bsky.social and I are at poster #21

findings from large scale survey of 800 researchers on how they use LMs in their research #colm2025

October 8, 2025 at 8:12 PM

Kyle Lo

@kylelo.bsky.social

flyin to #colm2025 along w bunch of the @ai2.bsky.social team

come chat w me about pretraining horror stories, data & evals, what we're cookin for next olmo, etc

made a 🔥 poster for thursday sess, come say hi

October 6, 2025 at 3:20 PM

Kyle Lo

@kylelo.bsky.social

5 am airport for the only direct flight from seattle to montreal #colm2025

October 6, 2025 at 11:56 AM

Kyle Lo

@kylelo.bsky.social

LM benchmark design requires 3 decisions, how to:
🐟 select test cases
🐠 score LM on each test
🦈 aggregate scores to estimate perf

fluid benchmarking is simple:
🍣 find max informative test cases
🍥 estimate 'ability', not simple avg perf

why care? turn ur grey noisy benchmarks to red ones!

September 17, 2025 at 6:17 PM
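
A rough sketch of the two fluid benchmarking steps in the post above (pick the most informative test cases, then estimate 'ability' instead of a simple average). The post does not say how either step is done; this sketch assumes a two-parameter item response theory model, and every name and number in it (`p_correct`, `most_informative`, `estimate_ability`, the item parameters) is hypothetical.

```python
import math

# Assumption: each test case is an item with (discrimination a, difficulty b),
# as in a two-parameter logistic IRT model. The post does not specify this.

def p_correct(theta, a, b):
    """Probability that a model with ability theta passes the item."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """How informative the item is about ability near theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def most_informative(theta, items, used):
    """Index of the unused item with the highest information at theta."""
    candidates = [i for i in range(len(items)) if i not in used]
    return max(candidates, key=lambda i: fisher_information(theta, *items[i]))

def estimate_ability(responses, items):
    """Grid-search maximum-likelihood ability from {item index: passed?}."""
    grid = [x / 10.0 for x in range(-40, 41)]

    def log_likelihood(theta):
        total = 0.0
        for i, passed in responses.items():
            p = p_correct(theta, *items[i])
            total += math.log(p if passed else 1.0 - p)
        return total

    return max(grid, key=log_likelihood)

# Hypothetical item bank: (a, b) pairs for easy, medium, and hard test cases.
items = [(1.0, -2.0), (2.0, 0.0), (1.0, 2.0)]

first = most_informative(0.0, items, used=set())
theta_hat = estimate_ability({0: True, 1: True, 2: False}, items)
print(first, theta_hat)
```

With this toy item bank, a model passing the easy and medium items but failing the hard one gets an ability estimate between the medium and hard difficulties, which a plain 2-out-of-3 average would not express.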

Kyle Lo

@kylelo.bsky.social

looks like the preprint has been updated to include a disclaimer that this was a class project & intentionally provocatively written 😐

August 20, 2025 at 5:30 PM

Kyle Lo

@kylelo.bsky.social

⚠️ AI-generated content may be inaccurate. Verify important information independently.

August 8, 2025 at 8:33 PM

Kyle Lo

@kylelo.bsky.social

only took few days to descend into madness

July 1, 2025 at 8:12 PM

Kyle Lo

@kylelo.bsky.social

back from copenhagen & berkeley travels, now moving into new @ai2.bsky.social office!

June 26, 2025 at 3:45 PM

Kyle Lo

@kylelo.bsky.social

thx for organizing! great to meet NLP folks & consume fancy bread 🥖🍞🥐

June 21, 2025 at 2:32 PM

Kyle Lo

@kylelo.bsky.social

the benchmark works based on thousands of "unit tests"

so instead of fuzzy matching between a model-generated table with a gold reference table,

we define Pass/Fail tests like "the cell to the left of the cell containing 0.001 should contain 1.96"

June 19, 2025 at 1:25 PM
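
A minimal sketch of what one such Pass/Fail test could look like, assuming the model output has already been parsed into a list of rows of cell strings. The function name `left_neighbor_test` and the sample table are hypothetical and are not taken from the benchmark's actual code.

```python
from typing import List

Table = List[List[str]]

def left_neighbor_test(table: Table, anchor: str, expected: str) -> bool:
    """Pass if some cell equal to `anchor` has `expected` immediately to its left."""
    for row in table:
        for col, cell in enumerate(row):
            if col > 0 and cell.strip() == anchor and row[col - 1].strip() == expected:
                return True
    return False

# Hypothetical parsed table from a model's OCR output.
table = [
    ["z", "p"],
    ["1.96", "0.001"],
]

passed = left_neighbor_test(table, anchor="0.001", expected="1.96")
print("PASS" if passed else "FAIL")
```

Each test returns a plain boolean, so there is no similarity threshold to tune against a gold reference table.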

Kyle Lo

@kylelo.bsky.social

we won honorable mention for Best Paper at #CVPR2025 🏆 for Molmo & Pixmo, showing the value of high-quality data for VLMs!

recalling when we released same time as Llama 3.2 😆

huge kudos to Matt Deitke, Chris Clark & Ani Kembhavi for their leadership on this project!

@cvprconference.bsky.social

June 13, 2025 at 5:46 PM

Kyle Lo

@kylelo.bsky.social

google down, guess ill go smell flowers or sthn 🤷‍♂️

June 12, 2025 at 7:32 PM

Kyle Lo

@kylelo.bsky.social

excited to see this release of 1M public domain & CC zero books, digitized and OCR'd! 👏 big win for open data, congrats to the authors!

arxiv.org/abs/2506.08300

June 12, 2025 at 12:21 AM

Kyle Lo

@kylelo.bsky.social

looks like same group got an AI generated paper accepted to ACL 😅 www.intology.ai/blog/zochi-acl

May 29, 2025 at 12:18 AM

Kyle Lo

@kylelo.bsky.social

hilarious deep research UX w/ the agent trace

it's like "i found relevant content in <journal|conf|arxiv> paper" but the links provided all go to the publisher homepage instead of the actual paper lolol whyy 🤦‍♂️

May 23, 2025 at 7:03 AM

Kyle Lo

@kylelo.bsky.social

nice article thx for sharing! enjoyed fig about surveying bad baselines

May 23, 2025 at 6:12 AM

Kyle Lo

@kylelo.bsky.social

esp because of this

May 14, 2025 at 11:37 PM

Kyle Lo

@kylelo.bsky.social

unfortunately not 😮‍💨 it's disabled for me too; i am wondering if the call was incorrect - the real deadline for making sure all openreview profiles exist was actually the abstract deadline, not full submission deadline. ill try emailing PCs

May 14, 2025 at 11:36 PM

Kyle Lo

@kylelo.bsky.social

@neuripsconf.bsky.social

it seems the call for papers neurips.cc/Conferences/... says author list should be finalized by May 15th, but on OpenReview itself, author list needs to be finalized by May 11th

can pls clarify, thx! 🙏

May 12, 2025 at 5:08 AM
