Luca Soldaini 🎀
@soldaini.net
I like tokens! Lead for OLMo data at @ai2.bsky.social (Dolma 🍇) w @kylelo.bsky.social. Open source is fun 🤖☕️🍕🏳️‍🌈 Opinions are sampled from my own stochastic parrot

more at https://soldaini.net
yea i was gonna link 🤣 rough guidelines I’ve heard for multilingual are around 600B+, which high level matches yuval’s findings.
October 21, 2025 at 5:50 PM
babyyyyy
June 9, 2025 at 5:31 AM
text classification at scale, works great on 70TB of text
June 9, 2025 at 5:30 AM
scales just fine to 70TB of text, supports subword embedding, someone made rust bindings 😌
June 9, 2025 at 5:29 AM
no reason to switch just because the software is no longer updated. compile from scratch, works great!
June 9, 2025 at 5:28 AM
congratulations!!
June 6, 2025 at 2:11 AM
Reddit also has deals with OpenAI and GDM. Maybe negotiation stalled with Anthropic.
June 6, 2025 at 2:10 AM
they are a joy to type with our loud mechanical keyboards
June 3, 2025 at 6:52 PM
if soldering skills become critical i’m gonna be out of a job soon 😅
May 17, 2025 at 3:50 PM
MANGO SMOOTHIE

don’t forget da smoothie 🤤
April 23, 2025 at 7:58 AM
congrats!!! amazing news 🥰
March 27, 2025 at 3:29 AM