Aneesh Sathe
aneeshsathe.com
@aneeshsathe.com

🧪🧬💻🤖🔬🦠🩺💊🐍
Reposted by Aneesh Sathe
And new paper out: Pleias 1.0: the First Family of Language Models Trained on Fully Open Data

How we train an open everything model on a new pretraining environment with releasable data (Common Corpus) with an open source framework (Nanotron from HuggingFace).

www.sciencedirect.com/science/arti...
September 28, 2025 at 5:13 AM
Reposted by Aneesh Sathe
and it works!

they made a tiny 8B model that holds up well against many large MoE models
September 9, 2025 at 3:47 PM
Thank you! The open data training is an important milestone
September 29, 2025 at 4:00 AM
this collage just generates happiness
September 28, 2025 at 12:19 AM
Farm me
September 8, 2025 at 11:57 PM