Shashank Gupta
shashanknlp.bsky.social
Researcher at @allen_ai (Ai2) || Research on NLP, LLMs, Reasoning, Agents, AI4Code, AI4Math || Prev: Microsoft AI, Univ. of Illinois (UIUC), Max Planck (MPI), IIT-Bombay, BITS-Pilani

Web: https://shashankgupta.info/
Reposted by Shashank Gupta
Remember Molmo? The full recipe is finally out!

Training code, data, and everything you need to reproduce our models. Oh, and we have updated our tech report too!

Links in thread 👇
December 9, 2024 at 6:34 PM
Reposted by Shashank Gupta
Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained up to 5T tokens. OLMo 2 outperforms other fully open models and competes with open-weight models like Llama 3.1 8B — As always, we released our data, code, recipes and more 🎁
November 26, 2024 at 8:51 PM
Reposted by Shashank Gupta
Meet Tülu 3, a set of state-of-the-art instruct models with fully open data, eval code, and training algorithms.
We invented new methods for fine-tuning language models with RL and built upon best practices to scale synthetic instruction and preference data.
Demo, GitHub, paper, and models 👇
November 21, 2024 at 5:15 PM