youtu.be/1efVS4DeEOs
Links:
- Paper: arxiv.org/abs/2412.16339
- Blog: openai.com/index/delibe...
Thank you to everyone I got to talk to, especially at the poster sessions.
And thanks to the organizers for picking a beautiful location (the video is from a nearby hike with Vikrant)
www.youtube.com/watch?v=MBGI...
NeurIPS 2024 poster presentation
By @vishaalurao.bsky.social
youtu.be/YNZ23YPasXo
DM if you're interested in meeting up for a chat (or a jog).
arxiv.org/abs/2411.18674
Smol models are all the rage these days & knowledge distillation (KD) is key for model compression!
We show how data curation can effectively distill to yield SoTA FLOP-efficient {C/Sig}LIPs!!
🧵👇
The first open-source, worldwide training run of a 10B model. The underlying distributed ML algorithm is DiLoCo (arxiv.org/abs/2311.08105), but they also built tons of engineering on top of it to make it scalable.
LLMs are closing the gap to humans
Details: metr.org/AI_R_D_Evalu...