pauljanson002.bsky.social
1/N🧵 Have you been using cosine decay to continually pre-train your foundation models? 💭 Excited to share our new paper, Beyond Cosine Decay, where we explore infinite LR schedulers ♾. Check it out! arxiv.org/abs/2503.02844. #MachineLearning #AI #optimization #continualAI
March 17, 2025 at 8:27 PM