Ivan Rubachev
puhsu.bsky.social
ML Researcher at research.yandex.com | Working on DL for Tabular Data
Explicitly adding induction heads helps. Some gains in NLP, seemingly bigger in RL algorithm distillation arxiv.org/abs/2411.01958
N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs
In-context learning allows models like transformers to adapt to new tasks from a few examples without updating their weights, a desirable trait for reinforcement learning (RL). However, existing in-co...
arxiv.org
December 4, 2024 at 5:58 PM
⚡️
December 3, 2024 at 12:51 PM
Yep, just need to find the code. I can share
November 29, 2024 at 8:23 AM
Yeah. I've experimented a bit with the existing code. It generalized to some of our specific problems in tabular DL (even though the meta-train was mostly from language and vision tasks). Curious what you mean by "actually worked" here? No edge cases or failures, or just easy to use technically?
November 29, 2024 at 7:37 AM
Reposted by Ivan Rubachev
The rejects were horribly misinformed and self-contradictory, but extremely confident. PSGD, SOAP and friends are taking over regardless of academia.
November 28, 2024 at 8:17 PM
November 26, 2024 at 6:42 PM
But keep the numbers in appendix or code pls

So annoying when the only info is in visual form with unclear axes etc. I agree that it’s much better for presentation, but when digging in, I often need raw metrics.
November 24, 2024 at 9:10 AM
…extent of customisability?

If I understand correctly, we can do a lot with custom feeds.

Some examples here github.com/Bossett/bsky...
GitHub - Bossett/bsky-feeds
Contribute to Bossett/bsky-feeds development by creating an account on GitHub.
github.com
November 18, 2024 at 8:19 PM
Wow. Didn’t know we can create custom algorithmic feeds here. This is cool! What are your favourites, what’s the extent of

(context: docs.bsky.app/docs/starter...)
Custom Feeds | Bluesky
Custom feeds, or feed generators, are services that provide custom algorithms to users through the AT Protocol. This allows users to choose their own timelines, whether it's an algorithmic For You pag...
docs.bsky.app
November 18, 2024 at 8:17 PM