Siddarth Venkatraman @ NeurIPS 2024
hyperpotatoneo.bsky.social
PhD student at Mila | Diffusion models and reinforcement learning 🧐 | hyperpotatoneo.github.io
Reposted by Siddarth Venkatraman @ NeurIPS 2024
Few fields reward quick pivoting as much as AI or, conversely, punish the very thing a PhD is usually meant to be: sticking with one research direction for 5 years no matter what, going really deep, and becoming a niche expert

for your research to be relevant in AI, you might wanna pivot every 1-2 years
December 22, 2024 at 6:11 AM
Saying o3 is just a “more principled search technique” is quite reductive. The o-series models don’t require “explicit search” strategies such as tree search or sampling wrapped in loops. Instead, RL is used to train the model to “learn to search” using long CoT.
Six months ago someone put a for-loop around GPT-4o and got 50% on the ARC-AGI test set and 72% on a held-out training set redwoodresearch.substack.com/p/getting-50... Just sample 8000 times with beam search.

o3 is probably a more principled search technique...
Getting 50% (SoTA) on ARC-AGI with GPT-4o
You can just draw more samples
redwoodresearch.substack.com
December 22, 2024 at 4:07 AM
Reposted by Siddarth Venkatraman @ NeurIPS 2024
Come check out our NeurIPS poster today! We will be at West Ballroom #7101 from 4:30pm to 7:30pm.

Website: github.com/gfnorg/diffu...
GitHub - GFNOrg/diffusion-samplers
github.com
December 12, 2024 at 8:51 PM
Reposted by Siddarth Venkatraman @ NeurIPS 2024
If you're at NeurIPS, RLC is hosting an RL event from 8 till late at The Pearl on Dec. 11th. Join us, meet all the RL researchers, and spread the word!
December 10, 2024 at 9:55 PM
www.newsweek.com/united-healt...

I have anecdotal evidence from a friend who works at a client company for a popular insurance firm. They are using shitty “AI models” which are basically just CatBoost to mass process claims. They know the models are shit, but that’s also the point. Truly sickening.
A year before CEO shooting, lawsuit alleged UHC used AI to deny coverage
The lawsuit accuses UnitedHealthcare of using artificial intelligence to deny coverage to elderly patients.
www.newsweek.com
December 6, 2024 at 9:01 AM
This app is an interesting social experiment. Assuming Bluesky doesn’t just fizzle out, will hostile social relations like those on Twitter resurface here too? If hostilities do return, will it be because conservatives come to this app, or because of new political tensions within left-leaning communities?
December 1, 2024 at 4:23 AM
As AI researchers, we shouldn’t demonize people outside our space who have a passionate distaste for AI. You have to understand that most of the pro-AI sentiment people see online comes from absolutely vile “AI-bros”, especially on Twitter. We just need to distinguish ourselves as academics.
November 30, 2024 at 2:03 PM
Does anyone have thoughts on which generative models are also the best at learning representations for downstream tasks?

My guess is GANs are a dark horse and the latents carry important abstract features. But we haven’t explored this much since they are hard to train.
November 29, 2024 at 4:02 AM
Two predictions:
1) RNNs will be back for long-sequence modeling (the latent bottleneck is long-term memory), but attention will be used for local short-term context
2) Hierarchical VAEs will be back (with trainable encoders, unlike diffusion models)
November 27, 2024 at 3:32 AM
The new account feed here is so much nicer than “The everything app” 🙂
Hope the hype doesn’t fizzle out and turn this app into another “Threads”
November 24, 2024 at 10:15 PM