Tom Dupuis
@tomdupuis.bsky.social
PhD student @ENSTAParis🇫🇷, TD Learning and deep RLing, representation matters
MVA/CentraleSupélec alumni
Reposted by Tom Dupuis
we've used Atari games as an RL benchmark for so long, but for a little while it's bugged me that it's a discrete action problem, since the original joysticks were analog...
@jessefarebro.bsky.social & i fix this by introducing the Continuous ALE (CALE)!
read thread for details!
1/9
December 5, 2024 at 11:23 PM
Reposted by Tom Dupuis
Have you ever wondered how to train an autoregressive generative transformer on text and raw pixels, without a pretrained visual tokenizer (e.g. VQ-VAE)?

We have been pondering this during summer and developed a new model: JetFormer 🌊🤖

arxiv.org/abs/2411.19722

A thread 👇

1/
December 2, 2024 at 4:41 PM
Excellent blog post, it summarizes quite well the different recent findings in deep RL and where we are now
+ I fully agree with the conclusion: RL needs to become an *empirical science* and not a benchmark-maximizing rat race where methods get exponentially more complex to squeeze out more reward points
Last year I gave a talk titled "From 'Bigger, Better, Faster' to 'Smaller, Sparser, Stranger'", which looked at the components that make up our BBF agent (arxiv.org/abs/2305.19452), highlighting some promising areas of research.

Finally in blog form, have a read!
psc-g.github.io/posts/resear...
November 28, 2024 at 2:14 AM
HF being portrayed as evil/bad was definitely not on my radar, what a shitshow
November 27, 2024 at 10:15 PM
Reposted by Tom Dupuis
I'll die on this hill: decision-making cognitive processes--artificial or not--are and will always be messy, and any subjective impression of formal exactitude is an illusory narrative built afterwards.

Which doesn't mean that narrative is not tremendously important.
November 27, 2024 at 4:48 AM
Reposted by Tom Dupuis
I'm disheartened by how toxic and violent some responses were here.

There was a mistake, a quick follow-up to mitigate it, and an apology. I worked with Daniel for years and he is one of the people most concerned with the ethical implications of AI. Some replies are Reddit-level toxic. We need empathy.
I've removed the Bluesky data from the repo. While I wanted to support tool development for the platform, I recognize this approach violated principles of transparency and consent in data collection. I apologize for this mistake.
First dataset for the new @huggingface.bsky.social @bsky.app community organisation: one-million-bluesky-posts 🦋

📊 1M public posts from Bluesky's firehose API
🔍 Includes text, metadata, and language predictions
🔬 Perfect for experimenting with ML on Bluesky 🤗

huggingface.co/datasets/blu...
November 27, 2024 at 11:09 AM
*taps the sign*
works for general robot policies too
November 27, 2024 at 2:51 PM
This is a striking example of the consequences of overselling in robot learning right now. Too many flashy results on internal demos, but since reproducible real-world benchmarking in robotics is close to impossible, most papers actually overfit to their setup (hardware + task), still in 2024
The current one at all the conferences and back channels is just "OpenVLA and Octo don't actually work"
November 27, 2024 at 2:20 PM
Bought the M4 Mac Mini for personal projects a few weeks ago on a small budget.
I'm still in shock at the performance/price ratio. It stays cool and dead silent as well. ARM arch ftw
Also, is anyone using MLX a lot? Curious about its current state
November 26, 2024 at 1:44 PM
Bear with me: a prediction market on scientific research, by academics, for academics. Add a credibility/confidence axis so that non-experts can still give opinions. Add discussion threads below the poll/voting metrics.

An ultra-high-quality vibe/consensus check on many science fields, centralized.
November 25, 2024 at 5:13 PM
Reposted by Tom Dupuis
I think when it comes to the field of Deep Learning/AI, the bigger problem is the immense number of papers that don't replicate [1], don't transfer [2], are badly specified [3], or have other methodological problems [4, 5, 6]
November 24, 2024 at 9:14 PM
Soooo....... Rich Sutton's team just leapfrogged the whole field with this banger, solving RL with pure streaming data:
arxiv.org/abs/2410.14606
November 25, 2024 at 2:31 PM
1200+ papers in my Zotero....
November 23, 2024 at 1:58 AM
I'm getting 200% convinced that the current roadblocks in RL and robot learning could be resolved purely by a breakthrough in continual learning.... Imagine being able to learn in real time from live video and sensorimotor data feeds at 50+ Hz. Sometimes I think of this paper:
arxiv.org/abs/2312.00598
Learning from One Continuous Video Stream
We introduce a framework for online learning from a single continuous video stream -- the way people and animals learn, without mini-batches, data augmentation or shuffling. This poses great challenge...
arxiv.org
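A minimal sketch of what per-frame online learning from a stream could look like, assuming a PyTorch-style setup; the model, the next-frame-prediction objective, and the stream_frames generator are illustrative placeholders, not the method of the paper above:

```python
# Toy sketch: one gradient step per incoming frame, no mini-batches,
# no replay buffer, no shuffling. Purely illustrative.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 512),
    nn.ReLU(),
    nn.Linear(512, 3 * 64 * 64),
)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

def stream_frames(n_frames=1000):
    """Stand-in for a live 50+ Hz video/sensor feed (random frames here)."""
    for _ in range(n_frames):
        yield torch.rand(1, 3, 64, 64)

prev = None
for frame in stream_frames():
    if prev is not None:
        # Predict the current frame from the previous one, update
        # immediately on this single sample, then move on; no batching.
        pred = model(prev).view_as(frame)
        loss = loss_fn(pred, frame)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    prev = frame
```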
November 23, 2024 at 1:55 AM
The sweet Christmas-y feeling of waking up with snow outside, while the cluster rebooted in the night and all of the GPU nodes are suddenly free to grab ☃️🎁
November 21, 2024 at 10:47 AM
I was opening literally hundreds of papers from my reading list to sort them out and put them in my Zotero to push the SoTA part of the manuscript, and this happened. Feeling kinda proud tbh...
November 19, 2024 at 5:03 PM
BSKY academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren't working on but keep thinking about

1. Reward-less offline RL
2. Emergent vision from interaction (experience shapes perception, going beyond offline supervised/SSL)
BSKY academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren't working on but keep thinking about

1. High-dim planning
2. What are the minimal requirements for two agents to recognize each other as acting on behalf of reasons.
Bluesky academics, let's get to know each other! Quote this & tell me: 1) a project you are working on & 2) an odd idea/theory you aren't working on but keep thinking about

1. I came to hate my work and thinking so don't do it anymore.
2.
November 18, 2024 at 11:31 PM
Reposted by Tom Dupuis
The RL (and some non-RL folks) starter pack is almost full. Pretty clear that the academic move here has succeeded
go.bsky.app/3WPHcHg
November 18, 2024 at 8:30 PM
Starting this account with a mysterious loss plot.
November 18, 2024 at 10:17 PM