https://araffin.github.io/
Types of Reinforcement Learning Paper
Original image: @xkcd.com
Workshop on Reinforcement Learning 2026, taking place on February 6, 2026, at the University of Mannheim, Germany.
Participation in the workshop is free of charge!
Check the program and register: www.wim.uni-mannheim.de/doering/conf...
🎬 This is a new, HTML-based submission format for TMLR that supports interactive figures and videos, along with the usual LaTeX and images.
🎉 Thanks to TMLR Editors in Chief: Hugo Larochelle, @gautamkamath.com, Naila Murray, Nihar B. Shah, and Laurent Charlin!
Thanks to Paul Vicol (@paulvicol.bsky.social) for his tireless work on this new option, as well as the OpenReview team.
This may be a warning to a lot of humanoid companies. All your promises don’t matter to the public if your robot looks or acts dumb.
youtu.be/b_SNExtznd4?...
michaelbastos.com/blog/why-sel...
#programming #softwaredevelopment #tech #blog
Lots of progress in RL research over the last 10 years, but it has been too performance-driven => overfitting to benchmarks (like the ALE).
1⃣ Let's advance the science of RL
2⃣ Let's be explicit about how benchmarks map to formalism
1/X
Modern package management for Robotics with Pixi!
prefix.dev/blog/reprod...
#ROS #ROSCon #ROSCon2025
link: www.tylervigen.com/spurious-cor...
found via @stefanjudis.com newsletter
Day 1: www.youtube.com/watch?v=Use5...
Day 2: www.youtube.com/watch?v=rh2o...
Day 3: www.youtube.com/watch?v=9lzF...
In this post, I share tools and habits that help me move quickly from idea to result without sacrificing reliability.
I've been having a lot of fun animating a mini-series about this topic, and the main part is now out.
youtu.be/j0wJBEZdwLs
The 11 LLM archs covered in this video:
1. DeepSeek V3/R1
2. OLMo 2
3. Gemma 3
4. Mistral Small 3.1
5. Llama 4
6. Qwen3
7. SmolLM3
8. Kimi 2
9. GPT-OSS
10. Grok 2.5
11. GLM-4.5/4.6
www.youtube.com/watch?v=rNlU...
I added CNN support for PPO.
It turns out that using a shared features extractor (CNN in this case) is important for achieving good performance on Atari games.
Perf report: wandb.ai/openrlbenchm...
github.com/araffin/sbx
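To make "shared features extractor" concrete, here is a minimal sketch, not the sbx implementation: the policy and value heads both read the same features, which are computed once per observation. A single linear + ReLU layer stands in for the CNN, and every name here (SharedActorCritic, matvec, init_matrix) is hypothetical and chosen only for illustration.

```python
import math
import random


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def init_matrix(rng, n_out, n_in):
    """Small random weights (a real network would use a proper initializer)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


class SharedActorCritic:
    """Toy actor-critic: one shared feature layer feeding two linear heads."""

    def __init__(self, obs_dim, n_features, n_actions, seed=0):
        rng = random.Random(seed)
        # Assumption: this single layer stands in for the CNN from the post.
        self.shared = init_matrix(rng, n_features, obs_dim)
        self.policy_head = init_matrix(rng, n_actions, n_features)
        self.value_head = init_matrix(rng, 1, n_features)

    def features(self, obs):
        # ReLU features, computed once and reused by both heads.
        return [max(0.0, h) for h in matvec(self.shared, obs)]

    def forward(self, obs):
        feats = self.features(obs)
        logits = matvec(self.policy_head, feats)
        # Numerically stable softmax over the action logits.
        max_logit = max(logits)
        exps = [math.exp(l - max_logit) for l in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        value = matvec(self.value_head, feats)[0]
        return probs, value


model = SharedActorCritic(obs_dim=4, n_features=8, n_actions=3)
probs, value = model.forward([0.5, -0.2, 0.1, 0.9])
```

In a setup like this, gradients from both the policy loss and the value loss would update the shared layer, whereas separate extractors would give each head its own copy.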
by Kaizhe Hu et al. (ToddlerBot Stanford)
Project page: robot-trains-robot.github.io
Website: open-hardware-robots.github.io/CoRL2025/
Following the success of the past iterations, we are opening the Call for Blog Posts 2026!
iclr-blogposts.github.io/2026/about/#...
Please retweet!
The plan is to start from tabular Q-learning and work our way up to Deep Q-learning (DQN). In a following post, I will continue on to Soft Actor-Critic (SAC) and its extensions.
araffin.github.io/post/rl102/
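As a rough idea of the starting point, tabular Q-learning can be sketched in a few lines. This is not code from the blog post: the 5-state chain environment, the hyperparameters, and the function name q_learning are all assumptions made for illustration.

```python
import random


def q_learning(n_states=5, n_episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a toy chain: start in state 0, reward 1 on reaching the last state."""
    rng = random.Random(seed)
    n_actions = 2  # 0 = move left, 1 = move right
    q = [[0.0] * n_actions for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(n_episodes):
        state = 0
        for _ in range(100):  # step limit per episode
            # Epsilon-greedy action selection, with ties broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(n_actions)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Hypothetical chain dynamics: left is clipped at state 0.
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0

            # Q-learning update: bootstrap from the best next action unless terminal.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])

            state = next_state
            if done:
                break
    return q


q_table = q_learning()
# Greedy action for each non-terminal state.
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
```

DQN keeps the same update target but replaces the table with a neural network, which is why the tabular version is a useful first step.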
And guess what? It’s not just for C++; Pixi plays nice with Python, Rust, ROS, Mojo, and beyond!
prefix.dev/blog/pixi-b...
permalink: wizardzines.com/comics/bash-...
from our zine "Bite Size Command Line": wizardzines.com/zines/bite-s...