Excited to announce our new work: "Large-scale Pre-training for Grounded Video Caption Generation" with Cordelia Schmid & @josef-sivic.bsky.social.
Paper: arxiv.org/abs/2503.10781
Project: ekazakos.github.io/grounded_vid...
Code (coming soon): github.com/ekazakos/grove 1/7
We will release code, models and datasets within the next 2 weeks.
We are also working on a search demo for the proposed datasets with user prompts!
I hope to see you all in Honolulu!
📽️ All 15 talk recordings are now online 🚀! tinyurl.com/4ns5apvd
Catch up on cutting-edge work in #robot perception, action & autonomy - from #SLAM & control to computer vision & large-scale learning!
A sight seen for the first time in recorded history. ⌚
This may be the most serious issue I have ever seen in a peer-review system
However, as pointed out here, we should report the bug to the team immediately, instead of sharing it in public or keeping silent
Very disappointing that some people actively exploited the vulnerability
📍 Copenhagen 🇩🇰
📅 Apply by Feb 1st
🔗 https://employment.ku.dk/faculty/?show=153139
TL;DR: Spiritual successor to CroCo with a simpler multi-view objective and larger scale. Beats DINOv3 and CroCo v2 in RoMa, feedforward reconstruction, and relative pose.
arxiv.org/abs/2511.17309
github.com/davnords/mum
I LOVE THE CZECH REPUBLIC! 🇨🇿
⚡100x Training Throughput
🎯Fast Convergence
🔢Pure Int8 Pretraining of RNN LLMs
Large Behavior Models (LBM) by TRI
Presented by Adrien
toyotaresearchinstitute.github.io/lbm1/
#AcademicSky ⚗️ 🧪
- Latent long/short-term memory
- Continual learning on experience (not datasets)
- Exploration and information gathering
- Counterfactual world models from sensors
- Sensory abstraction facilitating reasoning
- Long-horizon planning
My personal reaction is no. We've made tremendous progress scaling and improving distributional learning & other existing solutions, but not on cracking hard open problems.
DreamCoder-like robot skill learning. Refactoring helps!
PDF: arxiv.org/abs/2406.18746