visual computing, 3D vision, spatial AI, machine learning, robot perception.
📍Zurich, Switzerland
I will give three (very different) talks at workshops and tutorials, see info below.
We also present two papers, ACE-G and SCR Priors.
And it's the 10th (!) anniversary of the R6D workshop, which we co-organize.
TTT3R offers a simple state update rule to enhance length generalization for #CUT3R — No fine-tuning required!
🔗Page: rover-xingyu.github.io/TTT3R
We rebuilt @taylorswift13’s "22" live at the 2013 Billboard Music Awards - in 3D!
JUPITER, launched in Germany, is the EU’s most powerful system and fourth fastest worldwide.
100% powered by renewables, it has also ranked first in energy efficiency. It will boost AI, science, and climate research.
Read more - europa.eu/!vcWBqW
The whole idea of an autoencoder is that you complete a round trip and seek cycle consistency—why lay out the network linearly?
www.youtube.com/watch?v=mayo...
I share the frustration. It's disempowering when most major progress recently is downstream of "foundation models" that you don't have the compute or data to train yourself.
- The main 21st century story is US v. China.
- The US thus needs to focus on the Pacific.
- They need to peel Russia off of China and make it an ally.
- If this happens at the cost of the Europeans, so be it.
- Europe is useless as an ally and harmless as an adversary.
SigLIP (VLMs) and DINO are two competing paradigms for image encoders.
My intuition is that joint vision-language modeling works great for semantic problems but may be too coarse for geometry problems like SfM or SLAM.
Most animals navigate 3D space perfectly without language.
We use it a lot already, I recommend it.
(keynote @ Paris by @dlarlus.bsky.social )
But I watch in total awe as it writes, in a few seconds, a well-documented program in a language/API I don't know, while they complain "but it might have a bug and requires a pass or two" O_o
youtu.be/US2gO7UYEfY
www.theverge.com/news/669238/...
cvpr.thecvf.com/Conferences/...