Anton Obukhov
obukhov.ai
Research Scientist in Computer Vision and Generative AI
Pinned
Big Marigold update!
Last year, we showed how to turn Stable Diffusion 2 into a SOTA depth estimator with a few synthetic samples and 2–3 days on just 1 GPU.
Today's release features:
🏎️ 1-step inference
🔢 New modalities
🫣 High resolution
🧨 Diffusers support
🕹️ New demos
🧶👇
May 15, 2025 at 4:23 PM
Reposted by Anton Obukhov
🍸🍸The TRICKY25 challenge: "Monocular Depth from Images of Specular and Transparent Surfaces" is live! 🍸🍸 Hosted at the 3rd TRICKY workshop #ICCV2025, with exciting speakers! @obukhov.ai @taiyasaki.bsky.social

Site: sites.google.com/view/iccv25t...
Codalab: codalab.lisn.upsaclay.fr/competitions...
May 14, 2025 at 9:18 AM
Huawei Research Center Zürich is looking for a Research Scientist intern to work with me on advancing foundation models for computer vision, focusing on enhancing computational photography features in mobile phones. ˙✧˖°📸⋆。˚

careers.huaweirc.ch/jobs/5702605...
Research Intern - Foundation Models for Computer Vision - Huawei Research Center Zürich
If you are enthusiastic about shaping Huawei’s European Research Institute together with a multicultural team of leading researchers, this is the right opportunity for you!
careers.huaweirc.ch
March 23, 2025 at 2:59 PM
Look at them stripes! A principled super-resolution drop by colleagues from PRS-ETH! Interactive demo built with gradio-dualvision linked down in the post
We present Thera🔥: The new SOTA arbitrary-scale super-resolution method with built-in anti-aliasing. Our approach introduces Neural Heat Fields, which guarantee exact Gaussian filtering at any scale, enabling continuous image reconstruction without extra computational cost.
March 14, 2025 at 2:31 PM
RollingDepth rolls into Nashville for #CVPR2025! 🎸
February 28, 2025 at 10:26 AM
MDEC Challenge update! The 4th Monocular Depth Estimation Workshop at #CVPR2025 will be accepting submissions in two phases:
🚀 Dev phase: Feb 1 - Mar 1
🎯 Final phase: Mar 1 - Mar 21
Website: jspenmar.github.io/MDEC/
🌐 Codalab: codalab.lisn.upsaclay.fr/competitions...

Bring your best depth!
February 4, 2025 at 3:57 PM
Update about the 4th Monocular Depth Estimation Workshop at #CVPR2025:
🎉 Website is LIVE: jspenmar.github.io/MDEC/
🎉 Keynotes: Peter Wonka, Yiyi Liao, and Konrad Schindler
🎉 Challenge updates: new prediction types, baselines & metrics
January 31, 2025 at 7:23 PM
What's the next frontier after LLMs that will demand nuclear-powered GPU clusters? No agents or AGI, please
January 25, 2025 at 2:11 PM
The 4th Monocular Depth Estimation Challenge (MDEC) is coming to #CVPR2025, and I’m excited to join the org team! After 2024’s breakthroughs in monodepth driven by generative model advances in transformers and diffusion, this year's focus is on OOD generalization and evaluation.
December 21, 2024 at 3:52 PM
Reposted by Anton Obukhov
Monocular depth meets depth completion🚀 Check out our latest work, where we turned Marigold into a zero-shot depth completion tool. Everything without retraining🌼 (For once, this paper contains geese instead of cats😄 keep an eye out!)
December 19, 2024 at 12:14 PM
Introducing ⇆ Marigold-DC — our training-free zero-shot approach to monocular Depth Completion with guided diffusion! If you have ever wondered how else a long denoising diffusion schedule can be useful, we have an answer for you! Details 🧵
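The core idea of guided diffusion here is that each of the many denoising steps is an opportunity to nudge the estimate toward sparse depth measurements. Below is a toy numpy sketch of that general pattern only: the "denoiser" is a hypothetical smoothing stand-in, not the actual Marigold model, and the projection-style guidance is a simplification of what the paper does.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scene: a smooth ground-truth depth map and a few sparse measurements.
H, W = 16, 16
yy, xx = np.mgrid[0:H, 0:W]
gt = 1.0 + 0.1 * yy + 0.05 * xx           # smooth "depth" ramp
mask = rng.random((H, W)) < 0.15          # sparse depth anchors (~15% of pixels)
sparse = np.where(mask, gt, 0.0)

def denoise_step(x):
    # Stand-in for one diffusion denoising step: a local smoothing prior.
    # The real method would run the Marigold latent-diffusion denoiser here.
    pad = np.pad(x, 1, mode="edge")
    return (pad[:-2, 1:-1] + pad[2:, 1:-1] +
            pad[1:-1, :-2] + pad[1:-1, 2:] + pad[1:-1, 1:-1]) / 5.0

x = rng.normal(loc=2.0, scale=1.0, size=(H, W))   # start from noise
for _ in range(300):                      # long schedule: many chances to guide
    x = denoise_step(x)
    # Guidance: pull the current estimate onto the sparse measurements.
    x = np.where(mask, sparse, x)

anchor_err = np.abs(x - gt)[mask].max()   # zero: anchors are projected exactly
dense_err = np.abs(x - gt).mean()         # the prior fills in between anchors
```

The point of the toy: the denoiser's prior propagates information outward from the guided pixels, so the final dense map agrees with the sparse input without any retraining.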
December 19, 2024 at 1:52 AM
Reposted by Anton Obukhov
The most recent fads are "the mind is a like a computer" and now "the mind is like an LLM." 🤷‍♂️
December 17, 2024 at 2:44 PM
Reposted by Anton Obukhov
Introducing 👀Stereo4D👀

A method for mining 4D from internet stereo videos. It enables large-scale, high-quality, dynamic, *metric* 3D reconstructions, with camera poses and long-term 3D motion trajectories.

We used Stereo4D to make a dataset of over 100k real-world 4D scenes.
December 13, 2024 at 3:13 AM
Reposted by Anton Obukhov
3D illusions are fascinating! 🤩

But it takes exceptional artistic skills to make one.

We present Illusion3D - a simple method for creating 3D multiview illusions, where the interpretations change depending on your perspectives.

Let's play Where's Waldo, shall we? 😆
December 13, 2024 at 4:35 AM
Interesting! Switzerland continues to build its own Silicon Valley www.wired.com/story/openai...
OpenAI Poaches 3 Top Engineers From DeepMind
The new hires, all experts in computer vision, are the latest AI researchers to jump to a direct competitor in an intensely competitive talent market.
www.wired.com
December 4, 2024 at 6:40 AM
Introducing 🛹 RollingDepth 🛹 — a universal monocular depth estimator for arbitrarily long videos! Our paper, “Video Depth without Video Models,” delivers exactly that, setting new standards in temporal consistency. Check out more details in the thread 🧵
December 2, 2024 at 7:59 AM
Reposted by Anton Obukhov
The iPhone LiDAR depth is pretty stable but low-res and low detail. Monodepth is highly detailed but unstable ("depth flickering").

For more stable and detailed metric depth, I solve for the per-frame affine transform that optimally "anchors" the monodepth to the LiDAR.
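The per-frame affine fit described above is a small least-squares problem: find a, b such that a*mono + b best matches the LiDAR depth at valid pixels. A minimal numpy sketch (function name and synthetic data are my own, for illustration):

```python
import numpy as np

def anchor_monodepth(mono, lidar, valid):
    """Fit a per-frame affine map a*mono + b ≈ lidar over valid LiDAR pixels,
    then apply it to the full monodepth map."""
    m = mono[valid].ravel()
    l = lidar[valid].ravel()
    A = np.stack([m, np.ones_like(m)], axis=1)   # columns: [mono, 1]
    (a, b), *_ = np.linalg.lstsq(A, l, rcond=None)
    return a * mono + b

# Synthetic check: monodepth is an affine-distorted copy of the true depth,
# observed by sparse low-res "LiDAR" samples.
rng = np.random.default_rng(0)
true_depth = rng.uniform(0.5, 10.0, size=(48, 64))
mono = 0.25 * true_depth + 1.3                   # affine-ambiguous relative depth
valid = rng.random(true_depth.shape) < 0.05      # ~5% sparse LiDAR coverage
anchored = anchor_monodepth(mono, true_depth, valid)
```

Because the fit is recomputed every frame against the stable LiDAR signal, the flicker in the monodepth's unknown scale and shift is absorbed by a and b, while the monodepth's fine detail is preserved.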

youtube.com/shorts/u3OVj...
Anchoring monodepth to LiDAR depth | Pedestrian walk
YouTube video by Chris Offner
youtube.com
November 27, 2024 at 9:59 PM
🥁🥁🥁
@vincentleroy.bsky.social @chrisoffner3d.bsky.social
Soon, there might (or might not) be a Marigold framework that solves this problem👀
November 28, 2024 at 10:20 AM
Reposted by Anton Obukhov
Good morning!
Time to have a coffee ☕️ and update all the AI starter packs 🥹
November 23, 2024 at 2:04 PM
Can confirm, thanks for adding me!
My list is now FULL, 150/150. So many AWESOME people still joining this place!
November 23, 2024 at 2:07 PM
Check out my GenAI starter pack! go.bsky.app/BT1bRvZ
November 23, 2024 at 10:45 AM
Hello, science twitter!
November 22, 2024 at 11:33 PM