Dimitrije Antić
anticdimi.bsky.social
CV & ML Ph.D. student at @uva.nl | prev. Univ. of Tuebingen, MPI-IS | Teaching machines to perceive humans. | anticdimi.github.io
Reposted by Dimitrije Antić
To bridge this 2D-to-3D gap, we propose "Render-Localize-Lift":
- Render: 3D human/object meshes into multiview 2D images.
- Localize: A Multiview Localization (MV-Loc) model, guided by VLM tokens, predicts 2D contact masks.
- Lift: 2D contact masks to 3D.
(5/10)
June 15, 2025 at 12:23 PM
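A minimal sketch of the "Lift" step only, not the InteractVLM implementation: it assumes the 2D contact masks are already given (the MV-Loc model is not reproduced here), assumes pinhole cameras, and ignores occlusion. The names `project` and `lift_masks`, the `(K, R, t)` camera tuple, and the 0.5 threshold are all hypothetical choices for illustration.

```python
import numpy as np

def project(vertices, K, R, t):
    """Pinhole projection of (N, 3) world points to (N, 2) pixels plus depth."""
    cam = vertices @ R.T + t              # world -> camera coordinates
    z = cam[:, 2]
    z_safe = np.where(z > 0, z, 1.0)      # avoid dividing by non-positive depth
    uv = (cam @ K.T)[:, :2] / z_safe[:, None]
    return uv, z

def lift_masks(vertices, masks, cameras, threshold=0.5):
    """Average per-view 2D mask values at each vertex's projected pixel.

    Assumption: no visibility test, so occluded vertices also receive votes.
    """
    n = len(vertices)
    votes = np.zeros(n)
    counts = np.zeros(n)
    for mask, (K, R, t) in zip(masks, cameras):
        h, w = mask.shape
        uv, z = project(vertices, K, R, t)
        cols = np.round(uv[:, 0]).astype(int)
        rows = np.round(uv[:, 1]).astype(int)
        # Keep vertices in front of the camera that land inside the image.
        inside = (z > 0) & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
        votes[inside] += mask[rows[inside], cols[inside]]
        counts[inside] += 1
    score = np.divide(votes, counts, out=np.zeros(n), where=counts > 0)
    return score >= threshold             # boolean per-vertex contact label

# Toy usage: two vertices, one view, one contact pixel at row 5, col 5.
vertices = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 2.0]])
K = np.array([[10.0, 0.0, 5.0], [0.0, 10.0, 5.0], [0.0, 0.0, 1.0]])
mask = np.zeros((12, 12))
mask[5, 5] = 1.0
contact = lift_masks(vertices, [mask], [(K, np.eye(3), np.zeros(3))])
```

Averaging across views is one simple way to aggregate; a real pipeline would likely need a visibility check so that hidden vertices do not inherit contact from the surface in front of them.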
Reposted by Dimitrije Antić
How can we infer 3D contact with limited 3D data? InteractVLM exploits foundational models: a VLM & a localization model, fine-tuned to reason about contact. Given an image & prompt, the VLM outputs tokens for localization. But these models work in 2D, while contact is 3D. (4/10)
June 15, 2025 at 12:23 PM
Reposted by Dimitrije Antić
Why does 3D human-object reconstruction fail in the wild or get limited to a few object classes? A key missing piece is accurate 3D contact. InteractVLM (#CVPR2025) uses foundational models to infer contact on humans & objects, improving reconstruction from a single image. (1/10)
June 15, 2025 at 12:23 PM
Reposted by Dimitrije Antić
📢 Short deadline extension (24/2) -- One more week left to submit your application!
February 16, 2025 at 10:42 PM
Passionate about Human-centric Computer Vision? 📸🤖
We’re looking for motivated PhD candidates to join our dynamic team! 🚀
January 26, 2025 at 5:54 PM