Anand Bhattad
@anandbhattad.bsky.social
Incoming Assistant Professor at Johns Hopkins University | RAP at Toyota Technological Institute at Chicago | web: https://anandbhattad.github.io/ | Knowledge in Generative Image Models, Intrinsic Images, Image-based Relighting, Inverse Graphics
Thanks, Andreas, and the Scholar Inbox team! This is by far the best paper-recommendation system I've come across. No more digging through overwhelming volumes of papers; as the blog says, the right papers just show up in my inbox.
June 30, 2025 at 2:47 PM
This is probably one of the best talks and slides I have ever seen. I was lucky to see this live! Great talk again :)
June 23, 2025 at 7:24 PM
A special shout-out to all the job-market candidates this year: it’s been tough with interviews canceled and hiring freezes🙏

After UIUC's blue and @tticconnect.bsky.social blue, I’m delighted to add another shade of blue to my journey at Hopkins @jhucompsci.bsky.social. Super excited!!
June 2, 2025 at 7:46 PM
We will be recruiting PhD students, postdocs, and interns. Updates soon on my website: anandbhattad.github.io

Also, feel free to chat with me @cvprconference.bsky.social #CVPR2025

I’m immensely grateful to my mentors, friends, colleagues, and family for their unwavering support.🙏
June 2, 2025 at 7:46 PM
At JHU, I'll be starting a new lab: 3P Vision Group. The “3Ps” are Pixels, Perception & Physics.

The lab will focus on 3 broad themes:

1) GLOW: Generative Learning Of Worlds
2) LUMA: Learning, Understanding, & Modeling of Appearances
3) PULSE: Physical Understanding and Learning of Scene Events
June 2, 2025 at 7:46 PM
[2/2] However, if we treat 3D as a real task, such as building a usable environment, then these projective geometry details matter. It also ties nicely to Ross Girshick’s talk at our RetroCV CVPR workshop last year, which you highlighted.
April 29, 2025 at 4:56 PM
[1/2] Thanks for the great talk and for sharing it online for those who couldn't attend 3DV. I liked your points on our "Shadows Don't Lie" paper. I agree that if the goal is simply to render 3D pixels, then subtle projective geometry errors that are imperceptible to humans are not a major concern.
April 29, 2025 at 4:56 PM
Congratulations and welcome to TTIC! 🥳🎉
April 15, 2025 at 1:03 PM
By “remove,” I meant masking the object and using inpainting to hallucinate what could be there instead.
April 2, 2025 at 5:08 AM
Thanks Noah! Glad you liked it :)
April 2, 2025 at 4:51 AM
[2/2] We also re-run the full pipeline *after each removal*. This matters: new objects can appear, occluded ones can become visible, etc., making the process adaptive and less ambiguous.

The figure above shows a single pass. Once the top bowl is gone, the next "top" bowl gets its own diverse semantics too.
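
Roughly, the adaptive loop looks like the sketch below. This is my paraphrase of the thread, not the released code: segment_objects, removal_score, and inpaint_without are hypothetical placeholders passed in as callables.

```python
from typing import Any, Callable, List

def visual_jenga_loop(
    image: Any,
    segment_objects: Callable[[Any], List[Any]],   # detect/segment objects in the current image
    removal_score: Callable[[Any, Any], float],    # how confidently the scene survives without this object
    inpaint_without: Callable[[Any, Any], Any],    # mask the object and hallucinate what's behind it
    max_steps: int = 20,
) -> List[Any]:
    """Greedy removal order: the full pipeline is re-run on the updated
    image after every removal, so newly visible or newly plausible
    objects are taken into account at each step."""
    removed = []
    for _ in range(max_steps):
        objects = segment_objects(image)  # fresh pass on the current (already edited) image
        if not objects:
            break
        # Remove the object whose absence the inpainter can explain away most confidently.
        target = max(objects, key=lambda obj: removal_score(image, obj))
        image = inpaint_without(image, target)  # counterfactual inpainting step
        removed.append(target)
    return removed
```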
April 2, 2025 at 4:49 AM
[1/2] Not really... there's quite a bit of variation.

When we remove the top bowl, we get diverse semantics: fruits, plants, and other objects that just happen to fit the shape. As we go down, it becomes less diverse: occasional flowers, new bowls in the middle, & finally just bowls at the bottom.
April 2, 2025 at 4:49 AM
[10/10] This project began while I was visiting Berkeley last summer. Huge thanks to Alyosha for the mentorship and to my amazing co-author Konpat Preechakul. We hope this inspires you to think differently about what it means to understand a scene.

🔗 visualjenga.github.io
📄 arxiv.org/abs/2503.21770
Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting
Visual Jenga is a new scene understanding task where the goal is to remove objects one by one from a single image while keeping the rest of the scene stable. We introduce a simple baseline that uses a...
visualjenga.github.io
March 29, 2025 at 7:36 PM
[9/10] Visual Jenga is a call to rethink what scene understanding should mean in 2025 and beyond.

We’re just getting started. There’s still a long way to go before models understand scenes like humans do. Our task is a small, playful, and rigorous step in that direction.
March 29, 2025 at 7:36 PM
[8/10] This simple idea surprisingly scales to a wide range of scenes: from clean setups like a cat on a table or a stack of bowls... to messy, real-world scenes (yes, even Alyosha’s office).
March 29, 2025 at 7:36 PM
[7/10] Why does this work? Because generative models have internalized asymmetries in the visual world.

Search for “cups” → You’ll almost always see a table.
Search for “tables” → You rarely see cups.

So: P(table | cup) ≫ P(cup | table)

We exploit this asymmetry to guide counterfactual inpainting.
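
To make the asymmetry concrete, here is a toy calculation with made-up co-occurrence counts; the actual method measures this with counterfactual inpainting rather than a co-occurrence table.

```python
# Hypothetical counts, purely for illustration.
counts = {
    ("cup", "table"): 90,    # images with a cup that also show a table
    ("cup",): 100,           # images with a cup
    ("table", "cup"): 15,    # images with a table that also show a cup
    ("table",): 300,         # images with a table
}

p_table_given_cup = counts[("cup", "table")] / counts[("cup",)]    # 0.90
p_cup_given_table = counts[("table", "cup")] / counts[("table",)]  # 0.05

# The cup depends on the table far more than the table depends on the cup,
# so the cup is safe to remove first and the table should go last.
assert p_table_given_cup > p_cup_given_table
```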
March 29, 2025 at 7:36 PM