After UIUC's blue and @tticconnect.bsky.social blue, I’m delighted to add another shade of blue to my journey at Hopkins @jhucompsci.bsky.social. Super excited!!
When we remove the top bowl, we get diverse semantics: fruits, plants, and other objects that just happen to fit the shape. As we go down, it becomes less diverse: occasional flowers, new bowls in the middle, & finally just bowls at the bottom.
Search for “cups” → You’ll almost always see a table.
Search for “tables” → You rarely see cups.
So: P(table | cup) ≫ P(cup | table)
We exploit this asymmetry to guide counterfactual inpainting
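For intuition, here is a toy sketch (not from the paper) of estimating that co-occurrence asymmetry from per-image object lists; the `detections` and object names are made up for illustration.

```python
# Toy sketch: estimating the conditional co-occurrence asymmetry
# P(table | cup) vs. P(cup | table) from per-image object lists.
# The `detections` below are made-up examples, not the paper's data.
from collections import Counter

detections = [
    {"cup", "table", "chair"},
    {"cup", "table"},
    {"table", "lamp"},
    {"table", "sofa"},
    {"cup", "table", "plate"},
]

counts = Counter()        # how many images contain each object
pair_counts = Counter()   # how many images contain both objects of a pair

for objs in detections:
    for a in objs:
        counts[a] += 1
        for b in objs:
            if a != b:
                pair_counts[(a, b)] += 1

def p_given(b, a):
    """Fraction of images containing `a` that also contain `b`."""
    return pair_counts[(a, b)] / counts[a] if counts[a] else 0.0

print(p_given("table", "cup"))  # high: cups almost always appear with tables
print(p_given("cup", "table"))  # lower: tables often appear without cups
```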
Models today can label pixels and detect objects with high accuracy. But does that mean they truly understand scenes?
Super excited to share our new paper and a new task in computer vision: Visual Jenga!
📄 arxiv.org/abs/2503.21770
🔗 visualjenga.github.io
Xiaoyan, who’s been working with me on relighting, sent this over. It’s one of the hardest examples we’ve consistently used to stress-test LumiNet: luminet-relight.github.io
Check out our #3DV2025 UrbanIR paper, led by @chih-hao.bsky.social, which does exactly this.
🔗: urbaninverserendering.github.io
w/ Mathieu Garon and @jflalonde.bsky.social
Train a diffusion model as a renderer that takes intrinsic images as input. Once trained, we can perform zero-shot object compositing & can easily extend this to other object editing tasks, such as material swapping.
arxiv.org/abs/2410.08168
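For a rough picture of the idea, here is a minimal PyTorch sketch of conditioning a denoiser on intrinsic maps by channel concatenation; the toy network, the 7-channel intrinsic stack (albedo + normals + depth), and the shapes are assumptions, not the paper's architecture.

```python
# Minimal sketch (not the paper's architecture): a denoiser that renders an
# image from intrinsic maps by taking them as extra input channels.
import torch
import torch.nn as nn

class IntrinsicConditionedDenoiser(nn.Module):
    def __init__(self, img_ch=3, intrinsic_ch=7, hidden=64):
        # intrinsic_ch: e.g. albedo (3) + normals (3) + depth (1); an assumption.
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(img_ch + intrinsic_ch, hidden, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, img_ch, 3, padding=1),
        )

    def forward(self, noisy_img, intrinsics, t):
        # A real denoiser would also embed the timestep t; omitted for brevity.
        x = torch.cat([noisy_img, intrinsics], dim=1)
        return self.net(x)  # predicted noise (or x0), depending on the loss

# Zero-shot compositing idea: edit the intrinsic maps (e.g. paste an object's
# albedo/normals/depth into the scene's maps) and let the renderer re-render.
model = IntrinsicConditionedDenoiser()
noisy = torch.randn(1, 3, 64, 64)
intrinsics = torch.randn(1, 7, 64, 64)
out = model(noisy, intrinsics, t=torch.tensor([10]))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```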
sites.google.com/view/standou...
Here's what "Standing Out" means to us:
"A forum to reflect on personal growth & unique identity in a rapidly evolving field."
The goal is to help researchers find their authentic voice while building a more vibrant, inclusive CV community.
⚠️ Important: When registering for #CVPR2025, please check the box for our workshop in the interest form!
@cvprconference.bsky.social
We're excited to announce "How to Stand Out in the Crowd?" at #CVPR2025 Nashville - our 4th community-building workshop featuring this incredible speaker lineup!
🔗 sites.google.com/view/standou...
We discovered an easy way to extract albedo without any albedo-like images. By questioning why intrinsic image representations must be 3-channel maps, we trained a simple relighting model where intrinsics & lighting are latent variables.
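As a rough illustration (not the released code), here is a minimal PyTorch sketch of that factorization: a per-pixel intrinsic latent, a global lighting latent, and a decoder supervised by relighting pairs of the same scene; all module sizes are assumptions.

```python
# Minimal sketch (assumptions, not the released code): factor an image into an
# intrinsic latent and a lighting latent, supervised by pairs of the same scene
# under different illumination.
import torch
import torch.nn as nn

class LatentIntrinsics(nn.Module):
    def __init__(self, ch=32, light_dim=16):
        super().__init__()
        self.intrinsic_enc = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.light_enc = nn.Sequential(  # global lighting code
            nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.SiLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(ch, light_dim),
        )
        self.decoder = nn.Sequential(
            nn.Conv2d(ch + light_dim, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, src_img, tgt_light_img):
        intrinsic = self.intrinsic_enc(src_img)    # per-pixel, lighting-free
        light = self.light_enc(tgt_light_img)      # global lighting vector
        light_map = light[:, :, None, None].expand(-1, -1, *intrinsic.shape[-2:])
        return self.decoder(torch.cat([intrinsic, light_map], dim=1))

# Training idea: reconstruct image B of a scene from (image A, lighting of B);
# albedo-like maps then emerge from the intrinsic latent without albedo labels.
model = LatentIntrinsics()
relit = model(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64))
print(relit.shape)  # torch.Size([2, 3, 64, 64])
```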
We’re thrilled to release our 360° video dataset. A simple conditional diffusion model trained with explicit camera control can synthesize novel 3D scenes, all from a single image input! #NeurIPS2024
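As a hedged sketch of what explicit camera control can look like (the paper's actual conditioning may differ), here is a toy denoiser that takes the single input image as extra channels and a target camera pose embedded alongside the timestep; all shapes and modules are assumptions.

```python
# Minimal sketch (the conditioning scheme is an assumption, not the paper's
# exact design): condition a denoiser on the single reference image via channel
# concatenation and on the target camera via an embedded 3x4 extrinsic matrix.
import torch
import torch.nn as nn

class CameraConditionedDenoiser(nn.Module):
    def __init__(self, ch=64, emb_dim=128):
        super().__init__()
        self.time_emb = nn.Sequential(nn.Linear(1, emb_dim), nn.SiLU(), nn.Linear(emb_dim, emb_dim))
        self.cam_emb = nn.Sequential(nn.Linear(12, emb_dim), nn.SiLU(), nn.Linear(emb_dim, emb_dim))
        self.in_conv = nn.Conv2d(6, ch, 3, padding=1)   # noisy target view + reference image
        self.cond_proj = nn.Linear(emb_dim, ch)         # per-channel conditioning shift
        self.out_conv = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, noisy, ref_img, t, cam_extrinsics):
        emb = self.time_emb(t[:, None].float()) + self.cam_emb(cam_extrinsics.flatten(1))
        h = self.in_conv(torch.cat([noisy, ref_img], dim=1)) + self.cond_proj(emb)[:, :, None, None]
        return self.out_conv(torch.relu(h))             # predicted noise

model = CameraConditionedDenoiser()
out = model(torch.randn(2, 3, 64, 64),       # noisy target view
            torch.randn(2, 3, 64, 64),       # single input image
            torch.randint(0, 1000, (2,)),    # diffusion timestep
            torch.randn(2, 3, 4))            # hypothetical target camera pose
print(out.shape)  # torch.Size([2, 3, 64, 64])
```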
📍 East Exhibit Hall A-C, #1500
🗓️ Wed 11 Dec, 16:30 – 19:30 PT
Latent Intrinsics: Intrinsic image representation as latent variables, resulting in emergent albedo-like maps for free.
Code is now available: github.com/xiao7199/Lat...
arXiv: arxiv.org/abs/2405.21074
arXiv: arxiv.org/abs/2412.00177
project: luminet-relight.github.io
📝: Paper link: arxiv.org/abs/2412.00177
🔗: Project page: luminet-relight.github.io
🐦: Twitter thread (longer version): https://x.com/anand_bhattad/status/1864479658353242485
Work led by Xiaoyan Xing (PhD student at the University of Amsterdam)!
How accurate are these results? That's very hard to tell at the moment 🤔
But our tests on the MIT data, our user study, and our qualitative results all point to us being on the right track. Generative models seem to know how light interacts with our world.
Evaluation is done on the challenging MIT multi-illumination dataset (the only real data we have with ground truth and similar lighting conditions across scenes). By combining latent intrinsics with generative models, we cut the error (RMSE) by 35%! See our paper for qualitative results.
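For reference, the metric arithmetic behind a "35% lower RMSE" claim, with placeholder numbers rather than our evaluation script:

```python
# Generic RMSE and relative-reduction arithmetic (placeholder values only).
import numpy as np

def rmse(pred, gt):
    return float(np.sqrt(np.mean((pred - gt) ** 2)))

pred, gt = np.zeros((4, 4, 3)), np.full((4, 4, 3), 0.1)
print(rmse(pred, gt))  # ≈ 0.1 on this toy pair

baseline_rmse, ours_rmse = 0.20, 0.13                # placeholder values
reduction = 100 * (baseline_rmse - ours_rmse) / baseline_rmse
print(f"relative RMSE reduction: {reduction:.0f}%")  # 35%
```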
We used our good old StyLitGAN from #CVPR2024 to generate diverse training data and filtered it to keep the top 1000 images using CLIP. We combine this with the existing MIT Multi-Illum & Big Time datasets; about 2,500 unique images make up our training set.
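A sketch of the kind of CLIP-based filtering described here; the scoring prompt, file paths, and keep-top-1000 criterion are illustrative assumptions, not the exact pipeline:

```python
# Sketch: rank generated images with CLIP and keep the top 1000.
# The prompt and paths below are hypothetical.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def clip_scores(image_paths, prompt="a well-lit, realistic indoor scene", batch_size=64):
    scores = []
    for i in range(0, len(image_paths), batch_size):
        images = [Image.open(p).convert("RGB") for p in image_paths[i:i + batch_size]]
        inputs = processor(text=[prompt], images=images, return_tensors="pt", padding=True)
        with torch.no_grad():
            out = model(**inputs)
        scores.append(out.logits_per_image.squeeze(-1))  # one image-text similarity per image
    return torch.cat(scores)

paths = [f"stylitgan_samples/{i:05d}.png" for i in range(10_000)]  # hypothetical paths
scores = clip_scores(paths)
top_1000 = [paths[i] for i in scores.topk(1000).indices.tolist()]
```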
Given two images (a source image to be relit and a target lighting condition), we first extract latent intrinsic image representations for both using our NeurIPS 2024 work, and then train a simple latent ControlNet.
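A minimal sketch of that conditioning, with all shapes and module sizes assumed: the control signal combines the source image's intrinsic latent with the target image's lighting latent, and a small ControlNet-style branch injects it into the frozen latent denoiser.

```python
# Minimal sketch (shapes and module sizes are assumptions): build the control
# signal from the source's intrinsic latent and the target's lighting latent,
# and add it as a residual for the frozen latent denoiser.
import torch
import torch.nn as nn

class LatentControlNet(nn.Module):
    def __init__(self, latent_ch=4, intrinsic_ch=32, light_dim=16, ch=64):
        super().__init__()
        self.control_enc = nn.Sequential(
            nn.Conv2d(intrinsic_ch + light_dim, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, latent_ch, 3, padding=1),
        )

    def forward(self, noisy_latent, src_intrinsic, tgt_light):
        # Broadcast the target image's global lighting code over space, then
        # fuse it with the source image's per-pixel intrinsic latent.
        h, w = src_intrinsic.shape[-2:]
        light_map = tgt_light[:, :, None, None].expand(-1, -1, h, w)
        control = self.control_enc(torch.cat([src_intrinsic, light_map], dim=1))
        return noisy_latent + control  # residual fed to the frozen denoiser

controlnet = LatentControlNet()
noisy_latent = torch.randn(1, 4, 32, 32)     # latent-space sample
src_intrinsic = torch.randn(1, 32, 32, 32)   # from the latent-intrinsics encoder
tgt_light = torch.randn(1, 16)               # lighting code of the target image
print(controlnet(noisy_latent, src_intrinsic, tgt_light).shape)  # (1, 4, 32, 32)
```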