Shyamgopal Karthik
shyamgopal.bsky.social
PhD at Tübingen. Working on post-training diffusion and multimodal models. Previously a research intern at Snapchat and Naver Labs.
https://sgk98.github.io/
Earlier this year, we'd spent a lot of time pushing the limits of blind baselines for vision-language compositionality benchmarks and found that they're surprisingly close to state-of-the-art on several benchmarks, and that filtering samples wasn't a great solution.
Link: arxiv.org/abs/2506.08227
November 25, 2025 at 7:48 AM
Was a very fun (and quick) investigation into biases of multimodal benchmarks, this time on tasks designed for "Spatial Supersensing" introduced by Cambrian-S with some great folks!
November 25, 2025 at 7:48 AM
I'm in Nashville this week attending #CVPR2025. Excited to discuss post-training VLMs and diffusion models!
June 11, 2025 at 3:04 AM
After a break of over 2 years, I'm attending a conference again! Excited to attend NeurIPS, even more so to be presenting ReNO, getting inference-time scaling and preference optimization to work for text-to-image generation.
Do reach out if you'd like to chat!
December 9, 2024 at 9:27 PM