Zeeshan Zia
zeeshanzia.bsky.social
AI and Computer Vision @Retrocausal. Ex-Microsoft Research, Imperial College London, ETH-Zurich
Reposted by Zeeshan Zia
Someone wrote that you never really love another person; just a hash of them, a one-way transformation into your own experience. You never really know what was going on inside their head; it's fundamentally unknowable. Picture unrelated.
May 2, 2025 at 1:44 PM
Reposted by Zeeshan Zia
Right now, most Americans can relate to this message from the '90s.
April 19, 2025 at 3:28 PM
Reposted by Zeeshan Zia
This is the kind of hard-hitting analysis I want to see. How many ICLR papers is a paper in your area worth?
April 4, 2025 at 3:41 PM
Reposted by Zeeshan Zia
[1/10] Is scene understanding solved?

Models today can label pixels and detect objects with high accuracy. But does that mean they truly understand scenes?

Super excited to share our new paper and a new task in computer vision: Visual Jenga!

📄 arxiv.org/abs/2503.21770
🔗 visualjenga.github.io
March 29, 2025 at 7:36 PM
Reposted by Zeeshan Zia
I never hear humanoid robot startups talking about stereo matching or depth estimation or lidar, which I find a bit odd. Maybe it's cooler to talk about VLMs?
March 15, 2025 at 5:51 PM
Reposted by Zeeshan Zia
👀
March 5, 2025 at 7:39 PM
Reposted by Zeeshan Zia
It finally dawned on me. I couldn't figure out what was going on in Washington, plus all the hype about humanoids, including the new genre of "humanoid theater" videos, unmoored from reality. Then I realized, we must indeed be living in a simulation. Scripted by an LLM. Our world is a confabulation!
February 27, 2025 at 8:24 PM
Reposted by Zeeshan Zia
Every single shot of teleoperated robots needs to have a clearly visible “TELEOPERATED” label in the corner! Teleoperated demos are great to show off your impressive hardware but it’s grossly misleading to let people think they’re watching an autonomous robot, driven by AI, when they aren’t.
February 22, 2025 at 8:23 AM
Reposted by Zeeshan Zia
Half of “prompt engineering” was actually just prompting LLMs to act like Reasoners before the labs realized that was a thing.

(Chain of thought/think step-by-step was the first powerful prompting technique that was discovered; now Reasoners do it automatically.)
February 13, 2025 at 12:44 AM
Reposted by Zeeshan Zia
I asked R1 to divide 3442524 by 31244. It thought for almost five minutes, and came up with an answer that was correct, but only up to two decimals. Reading its reasoning is fascinating, because of how un-computer-like it is.
January 23, 2025 at 6:36 PM
💯

The "virtual employees" formulation is being pushed by sales teams because it establishes a high price point for the AI tool and is easy to fit into existing enterprise budgets. Just replace the headcount with the AI employee.

But it's a terrible UX, and that's why it's going to fail.
AI in the workplace is going to show up integrated in software tools like Salesforce and Canva, not as ‘virtual employees’.
‘Virtual employees’ could join workforce as soon as this year, OpenAI boss says
January 8, 2025 at 10:04 AM
A late intro as I try to rebuild my network in this new country!

I am a Computer Vision scientist building a company to help manufacturing supervisors optimize assembly line operations.

I did my PhD at ETH Zurich in 2013, lifting object detections in single images to 3D scene representations.
January 8, 2025 at 7:06 AM
Watching a long video but only peeking at a few frames, it’s easy to miss details and lose track of events. That’s the core issue video models face—leading to incomplete or inaccurate predictions.

Our latest work addresses this problem!

YT: youtu.be/lEUluMdNHcc

arXiv: arxiv.org/abs/2412.02930
Video LLMs for Temporal Reasoning in Long Videos
YouTube video by Retrocausal AI
January 6, 2025 at 4:07 PM