Lightnews — Scholar-powered news

Sandesh M

@sandeshm.bsky.social

Test Engineering expert with a keen interest in the ethical development and deployment of AI. I believe rigorous testing is crucial for building trust in AI.

Sandesh M

@sandeshm.bsky.social

Grok thinks we have no chance. What a hater

April 30, 2025 at 9:49 PM

Sandesh M

@sandeshm.bsky.social

Claude is much more optimistic about the humans winning

April 30, 2025 at 9:44 PM

Sandesh M

@sandeshm.bsky.social

DeepSeek thinks we could pull it off, with heavy casualties

April 30, 2025 at 9:43 PM

Sandesh M

@sandeshm.bsky.social

LMArena is the common source of head-to-head results, but it measures user preference, not raw capability. User preference can be swayed by how agreeable the responses are, and accuracy is not verified.

February 24, 2025 at 9:55 PM

Sandesh M

@sandeshm.bsky.social

I wish there were standard capability benchmarks used across the different AI companies. Right now everyone seems to cherry-pick the benchmarks they optimize for, making it hard to directly compare all the options.

February 24, 2025 at 9:54 PM

Sandesh M

@sandeshm.bsky.social

The more powerful part of the video is when the entire stadium full of Quebecers belts out "O Canada" at the top of their lungs 🍁

February 16, 2025 at 11:24 PM
