Lightnews — Scholar-powered news

Aarash Feizi

@aarashfeizi.bsky.social

Visiting Researcher at @ServiceNowRSRCH | PhD student in @mcgillu and @Mila_Quebec | Prev. @RecursionPharma

https://aarashfeizi.github.io/


🧵 5/7

✅ PairBench correlates strongly with existing benchmarks, meaning it can serve as a low-cost alternative to expensive human-annotated benchmarks!

This makes it easier to compare and rank models efficiently—without excessive computational costs.

February 27, 2025 at 7:54 PM

🚨 Excited to introduce PairBench! 🚨

💡 TL;DR: VLM-judges can fail at data comparison!

✅ PairBench helps you pick the right one by testing alignment, symmetry, smoothness & controllability—ensuring reliable auto-evaluation.

📄 Paper: arxiv.org/abs/2502.15210

🧵 Thread: 👇

February 27, 2025 at 7:50 PM
