Lightnews — Scholar-powered news

Reposted by Chandler Smith

@ankareuel.bsky.social

Submitting a benchmark to
ICML? Check out our NeurIPS Spotlight paper BetterBench! We outline best practices for benchmark design, implementation & reporting to help shift community norms. Be part of the change! 🙌

+ Add your benchmark to our database for visibility: betterbench.stanford.edu

Anka Reuel ➡️ NeurIPS @ankareuel.bsky.social · Nov 25

🚨 NeurIPS 2024 Spotlight
Did you know we lack standards for AI benchmarks, despite their role in tracking progress, comparing models, and shaping policy? 🤯 Enter BetterBench–our framework with 46 criteria to assess benchmark quality: betterbench.stanford.edu 1/x

January 27, 2025 at 10:02 PM

Reposted by Chandler Smith

Vincent Conitzer

@conitzer.bsky.social

The 2025 Cooperative AI summer school (9-13 July 2025 near London) is now accepting applications, due March 7th!
www.cooperativeai.com/summer-schoo...

Cooperative AI

www.cooperativeai.com

January 9, 2025 at 7:25 PM

Chandler Smith

@chansmi.bsky.social

Very excited to read this!

Chris Amato @cjdamato.bsky.social · Jan 7

I have a draft of my introduction to cooperative multi-agent reinforcement learning on arxiv. Check it out and let me know any feedback you have. The plan is to polish and extend the material into a more comprehensive text with Frans Oliehoek.

arxiv.org/abs/2405.06161

A First Introduction to Cooperative Multi-Agent Reinforcement Learning

Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. While numerous approaches have been developed, they can be broadly categorized into three main types: centralized ...

arxiv.org

January 7, 2025 at 4:40 PM

Chandler Smith

@chansmi.bsky.social

On my way to NeurIPS ‘24 ✈️ to present our Spotlight paper Betterbench and the Concordia Contest!

Would love to connect with folks and chat anything multi-agent, agentic AI, benchmarking, etc.

I am applying for fall ‘25 PhDs. Ping me if you have advice or there may be a fit!

December 9, 2024 at 7:44 PM

Chandler Smith

@chansmi.bsky.social

🚀🚨 Excited to announce our work on Multi-Agent LLM Training!

MALT is a multi-agent configuration that leverages synthetic data generation and credit assignment strategies for post-training specialized models solving problems together

December 6, 2024 at 10:38 PM

Chandler Smith

@chansmi.bsky.social

🚀 Check out our @neuripsconf.bsky.social Spotlight paper Betterbench, which outlines new standards in benchmarking AI! Delighted to have it featured in
@techreviewjp.bsky.social

Anka Reuel ➡️ NeurIPS @ankareuel.bsky.social · Nov 26

Thrilled our NeurIPS Spotlight paper BetterBench is featured by MIT @technologyreview.com! 🎉

Article: bit.ly/3Zo1rgw
Paper: bit.ly/4eMSZfw
Website & Scores: betterbench.stanford.edu

Please share widely & join us in setting new standards for better AI benchmarking! ❤️

MIT Technology Review @technologyreview.com · Nov 26

Many of the most popular benchmarks for AI models are outdated or poorly designed.

November 26, 2024 at 5:30 PM

Reposted by Chandler Smith

Anka Reuel ➡️ NeurIPS

@ankareuel.bsky.social

🚨 NeurIPS 2024 Spotlight
Did you know we lack standards for AI benchmarks, despite their role in tracking progress, comparing models, and shaping policy? 🤯 Enter BetterBench–our framework with 46 criteria to assess benchmark quality: betterbench.stanford.edu 1/x

November 25, 2024 at 7:02 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news