Chandler Smith
@chansmi.bsky.social
Multi-Agent Researcher at CAIF | applied research at IQT | Thinking about making MA systems go well
We see very strong performance across MATH, GSM8K, and CommonsenseQA against trained and untrained baselines with Llama 3.1 8B!
December 6, 2024 at 10:38 PM
Just by looking at these trees, how do you tell which branches are useful for post-training without human feedback or trained PRMs? Value iteration offers a simple approach: propagate labels back through the branches, with a thresholding factor to label the quality of reasoning steps.
December 6, 2024 at 10:38 PM
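The label propagation described above can be sketched roughly as follows. This is an illustrative assumption of how value-iteration-style backup over a reasoning tree might look, not MALT's exact formulation; the `Node` fields, mean-over-children backup, and threshold value are all hypothetical choices.

```python
# Hypothetical sketch: back up terminal correctness rewards from the leaves of
# a reasoning-step tree, then threshold each node's value into a binary label
# usable for post-training. Field names and the 0.5 threshold are assumptions.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    reward: float = 0.0                      # terminal reward at leaves (1.0 correct, 0.0 not)
    children: List["Node"] = field(default_factory=list)
    value: float = 0.0                       # propagated value estimate
    label: int = 0                           # binary quality label after thresholding


def propagate(node: Node, threshold: float = 0.5) -> float:
    """A step's value is the mean value of its children; leaves keep their reward."""
    if not node.children:
        node.value = node.reward
    else:
        node.value = sum(propagate(c, threshold) for c in node.children) / len(node.children)
    node.label = int(node.value >= threshold)  # threshold into a training label
    return node.value


# Tiny tree: one branch ends correct, the other incorrect.
good_leaf = Node(reward=1.0)
bad_leaf = Node(reward=0.0)
root = Node(children=[Node(children=[good_leaf]), Node(children=[bad_leaf])])
propagate(root)
# root.value averages its two sub-branches (0.5); the good branch's parent gets value 1.0.
```

With real search trees, one would replace the hand-built tree with sampled rollouts and attach rewards from an answer checker; the backup itself stays the same.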
Our goal was to develop techniques where a system of multiple models could be trained together. We use a generator, critic, and refinement setting that mimics how humans might interact with LLMs.
December 6, 2024 at 10:38 PM
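The generator-critic-refinement setting above can be sketched as a simple three-stage pipeline. This is a minimal illustration assuming each role is an LLM behind a `complete(prompt) -> str` callable; the function name, role prompts, and toy model here are hypothetical, not MALT's actual prompts or models.

```python
# Illustrative sketch of a generator -> critic -> refinement pipeline.
# `complete` stands in for any LLM call; prompts are placeholder assumptions.
from typing import Callable


def malt_style_pipeline(question: str, complete: Callable[[str], str]) -> str:
    # 1. Generator drafts an initial answer.
    draft = complete(f"Answer the question.\nQ: {question}\nA:")
    # 2. Critic reviews the draft and flags possible errors.
    critique = complete(
        f"Critique this answer for mistakes.\nQ: {question}\nDraft: {draft}\nCritique:"
    )
    # 3. Refiner produces a final answer conditioned on the draft and critique.
    final = complete(
        f"Refine the draft using the critique.\nQ: {question}\n"
        f"Draft: {draft}\nCritique: {critique}\nFinal:"
    )
    return final


# Usage with a toy stand-in for a model call:
def toy_model(prompt: str) -> str:
    return "42" if "Final:" in prompt else "needs work"


print(malt_style_pipeline("What is 6 * 7?", toy_model))
```

In a trained setup, each of the three roles would be a separate specialized model rather than one shared callable, which is what makes per-role credit assignment necessary.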
🚀🚨 Excited to announce our work on Multi-Agent LLM Training!

MALT is a multi-agent configuration that leverages synthetic data generation and credit-assignment strategies for post-training specialized models that solve problems together.
December 6, 2024 at 10:38 PM