Lightnews — Scholar-powered news

Weijie Su

@wjsu.bsky.social

170 followers 31 following 17 posts

Associate Professor at University of Pennsylvania

Weijie Su

@wjsu.bsky.social

Another new follow-up paper:

arxiv.org/abs/2505.20627

It studies an alternative to RLHF: Nash learning from human feedback.

Fundamental Limits of Game-Theoretic LLM Alignment: Smith Consistency and Preference Matching

Nash Learning from Human Feedback is a game-theoretic framework for aligning large language models (LLMs) with human preferences by modeling learning as a two-player zero-sum game. However, using raw ...

May 30, 2025 at 4:00 PM

Weijie Su

@wjsu.bsky.social

Our analysis shows that it is natural to use the polar decomposition from a defining viewpoint. This gives rise to nuclear norm scaling: the update will vanish as the gradient becomes small, automatically! In contrast, Muon needs to manually tune the factor for the ortho matrix to achieve this.

May 29, 2025 at 5:13 PM
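The scaling behavior described in the post above can be illustrated with a small NumPy sketch. It rests on assumptions not stated in the post: the polar factor is computed from an exact SVD, the update is taken to be the nuclear norm times the polar factor, and the function names are hypothetical. The unscaled variant is only a stand-in for an orthogonalized update of the kind the post attributes to Muon, not Muon's actual implementation.

```python
import numpy as np

def polar_factor(grad):
    """Return the orthogonal polar factor U V^T and the singular values of grad."""
    # Thin SVD: grad = U diag(s) V^T
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    return u @ vt, s

def nuclear_scaled_update(grad):
    """Assumed form of the update: nuclear norm of grad times its polar factor."""
    q, s = polar_factor(grad)
    return s.sum() * q  # s.sum() is the nuclear norm

def unscaled_update(grad):
    """Polar factor alone; its size does not depend on the size of grad."""
    q, _ = polar_factor(grad)
    return q

rng = np.random.default_rng(0)
g = rng.standard_normal((4, 3))
for scale in (1.0, 1e-3, 1e-6):
    scaled = np.linalg.norm(nuclear_scaled_update(scale * g))
    plain = np.linalg.norm(unscaled_update(scale * g))
    print(f"scale={scale:g}  nuclear-scaled={scaled:.3e}  unscaled={plain:.3e}")
```

Under these assumptions, the nuclear-scaled update shrinks in proportion to the gradient, while the unscaled polar factor keeps the same norm and would need a separately chosen multiplier to shrink.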

Weijie Su

@wjsu.bsky.social

Some context: www.weijie-su.com/llm/

Statistical Foundations of Large Language Models

May 29, 2025 at 1:17 PM

Weijie Su

@wjsu.bsky.social

The ranking method was tested at ICML in 2023, 2024, and 2025. I hope we'll finally use it to improve ML/AI review processes soon. Here's an article about the method, from its conception to experimentation:

www.weijie-su.com/openrank/

How to Prevent a Tragedy of the Commons for AI Research?

May 27, 2025 at 5:08 PM

Weijie Su

@wjsu.bsky.social

Add me plz. Thx!

November 28, 2024 at 1:29 AM
