Lightnews — Scholar-powered news

Julius Cheng

@juliuscheng.bsky.social

Finishing up PhD in NLP at University of Cambridge. Deciding whether to put my weirdo ML thoughts on here or just be normal


Julius Cheng

@juliuscheng.bsky.social

They say it's something like 20-30% but 0% of my papers get accepted!! Something definitely wrong here

January 23, 2025 at 11:05 AM

Julius Cheng

@juliuscheng.bsky.social

Our experiments are on machine translation, but this method works with any generator + reranker setup!

Eager to hear your thoughts, and happy reranking!

January 23, 2025 at 1:32 AM

Julius Cheng

@juliuscheng.bsky.social

Bonus: we show how multi-fidelity Bayesian optimization can use a smaller, faster proxy scoring model to search even more efficiently. We get the best performance by training a distilled model from our main CometKiwi model.

January 23, 2025 at 1:32 AM

Julius Cheng

@juliuscheng.bsky.social

The candidate pool is actually a search space, and you can model your uncertainty about the candidates you haven't scored yet with GP regression. Use BayesOpt to search the pool for promising candidates.

This nearly gets the maximum achievable score with only 70/200 scoring calls!

January 23, 2025 at 1:32 AM
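
A rough sketch of the idea in the post above, not the authors' implementation: treat the pool as a finite search space, fit a GP to the candidates scored so far, and spend each remaining scoring call on the candidate with the highest optimistic estimate. Everything specific here is an assumption for illustration: the candidate `embeddings`, the RBF kernel, the upper-confidence-bound acquisition rule, and the synthetic `score_fn` standing in for an expensive scorer. The posts do not say which kernel, features, or acquisition function the paper uses.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and the rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_posterior(x_obs, y_obs, x_all, noise=1e-4, length_scale=1.0):
    # GP posterior mean and standard deviation at every row of x_all,
    # using the mean of the observed scores as a constant prior mean.
    y_mean = y_obs.mean()
    k_oo = rbf_kernel(x_obs, x_obs, length_scale) + noise * np.eye(len(x_obs))
    k_ao = rbf_kernel(x_all, x_obs, length_scale)
    solved = np.linalg.solve(k_oo, np.column_stack([y_obs - y_mean, k_ao.T]))
    mean = y_mean + k_ao @ solved[:, 0]
    var = 1.0 - np.sum(k_ao * solved[:, 1:].T, axis=1)
    return mean, np.sqrt(np.maximum(var, 0.0))

def bayesopt_rerank(embeddings, score_fn, budget, n_init=5, beta=2.0, seed=0):
    # Score at most `budget` candidates and return the best index found.
    rng = np.random.default_rng(seed)
    n = len(embeddings)
    limit = min(budget, n)
    scored = {}
    # Start from a few randomly chosen candidates.
    for i in rng.choice(n, size=min(n_init, limit), replace=False):
        scored[int(i)] = score_fn(int(i))
    while len(scored) < limit:
        keys = list(scored)
        idx = np.array(keys)
        y = np.array([scored[k] for k in keys])
        mean, std = gp_posterior(embeddings[idx], y, embeddings)
        ucb = mean + beta * std      # optimistic estimate of each score
        ucb[idx] = -np.inf           # never re-score a candidate
        nxt = int(np.argmax(ucb))
        scored[nxt] = score_fn(nxt)
    best = max(scored, key=scored.get)
    return best, scored

# Synthetic demo: 200 candidates with made-up 2-D embeddings and a smooth
# hidden score. Real embeddings and scores would come from the MT system.
demo_rng = np.random.default_rng(0)
pool = demo_rng.normal(size=(200, 2))
true_scores = -np.sum((pool - 0.5) ** 2, axis=1)
calls = []

def score_fn(i):
    calls.append(i)                  # count calls to the "expensive" scorer
    return float(true_scores[i])

best, scored = bayesopt_rerank(pool, score_fn, budget=70)
print(f"scoring calls: {len(calls)}, best found: {scored[best]:.4f}, "
      f"pool maximum: {true_scores.max():.4f}")
```

How close the best found score gets to the pool maximum depends on how well the embeddings predict the scores, which is the part this toy setup simply assumes.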

Julius Cheng

@juliuscheng.bsky.social

Reranking is expensive and we show that you don't need to score every candidate in the candidate pool.

Use Bayesian optimization with GPs!

January 23, 2025 at 1:32 AM

Julius Cheng

@juliuscheng.bsky.social

Language models for MT are good at generating large candidate pools that contain good translations; they're less good at assigning the highest score to the best translation.

This is where reranking comes in: rescoring with COMET, noisy channel decoding, minimum Bayes risk, etc.

January 23, 2025 at 1:32 AM
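
For reference, plain reranking as described in the post above can be sketched in a few lines: score every candidate, keep the argmax. The `toy_score` function is a made-up stand-in for a learned scorer such as COMET; it only exists so the example runs. The point is the cost: one scoring call per candidate.

```python
from typing import Callable, Sequence, Tuple

def rerank(candidates: Sequence[str],
           score_fn: Callable[[str], float]) -> Tuple[str, int]:
    # Exhaustive reranking: score all candidates and return the best one
    # together with the number of scoring calls that were made.
    scores = [score_fn(c) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], len(scores)

def toy_score(candidate: str) -> float:
    # Hypothetical scorer: prefers candidates close to four words long.
    return -abs(len(candidate.split()) - 4)

candidates = [
    "the cat sat",
    "the cat sat on mat",
    "a cat is sitting on the mat",
    "the cat sat down",
]
best_candidate, n_calls = rerank(candidates, toy_score)
print(best_candidate, n_calls)
```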
