Vignesh Padmanabhan
ai-slayer.bsky.social
🌟 Lead Data Scientist @ Codvo.ai

🔬 Expertise in:
- LLMs
- Optimization Problems
- Computer Vision
- Recommendation Systems
Reposted by Vignesh Padmanabhan
A common trend across recent research on using reinforcement learning to train reasoning models is that the clipping operation within a trust region (core to PPO, adopted by GRPO) squashes rare tokens that are key to clever behaviors such as verification or backtracking.
June 17, 2025 at 2:38 AM
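To make the point concrete, here is a minimal sketch of the PPO/GRPO clipped surrogate objective the post refers to. The probability ratios and advantages below are made-up numbers for illustration, not from any paper.

```python
# Illustrative sketch of the PPO/GRPO clipped surrogate objective.
# All ratios and advantages below are made-up numbers.

def clipped_surrogate(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """PPO objective for one token: min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# A common token whose probability barely moved: the signal passes through.
print(clipped_surrogate(ratio=1.05, advantage=1.0))  # 1.05

# A rare token whose probability tripled under the new policy: the clip caps
# the objective at (1 + eps) * A, so the extra signal that would reinforce
# the rare-but-useful token is discarded.
print(clipped_surrogate(ratio=3.0, advantage=1.0))   # 1.2
```

This is the mechanism the post describes: large ratio jumps are exactly what rare tokens produce, and the clip is what flattens their gradient contribution.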
Reposted by Vignesh Padmanabhan
I recently shared some of my reflections on how to use probabilistic classifiers for optimal decision-making under uncertainty at @pydataparis.bsky.social 2024.

Here is the recording of the presentation:

www.youtube.com/watch?v=-gYn...
November 27, 2024 at 2:17 PM
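The core idea of the talk (decide by minimizing expected cost rather than thresholding at 0.5) can be sketched in a few lines. The false-positive and false-negative costs below are illustrative assumptions, not values from the presentation.

```python
# Sketch: turning a probabilistic classifier's output into a decision by
# minimizing expected cost. The cost values are illustrative assumptions.

def expected_cost(p_positive: float, act: bool,
                  cost_fp: float = 1.0, cost_fn: float = 5.0) -> float:
    """Expected cost of acting (predicting positive) vs. abstaining."""
    if act:
        return (1.0 - p_positive) * cost_fp  # acted, but truly negative
    return p_positive * cost_fn              # abstained, but truly positive

def decide(p_positive: float) -> bool:
    """Act whenever acting has lower expected cost. With these costs the
    optimal threshold is cost_fp / (cost_fp + cost_fn) = 1/6, not 0.5."""
    return expected_cost(p_positive, True) < expected_cost(p_positive, False)

print(decide(0.10))  # False: below the cost-derived threshold
print(decide(0.30))  # True
```

Note that this only works if the classifier's probabilities are well calibrated, which is a recurring theme in this line of work.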
Reposted by Vignesh Padmanabhan
Bringing structure and recommended practices to Machine Learning projects can be challenging. Even experienced data scientists struggle with it.

That's why we built skore – your companion when modeling with scikit-learn. Check it out and let us know what you think!

github.com/probabl-ai/s...
GitHub - probabl-ai/skore: Your scikit-learn Modeling Companion
github.com
December 13, 2024 at 9:30 AM
Which setup would you choose for running large language models (LLMs) locally?

Option 1:
• Apple M4 Max
• 14-core CPU, 32-core GPU
• 36 GB unified memory
• 1 TB SSD

Option 2:
• Apple M4 Pro
• 14-core CPU, 20-core GPU
• 48 GB unified memory
• 1 TB SSD
November 25, 2024 at 6:31 PM
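One way to compare the two options is a back-of-the-envelope estimate of how large a model each can hold. The rule of thumb below (weights dominate, with some headroom reserved for the KV cache and the OS) and the 8 GB headroom figure are assumptions for illustration.

```python
# Back-of-the-envelope sketch: roughly how many parameters fit in memory.
# Assumptions: weights dominate, and ~8 GB is reserved for KV cache + OS.

def max_params_billion(memory_gb: float, bytes_per_param: float,
                       headroom_gb: float = 8.0) -> float:
    usable_bytes = (memory_gb - headroom_gb) * 1e9
    return usable_bytes / bytes_per_param / 1e9

# At 4-bit quantization (~0.5 bytes per parameter):
print(round(max_params_billion(36, 0.5)))  # 56  -> Option 1: ~56B params
print(round(max_params_billion(48, 0.5)))  # 80  -> Option 2: ~80B params
```

By this estimate, the extra unified memory of Option 2 matters more for local LLMs than Option 1's larger GPU, since memory capacity bounds which models fit at all.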
Reposted by Vignesh Padmanabhan
Everything you always wanted to ask about entropy but didn't know whom, by John Baez.

arxiv.org/abs/2409.09232
What is Entropy?
This short book is an elementary course on entropy, leading up to a calculation of the entropy of hydrogen gas at standard temperature and pressure. Topics covered include information, Shannon entropy...
arxiv.org
November 24, 2024 at 6:48 AM
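The book starts from Shannon entropy, which is easy to sketch: the expected number of bits needed to describe the outcome of a random variable.

```python
import math

# Sketch: Shannon entropy H(p) = -sum_i p_i * log2(p_i),
# the starting point of the book's information-theoretic treatment.

def shannon_entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(shannon_entropy([0.5, 0.5]))   # 1.0 bit: a fair coin
print(shannon_entropy([0.25] * 4))   # 2.0 bits: four equally likely outcomes
```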
@bsky.app The translate feature opens a new tab to do the conversion. Is there an update that translates to English in place?
November 23, 2024 at 9:39 AM
Love the Starter Pack by @bsky.app... brilliant idea! It quickly finds all your X & Threads follows in one go!
November 23, 2024 at 9:34 AM
Reposted by Vignesh Padmanabhan
We're always updating the pydata & scipy project starter pack:
go.bsky.app/6HkrMcp

Hello @scikit-learn.bsky.social , @networkx.bsky.social , @scipyconf.bsky.social
November 22, 2024 at 5:46 PM
Reposted by Vignesh Padmanabhan
One of my fav projects: LeanRL, a simple RL library that provides recipes for fast RL training using torch.compile and cudagraphs.
Using these, we got >6x speed-ups compared to the original CleanRL implementations.
github.com/pytorch-labs...
November 22, 2024 at 6:38 AM
Reposted by Vignesh Padmanabhan
www.anthropic.com/research/sta...

This is an excellent attempt (blog & paper) at bringing more statistical rigor to evaluation of ML models (this is specifically focused on LLM evals).

I feel like we need to have similar clear standards for many types of predictive models in biology. 1/
A statistical approach to model evaluations
A research paper from Anthropic on how to apply statistics to improve language model evaluations
www.anthropic.com
November 22, 2024 at 8:29 AM
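A minimal sketch of the kind of rigor the post is asking for: report an eval score with a confidence interval instead of a bare accuracy. The normal-approximation interval below is one standard textbook choice (Anthropic's paper discusses several), and the counts are made up.

```python
import math

# Sketch: report an eval score with a 95% confidence interval
# instead of a bare accuracy. The counts below are made up.

def accuracy_with_ci(n_correct: int, n_total: int, z: float = 1.96):
    """Point estimate plus a normal-approximation 95% CI on accuracy."""
    p = n_correct / n_total
    sem = math.sqrt(p * (1.0 - p) / n_total)
    return p, (p - z * sem, p + z * sem)

p, (lo, hi) = accuracy_with_ci(172, 200)
print(f"accuracy = {p:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

With only 200 questions the interval spans roughly ten accuracy points, which is exactly why two models a few points apart on a small eval may not be meaningfully different.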
Reposted by Vignesh Padmanabhan
Just put together a list of papers to highlight 4 interesting things about transformers & LLMs.

Including a discussion on why the original transformer architecture figure is wrong, and a related approach published in 1991!

https://magazine.sebastianraschka.com/p/why-the-original-transformer-figure
Why the Original Transformer Figure Is Wrong, and Some Other Interesting Historical Tidbits About LLMs
A few months ago, I shared the article, Understanding Large Language Models: A Cross-Section of the Most Relevant Literature To Get Up to Speed, and the positive feedback was very motivating! So, I also added a few papers here and there to keep the list fresh and relevant.
magazine.sebastianraschka.com
May 25, 2023 at 4:12 PM
Reposted by Vignesh Padmanabhan
The Llama 3.2 1B and 3B models are my favorite LLMs -- small but very capable.
If you want to understand what the architectures look like under the hood, I implemented them from scratch (one of the best ways to learn): github.com/rasbt/LLMs-f...
November 20, 2024 at 8:33 AM
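To give a flavor of what "from scratch" means here, below is one small building block of the Llama architecture: RMSNorm, which Llama uses in place of LayerNorm (scale by the root-mean-square, no mean subtraction). This plain-Python version is an illustrative sketch; the linked repository implements it with tensors.

```python
import math

# Sketch of one small Llama building block: RMSNorm.
# Plain-Python version for illustration; real code operates on tensors.

def rmsnorm(x, weight, eps: float = 1e-5):
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for w, v in zip(weight, x)]

out = rmsnorm([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print([round(v, 3) for v in out])  # [0.463, 0.926, 1.389]
```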