JF Puget
@jfpuget.bsky.social
Competitive Machine Learning director at NVIDIA, 3x Kaggle Grandmaster CPMP, ENS Ulm alumnus. Kaggle profile: https://www.kaggle.com/cpmpml
I always thought that reasoning does not require language. Well, this seems to be supported by neuroscience; see the screenshot from

arxiv.org/pdf/2412.06769
March 6, 2025 at 9:12 AM
Reposted by JF Puget
Chris Deotte is a Senior Data Scientist and
@jfpuget.bsky.social is a Director and Distinguished Engineer at NVIDIA. They join @seanfalconer.bsky.social to talk about NVIDIA RAPIDS and GPU acceleration for data science tools.

softwareengineeringdaily.com/2025/03/04/n...
NVIDIA RAPIDS and Open Source ML Acceleration with Chris Deotte and Jean-Francois Puget - Software Engineering Daily
NVIDIA RAPIDS is an open-source suite of GPU-accelerated data science and AI libraries. It leverages CUDA and significantly enhances the performance of core Python frameworks including Polars, pandas,...
softwareengineeringdaily.com
March 4, 2025 at 2:06 PM
I have been working with R1 distilled models lately for some agentic workflows (workflows where the output of the LLM is used to decide what to do next). Prompting is different from previous models like Llama, but the bulk of the change is parsing the output to extract what you are interested in. 1/n
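To make "parse the output" concrete, here is a minimal sketch of the kind of post-processing I mean. It assumes the usual R1 convention of a <think>...</think> block followed by the final answer, plus a hypothetical prompt that asks the model to end with a JSON action; the names are illustrative, not from any particular library.

```python
import json
import re

def parse_r1_output(text: str):
    # Drop the <think>...</think> reasoning block; the workflow only acts on what follows it.
    answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
    # Hypothetical convention: the prompt asks the model to end with a JSON action object.
    match = re.search(r"\{.*\}", answer, flags=re.DOTALL)
    return json.loads(match.group(0)) if match else {"action": "respond", "text": answer}

raw = ("<think>The user asks for the weather, so I should call the weather tool.</think>\n"
       '{"action": "get_weather", "city": "Paris"}')
print(parse_r1_output(raw))  # {'action': 'get_weather', 'city': 'Paris'}
```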
March 4, 2025 at 8:57 AM
I looked at AIME problems and one thing strikes me: all problems are about computing a number. This is a tiny part of math.

AIME problems olympiads.us/past-exams/2...

thread:
AIME I
February 6th, 2025 | The first American Invitational Mathematics Examination of the year. Students tackle 15 challenging problems in three hours.
olympiads.us
February 8, 2025 at 11:35 AM
I asked R1 (full model, locally hosted) to solve this logic puzzle.

Which answer in this list is the correct answer to this question?
All of the below.
None of the below.
All of the above.
One of the above.
None of the above.
None of the above.

It solves it correctly.
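For reference, the puzzle can be checked mechanically. Here is a minimal brute-force sketch, assuming the usual reading where each option's "above"/"below" refers to the other options in the list; it finds exactly one consistent assignment, with the first "None of the above" (option 5) as the correct answer.

```python
from itertools import product

# Statements, indexed 1..6, under the usual reading:
# 1: all of 2-6 are true        4: exactly one of 1-3 is true ("at least one" gives the same result)
# 2: none of 3-6 are true       5: none of 1-4 are true
# 3: all of 1-2 are true        6: none of 1-5 are true
def consistent(t):
    # t[i] is the assumed truth value of statement i+1
    checks = [
        t[0] == all(t[1:6]),
        t[1] == (not any(t[2:6])),
        t[2] == all(t[0:2]),
        t[3] == (sum(t[0:3]) == 1),
        t[4] == (not any(t[0:4])),
        t[5] == (not any(t[0:5])),
    ]
    return all(checks)

for t in product([False, True], repeat=6):
    if consistent(t):
        print([i + 1 for i, v in enumerate(t) if v])  # -> [5]
```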
January 30, 2025 at 6:24 PM
How to make ChatGPT speak like Adolf Hitler.

This is not a criticism of ChatGPT 4o nor of OpenAI's work. I do think it is important to be able to teach people about bad things that happened.

With that in mind, here is the thing: chatgpt.com/share/6794fa...
January 25, 2025 at 6:41 PM
Interested in KV Cache compression? Have a look at my team's KV Press.

You can start from HuggingFace blog: huggingface.co/blog/nvidia/...
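A minimal usage sketch, based on the README and blog post; the kv-press-text-generation pipeline and ExpectedAttentionPress names are as documented there, but check the repo for the current API, as they may change across versions.

```python
# Sketch only: pick a "press" (a KV cache compression method) and pass it to the
# custom transformers pipeline registered by kvpress.
from transformers import pipeline
from kvpress import ExpectedAttentionPress

pipe = pipeline(
    "kv-press-text-generation",          # custom pipeline registered by kvpress
    model="meta-llama/Meta-Llama-3.1-8B-Instruct",
    device="cuda:0",
    torch_dtype="auto",
)

context = "A very long document whose KV cache you want to compress once..."
question = "What is this document about?"

# Keep roughly half of the KV cache, scoring keys/values by expected attention.
press = ExpectedAttentionPress(compression_ratio=0.5)
answer = pipe(context, question=question, press=press)["answer"]
print(answer)
```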
January 25, 2025 at 2:21 PM
My take from the DeepSeek R1 paper: it was trained on reasoning tasks where the outcome can be assessed without ambiguity (a correct math response, or code that compiles and produces the right output).

To me it is like SFT with perfect ground truth.
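A minimal sketch of what such an unambiguous outcome looks like, with illustrative reward functions (not DeepSeek's actual code): exact match on the final boxed math answer, and generated code that must run and pass its tests.

```python
import re
import subprocess
import sys
import tempfile

def math_reward(model_output: str, ground_truth: str) -> float:
    # Exact match on the final \boxed{...} answer: either right or wrong, no judge model.
    matches = re.findall(r"\\boxed\{([^}]*)\}", model_output)
    return 1.0 if matches and matches[-1].strip() == ground_truth.strip() else 0.0

def code_reward(generated_code: str, test_code: str) -> float:
    # The generated code must run and pass the given tests: compiles and produces the right output.
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(generated_code + "\n" + test_code)
        path = f.name
    result = subprocess.run([sys.executable, path], capture_output=True, timeout=10)
    return 1.0 if result.returncode == 0 else 0.0

print(math_reward(r"... hence the answer is \boxed{42}", "42"))                   # 1.0
print(code_reward("def add(a, b):\n    return a + b", "assert add(2, 3) == 5"))   # 1.0
```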

There are other key findings from that team ofc.
January 24, 2025 at 1:39 PM
Some European media are less ambiguous than that. Can't say for US media.

An American friend didn't know about this until I told him. It did not show up in his news feed (provided by Google). That is even worse IMHO, as if this were just business as usual.
Nazis: "that's a nazi salute"

Historians: "that's a nazi salute"

Average person: "that's a nazi salute"

The Media: "Elon Musk makes odd gesture throwing his heart to the crowd."
January 21, 2025 at 4:35 PM
Reposted by JF Puget
Nazis: "that's a nazi salute"

Historians: "that's a nazi salute"

Average person: "that's a nazi salute"

The Media: "Elon Musk makes odd gesture throwing his heart to the crowd."
January 21, 2025 at 1:03 AM
Who's surprised?

When will people get that this happens? And even if not shared intentionally, as soon as you call an OAI API, OAI has access to what you send it.

OAI is not special here; any LLM API provider does the same.

Unless you have a private instance of it.
January 19, 2025 at 11:51 AM
Reposted by JF Puget
Just sought to replicate this and it’s like halfway fixed but still wrong🙄
January 17, 2025 at 1:55 PM
My take on what's going on at OpenAI: I think they have reached a point where o3, or whatever they call it, is self-improving autonomously.

Does it mean it is AGI or ASI? Certainly not.

AlphaGo was self-improving, for instance. It is not an AGI either.
January 17, 2025 at 12:52 PM
NVIDIA’s Academic Grant Program is accepting proposals to accelerate data processing, graph analytics, graph neural networks, operational research, route optimization, and predictive modeling for scientific research using NVIDIA technology.

Deadline to apply is March 31: nvda.ws/3ZNxzuW

1/2
January 13, 2025 at 5:15 PM
Reposted by JF Puget
Facebook is censoring 404 Media stories about Facebook's censorship

🔗 www.404media.co/facebook-is-...
January 8, 2025 at 4:03 PM
Reposted by JF Puget
I believe Nvidia is releasing DIGITS to accelerate Grace CPU adoption. It is a very smart move by Nvidia.
NVIDIA Project DIGITS With New GB10 Superchip Debuts as World’s Smallest AI Supercomputer Capable of Running 200B-Parameter Models

nvidianews.nvidia.com/news/nvidia-...
January 8, 2025 at 5:45 AM
Reposted by JF Puget
The European Fact-Checking Standards Network responds to Meta slashing fact-checking:

“Fact-checking is not censorship, far from that, fact-checking adds speech to public debates, it provides context and facts for every citizen to make up their own mind”

Full statement ⬇️

efcsn.com/news/2025-01...
EFCSN disappointed by end to Meta’s Third Party Fact-Checking Program in the US; Condemns statements linking fact-checking to censorship – European Fact-Checking Standards Network (EFCSN)
7 January 2025 – The European Fact-Checking Standards Network (EFCSN) is disappointed by Meta’s decision to end its Third Party...
efcsn.com
January 7, 2025 at 8:29 PM
So, we moved from semi-sentient LLMs to singularity LLMs...

This without any definition, nor any hint about how the claim could be checked independently.

I predict that we'll have many of these throughout the next 10 years. I say 10, but it could be way more.
January 5, 2025 at 11:58 PM
Bonne année! Happy new year!

I hope it will be better than 2024 for the planet.
January 1, 2025 at 12:44 PM
One thing not discussed much regarding o3 results on @arcprize: the semi-private test set has been flowing through LLM APIs for a while. For instance, the guy who got a high score by generating code with GPT-4o: using their API leaks the data to the LLM API providers.
1/2
December 26, 2024 at 1:24 PM
Reposted by JF Puget
Some of my thoughts on OpenAI's o3 and the ARC-AGI benchmark

aiguide.substack.com/p/did-openai...
Did OpenAI Just Solve Abstract Reasoning?
OpenAI’s o3 model aces the "Abstraction and Reasoning Corpus" — but what does it mean?
aiguide.substack.com
December 23, 2024 at 2:38 PM
I should not laugh at this, being an NVIDIA employee. But I did.
December 24, 2024 at 2:23 PM
I am expressing some doubts about how optimistic the o3 results are, but don't get me wrong. I do think o3 is integrating something (some tree search, if you ask me) that makes it solve tasks that require some reasoning for humans. This is significant progress over previous systems.
December 22, 2024 at 8:12 PM
Reposted by JF Puget
I'll get straight to the point.

We trained 2 new models. Like BERT, but modern. ModernBERT.

Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.

It's much faster, more accurate, longer context, and more useful. 🧵
December 19, 2024 at 4:45 PM
I am amazed by the number of people who attribute the invention of MCTS to Google DeepMind's AlphaZero.

MCTS was invented by Rémi Coulom in 2006. UCT was invented at about the same time.

Coulom's 2006 paper: www.remi-coulom.fr/CG2006/CG200...

How can people be so ignorant?
www.remi-coulom.fr
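For those who have never looked at it, here is a minimal UCT-style MCTS sketch on a toy Nim game (take 1-3 stones, last stone wins). It is illustrative only, not Coulom's exact formulation: selection with UCB1, expansion, random playout, backpropagation.

```python
import math
import random

class Node:
    def __init__(self, state, to_move, parent=None, move=None):
        self.state, self.to_move = state, to_move   # stones left, player to move (0 or 1)
        self.parent, self.move = parent, move
        self.children, self.visits, self.wins = [], 0, 0.0

    def untried_moves(self):
        tried = {c.move for c in self.children}
        return [m for m in (1, 2, 3) if m <= self.state and m not in tried]

def ucb1(child, parent_visits, c=1.41):
    # Exploitation (win rate) plus exploration bonus.
    return child.wins / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(state, to_move):
    # Random playout: the player who takes the last stone wins.
    while state > 0:
        state -= random.randint(1, min(3, state))
        to_move = 1 - to_move
    return 1 - to_move

def mcts(root_state, to_move, iterations=5000):
    root = Node(root_state, to_move)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend with UCB1 while the node is fully expanded.
        while not node.untried_moves() and node.children:
            node = max(node.children, key=lambda c: ucb1(c, node.visits))
        # 2. Expansion: add one untried child, if any.
        moves = node.untried_moves()
        if moves:
            m = random.choice(moves)
            node = Node(node.state - m, 1 - node.to_move, parent=node, move=m)
            node.parent.children.append(node)
        # 3. Simulation: random playout from the new node (or read off the terminal result).
        winner = rollout(node.state, node.to_move) if node.state > 0 else 1 - node.to_move
        # 4. Backpropagation: stats are from the viewpoint of the player who just moved.
        while node:
            node.visits += 1
            node.wins += 1.0 if winner != node.to_move else 0.0
            node = node.parent
    return max(root.children, key=lambda c: c.visits).move

print(mcts(10, 0))   # with 10 stones, should print 2 (leaving 8, a losing position)
```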
December 18, 2024 at 2:25 PM