Samuel Müller
@sammuller.bsky.social
(Tab)PFNs, TrivialAugment etc.
Check out our position paper and come to our ICML poster (Thursday 4:30 PM, East Exhibition Hall A-B E-606).

arxiv.org/abs/2505.23947 n/n
Position: The Future of Bayesian Prediction Is Prior-Fitted
Training neural networks on randomly generated artificial datasets yields Bayesian models that capture the prior defined by the dataset-generating distribution. Prior-data Fitted Networks (PFNs) are a...
arxiv.org
July 8, 2025 at 8:03 PM
There are already early examples of this, which we discuss, in areas as diverse as biology, Bayesian optimization, time-series forecasting, and tabular data. The most prominent is TabPFN (Nature '25). 5/n

news.ycombinator.com/item?id=4264...
Show HN: TabPFN v2 – A SOTA foundation model for small tabular data | Hacker News
news.ycombinator.com
July 8, 2025 at 8:03 PM
We go into detailed comparisons with other Bayesian methods and the trade-offs that lead us to conclude that PFNs will become dominant for Bayesian prediction, and further that Bayesian prediction will become more important overall as priors improve. 4/n
July 8, 2025 at 8:03 PM
What's nice is that, after training on this random data, the model will start to make sense of real-world data, too. It approximates the posterior belonging to the prior of choice, e.g., a BNN, a GP, or, in the most interesting cases, a Bayesian model that doesn't exist yet. 3/n
July 8, 2025 at 8:03 PM
Prior-data fitted networks (PFNs) do just that!

The PFN idea is to take a prior, e.g. a Bayesian neural network (BNN) prior, sample datasets from it, and then train a network to predict the hold-out labels of those datasets (no training on real-world data). 2/n
July 8, 2025 at 8:03 PM
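The training recipe in that post can be sketched in a few lines. This is a minimal toy in PyTorch, assuming a tiny random-weight network as the "BNN-like" prior; all names (`sample_dataset_from_prior`, `TinyPFN`) are illustrative, not the actual TabPFN code, and real PFNs use much richer priors, a proper attention mask, and a larger transformer:

```python
import torch
import torch.nn as nn

N_FEAT, N_CTX, N_TEST = 5, 40, 10

def sample_dataset_from_prior():
    # Sample a random function (random network weights), then a dataset from it.
    x = torch.randn(N_CTX + N_TEST, N_FEAT)
    w1, w2 = torch.randn(N_FEAT, 16), torch.randn(16, 1)
    y = torch.tanh(x @ w1) @ w2 + 0.1 * torch.randn(N_CTX + N_TEST, 1)
    return x, y

class TinyPFN(nn.Module):
    # Encodes (x, y) pairs as tokens; hold-out tokens get y = 0 (masked out).
    def __init__(self, d=64):
        super().__init__()
        self.embed = nn.Linear(N_FEAT + 1, d)
        layer = nn.TransformerEncoderLayer(d_model=d, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, 1)

    def forward(self, x, y_ctx):
        y_in = torch.cat([y_ctx, torch.zeros(N_TEST, 1)])  # hide hold-out labels
        tokens = self.embed(torch.cat([x, y_in], dim=-1)).unsqueeze(0)
        out = self.encoder(tokens).squeeze(0)
        return self.head(out[N_CTX:])  # predictions for the hold-out points

model = TinyPFN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(50):  # never sees real data: a fresh synthetic dataset each step
    x, y = sample_dataset_from_prior()
    pred = model(x, y[:N_CTX])
    loss = nn.functional.mse_loss(pred, y[N_CTX:])
    opt.zero_grad(); loss.backward(); opt.step()
```

After this kind of meta-training, a single forward pass on a new dataset amounts to approximate posterior prediction under the chosen prior.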
To then change it? As in "overhaul"?
April 15, 2025 at 1:57 PM
Find my full write-up (including scenarios with bad actors, as well as the prompts used) plus the game here: github.com/SamuelGabrie...
If you think my single-person experiment is not to be trusted, you are right: try it yourself!
GitHub - SamuelGabriel/LMARENA-GAMING
Contribute to SamuelGabriel/LMARENA-GAMING development by creating an account on GitHub.
github.com
February 24, 2025 at 1:17 PM
Combined with the large employee counts at top AI labs and the small number of votes per model on lmarena, this leads me to the conclusion that lmarena scores are probably dominated by biased votes.
February 24, 2025 at 1:17 PM
In hard mode I attributed 13/20 completely correctly, far above the ~3.3 expected from random guessing.
That is, after practicing with 20 questions, I could identify all 3 models correctly in 13 of 20 cases.
That means attributing responses to LLMs is super easy for humans.
February 24, 2025 at 1:17 PM
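The ~3.3 baseline is easy to check, assuming hard mode means matching 3 responses to 3 model names each round and a random guesser picks a uniformly random assignment:

```python
from fractions import Fraction

# 3 responses matched to 3 model names: 3! = 6 possible assignments,
# only one of which is fully correct.
p_round_correct = Fraction(1, 6)
rounds = 20
expected_correct_rounds = rounds * p_round_correct
print(float(expected_correct_rounds))  # ≈ 3.33
```

13 fully correct rounds out of 20 is roughly four times that chance level.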
I first played easy mode (see below), where I got two answers from each model and needed to match them.
I used 20 interactions in the easy mode to learn the models' behaviors.
In hard mode (see prev post), you need to match three responses to the LLM name.
February 24, 2025 at 1:17 PM
Second, employees are very likely able to tell models apart based on their gut feeling.
To figure out if this is the case, I created a game with two modes.
The game is about identifying which answer was provided by which LLM.
February 24, 2025 at 1:17 PM
First, AI labs have enough employees to bias the benchmarks.
E.g., Grok 3 has only 10K votes, while there are 2.7M votes in total on lmarena.
If half of, say, OpenAI's ~2,000 employees voted just once a day, they would make up >10% of all 2.7M lmarena votes over its one-year existence.
February 24, 2025 at 1:17 PM
seems to beat boosting there, too, but prob a bit early to make definitive statements
February 7, 2025 at 8:43 AM
What did you think was interesting? The interview had such bad timing, a few days before the r1 launch
January 26, 2025 at 7:12 AM
We currently have an R implementation under development. See here: github.com/robintibor/R...
GitHub - robintibor/R-tabpfn
Contribute to robintibor/R-tabpfn development by creating an account on GitHub.
github.com
January 11, 2025 at 2:10 PM
Thank you :) So far, we only open-source the model itself and how to use it. We do not open-source exactly how to train it, sorry for that :| There is a company starting based on the model, so that is kinda its moat.
January 9, 2025 at 11:53 AM