Tivadar Danka
@tivadardanka.bsky.social
I make math accessible for everyone. Mathematician with an INTJ personality. Chaotic good. Writing https://thepalindrome.org
Computationally speaking, decomposing models into graphs is the idea that fuels backpropagation.

My neural networks series is dedicated to explaining exactly what's going on in the picture above.

Here's everything you need to know: thepalindrome.org/p/introduct...
Introduction to Computational Graphs
Neural Networks From Scratch, Part I
thepalindrome.org
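The graph image attached to the post didn't survive the scrape; here is a minimal, hypothetical sketch of the idea in Python, decomposing f(x) = sin(x²) into graph nodes so the chain rule can run backward through them. All names here are illustrative, not from the original post.

```python
import math

# Minimal computational-graph sketch (illustrative, not the post's code):
# each node stores the local derivatives toward its parents, and
# backpropagation multiplies them along the graph via the chain rule.

class Node:
    def __init__(self, value, parents=(), local_grads=()):
        self.value = value              # forward-pass result
        self.parents = parents          # upstream nodes
        self.local_grads = local_grads  # d(this)/d(parent) for each parent
        self.grad = 0.0                 # accumulated derivative of the output

    def backward(self, upstream=1.0):
        self.grad += upstream
        for parent, local in zip(self.parents, self.local_grads):
            parent.backward(upstream * local)

# Decompose f(x) = sin(x^2) into two nodes: square, then sine.
x = Node(2.0)
sq = Node(x.value ** 2, parents=(x,), local_grads=(2 * x.value,))
f = Node(math.sin(sq.value), parents=(sq,), local_grads=(math.cos(sq.value),))

f.backward()   # chain rule: df/dx = cos(x^2) * 2x
print(x.grad)  # ≈ -2.615 at x = 2
```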
November 24, 2025 at 1:33 PM
If you liked this post, you will love The Palindrome, my weekly newsletter on Mathematics and Machine Learning.

Join 35,000+ curious readers here: thepalindrome.org/
November 23, 2025 at 1:32 PM
Is the Bayesian viewpoint better than the frequentist one?

No. It's just different. In certain situations, frequentist estimates are perfectly sufficient. In others, Bayesian methods have the advantage. Use the right tool for the task, and don't worry about the rest.
November 23, 2025 at 1:32 PM
To sum up: as a mathematical concept, probability is independent of interpretation. The question of frequentist vs. Bayesian comes up when we are building probabilistic models from data.
November 23, 2025 at 1:32 PM
After this, we get a concrete formula for the posterior density.

(The symbol ∝ reads as “proportional to”, and we write this instead of equality because of the omitted denominator.)
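The formula image from the post didn't survive the scrape; reconstructed in my notation (θ for the probability of heads, X for the observed tosses), it reads:

```latex
p(\theta \mid X) \propto P(X \mid \theta)\, p(\theta)
```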
November 23, 2025 at 1:32 PM
Back to our coin-tossing example. Given the probability of heads, the likelihood can be computed using simple combinatorics.
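Concretely, for k heads observed in n tosses, the likelihood is the binomial probability (notation mine):

```latex
P(X \mid \theta) = \binom{n}{k}\, \theta^{k} (1 - \theta)^{n-k}
```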
November 23, 2025 at 1:32 PM
Bad news: the evidence can be impossible to evaluate. Good news: we don’t have to! We find the parameter estimate by maximizing the posterior, and as the evidence doesn’t depend on the parameter at all, we can simply omit it.
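A sketch of this step in the same notation:

```latex
\hat{\theta}
  = \arg\max_{\theta} \frac{P(X \mid \theta)\, p(\theta)}{P(X)}
  = \arg\max_{\theta} P(X \mid \theta)\, p(\theta)
```

The last equality drops the evidence P(X) precisely because it is constant in θ.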
November 23, 2025 at 1:32 PM
In plain English,

• the likelihood describes the probability of the observation given the model parameter,
• the prior describes our assumptions about the parameter before the observation,
• and the evidence is the total probability of our observation.
November 23, 2025 at 1:32 PM
Don't worry if this seems complex! We'll unravel it term by term.

There are three terms on the right side: the likelihood, the prior, and the evidence.
November 23, 2025 at 1:32 PM
The Bayes formula connects the prior and the likelihood to the posterior.
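Reconstructed in formula form (the original image is lost; θ is the parameter, X the observation):

```latex
\underbrace{p(\theta \mid X)}_{\text{posterior}}
  = \frac{\overbrace{P(X \mid \theta)}^{\text{likelihood}}
          \; \overbrace{p(\theta)}^{\text{prior}}}
         {\underbrace{P(X)}_{\text{evidence}}}
```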
November 23, 2025 at 1:32 PM
What we want is to incorporate the experimental observations into our estimation, and this is expressed in terms of conditional probabilities.

This is called posterior estimation.
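In symbols, the object we are after is the conditional density of the parameter given the observations (notation mine):

```latex
p(\theta \mid X)
```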
November 23, 2025 at 1:32 PM
Our prior assumption about the probability is called, well, the prior.

For instance, if we know absolutely nothing about our coin, we assume the prior to be uniform.
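For the coin, "knowing nothing" translates to the uniform density on [0, 1]:

```latex
p(\theta) = 1, \qquad \theta \in [0, 1]
```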
November 23, 2025 at 1:32 PM
In Bayesian statistics, we treat our probability-to-be-estimated as a random variable. Thus, we are working with probability distributions or densities.

Yes, I know. The probability of probability. It’s kind of an Inception-moment, but you’ll get used to it.
November 23, 2025 at 1:32 PM
Let's stick to our coin-tossing example to show how this works in practice. Regardless of the actual probabilities, 90 heads from 100 tosses is a possible outcome in (almost) every case.

Is the coin biased, or were we just lucky? How can we tell?
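A quick sanity check of that claim (my own computation, not from the thread): even a perfectly fair coin produces 90 heads out of 100 with positive, though tiny, probability.

```python
from math import comb

# Probability that a fair coin gives exactly 90 heads in 100 tosses.
n, k = 100, 90
p_fair = comb(n, k) * 0.5 ** n
print(f"{p_fair:.2e}")  # ≈ 1.37e-17: possible, but astronomically unlikely
```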
November 23, 2025 at 1:32 PM
Conditional probabilities allow us to update our probabilistic model in light of new information. The update rule is called the Bayes formula, hence the terminology "Bayesian statistics".

Again, this is a mathematically provable fact, not an interpretation.
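For two events A and B with P(B) > 0, the formula reads:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```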
November 23, 2025 at 1:32 PM
With conditional probabilities, we can quantify our intuition about the relation between rain and the clouds in the sky.
November 23, 2025 at 1:32 PM
In probabilistic models, observing certain events can influence our beliefs about others. For instance, if the sky is clear, the probability of rain goes down. If it’s cloudy, the same probability goes up.

This is expressed in terms of conditional probabilities.
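Unpacking that with the definition of conditional probability, using the rain example:

```latex
P(\text{rain} \mid \text{cloudy})
  = \frac{P(\text{rain} \cap \text{cloudy})}{P(\text{cloudy})}
```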
November 23, 2025 at 1:32 PM
On the other hand, the Bayesian school argues that such estimations are wrong, because probabilities are not absolute, but a measure of our current beliefs.

This is way too abstract, so let's elaborate.
November 23, 2025 at 1:32 PM
Frequentists leverage this to build probabilistic models. For example, if we toss a coin n times and heads come up exactly k times, then the probability of heads is estimated to be k/n.
November 23, 2025 at 1:32 PM
As the number of observations grows, the relative frequency will converge to the true probability.

This is not an interpretation of probability. This is a mathematically provable fact, independent of interpretations. (A special case of the famous Law of Large Numbers.)
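A minimal simulation of this convergence (my own sketch; the true heads probability of 0.7 is an arbitrary illustrative choice):

```python
import random

random.seed(42)
true_p = 0.7          # assumed "true" probability of heads (illustrative)
heads, tosses = 0, 0

for checkpoint in (10, 100, 1_000, 10_000, 100_000):
    while tosses < checkpoint:
        heads += random.random() < true_p  # True counts as 1
        tosses += 1
    print(f"n = {tosses:>6}: relative frequency = {heads / tosses:.4f}")

# The printed frequencies settle near 0.7 as n grows,
# just as the Law of Large Numbers predicts.
```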
November 23, 2025 at 1:32 PM
Suppose that we repeatedly perform a single experiment, counting the number of occurrences of the possible events. Say we toss a coin and count the number of times it turns up heads.

The ratio of heads to tosses is called "the relative frequency of heads".
November 23, 2025 at 1:32 PM
Now comes the part that has been fueling debates for decades.

How can we assign probabilities? There are (at least) two schools of thought, constantly in conflict with each other.

Let's start with the frequentist school.
November 23, 2025 at 1:32 PM
Note that at this point, there is no frequentist or Bayesian interpretation yet!

Probability is a well-defined mathematical object. This concept is separate from how probabilities are assigned.
November 23, 2025 at 1:32 PM
2. Throwing darts. Suppose that we are throwing darts at a large wall in front of us, which is our event space. (We'll always hit the wall.)

If we throw the dart randomly, the probability of hitting a certain shape is proportional to the shape's area.
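In formula form, for a shape S drawn on the wall:

```latex
P(\text{hitting } S) = \frac{\operatorname{area}(S)}{\operatorname{area}(\text{wall})}
```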
November 23, 2025 at 1:32 PM