Ben
@benjdd.com
250 followers 55 following 220 posts
databases @planetscale.com
Pinned
Ben @benjdd.com · Mar 13
I implemented an SSD, HDD, and tape device.

In javascript.

For a blog.
He's back on the butterfly.
My half-finished "Transactions" article has been sitting dormant for a while.

Time to bring it back?

Started this long before PlanetScale Postgres. Would be a fun opportunity to compare the MVCC models of Postgres and MySQL.
The two generals problem is actually proven to be UNSOLVABLE. Silly as it may seem, the challenge is similar for servers coordinating over TCP in a data center.

This is one of the main jobs of a distributed systems engineer: designing reliable systems in the face of unreliability.
Generals A and B are in separate camps and need to coordinate a time of attack. If A sends the message "attack at dawn!" to B and never gets a reply, A doesn't know what happened! The messenger may not have made it to B, or B may have received the message and the messenger was captured on the return journey.
Any component can fail at any time in distributed systems.

How do you communicate in the face of unreliability? It's not easy!

This is classically known as the "two generals problem" and is a great way to think about communicating over unreliable channels.
Biggest surprise from my Postgres 17 vs 18 benchmarks?

io_uring was frequently the worst performer!

It was slower than both `sync` and `worker` in many cases. Tomas Vondra has a wonderful article explaining why it isn't always the best choice.

vondra.me/posts/tuning...
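For context, the knob in question is Postgres 18's io_method setting. A minimal sketch of checking and changing it (changing requires a server restart):

```sql
SHOW io_method;  -- 'worker' is the Postgres 18 default

-- Switch the AIO implementation; only takes effect after a restart:
ALTER SYSTEM SET io_method = 'sync';
```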
I benchmarked 96 combinations of Postgres 17 and 18.

There are a few surprising results, but overall Postgres 18 has some nice improvements.

Read all about it at the link below.
Which to use? Hate to say it, but... it depends on context, and sometimes replication should be paired with tools like pg_dump.
Logical replication operates at the row level and works across all major Pg versions, 10+.

Though it's more flexible, it isn't a silver bullet for migrations. Logical replication does not replicate DDL or sequence values, and for large databases it can be slow and cause performance issues.
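A minimal pub/sub sketch; the names and connection string are illustrative:

```sql
-- On the source (publisher):
CREATE PUBLICATION app_pub FOR ALL TABLES;

-- On the target (subscriber), after creating the schema yourself,
-- since DDL isn't replicated:
CREATE SUBSCRIPTION app_sub
  CONNECTION 'host=source-db dbname=app user=repl'
  PUBLICATION app_pub;
```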
This makes it useful for HA primary-replica setups, but unrealistic for version upgrades and migrations to different platforms.
Physical replication ships the full WAL to replicas, producing byte-level copies of the primary node. This means CPU overhead is low on the primary, but it's ONLY compatible between the same Pg major versions (plus some extension and glibc versions must match) and there's no cross-architecture replication!
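A minimal sketch of the setup; the slot name standby_1 is illustrative:

```sql
-- On the primary: reserve WAL with a physical replication slot.
SELECT pg_create_physical_replication_slot('standby_1');

-- After the standby connects (its primary_conninfo is typically
-- written for you by pg_basebackup -R), confirm it's streaming:
SELECT client_addr, state, sent_lsn, replay_lsn
FROM pg_stat_replication;
```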
Two ways to replicate in Postgres: Physical and Logical.

Both great features with different tradeoffs.
Ben @benjdd.com · 10d
When we need the full count for a video, we sum the counts across all rows with that v_id. This can be done ad hoc or on a regular interval and stored in the video table.
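A sketch of the roll-up, using the hypothetical video_view_counts table defined under the next post below:

```sql
-- Ad hoc total for one video:
SELECT sum(cnt) AS views
FROM video_view_counts
WHERE v_id = 42;

-- Or fold the slots into the video table on an interval:
UPDATE videos v
SET view_count = (
  SELECT sum(cnt) FROM video_view_counts c WHERE c.v_id = v.id
)
WHERE v.id = 42;
```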
Ben @benjdd.com · 10d
The fix? Spread the counting out across N rows in a dedicated counter table. In the example below N=3, but you can make N=100 or N=1000 depending on the load.

Every increment chooses one of the N rows randomly to increment.
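A minimal sketch; the table and column names are illustrative:

```sql
CREATE TABLE video_view_counts (
  v_id bigint NOT NULL,
  slot int    NOT NULL,          -- 0 .. N-1
  cnt  bigint NOT NULL DEFAULT 0,
  PRIMARY KEY (v_id, slot)
);

-- Each view bumps one of N=3 slots, chosen at random:
INSERT INTO video_view_counts (v_id, slot, cnt)
VALUES (42, floor(random() * 3)::int, 1)
ON CONFLICT (v_id, slot) DO UPDATE
  SET cnt = video_view_counts.cnt + 1;
```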
Ben @benjdd.com · 10d
Consider a website like YouTube, where we need to track view counts for every video. It's convenient to store this as a column in our video table, but for popular videos there will be lots of queries simultaneously attempting increments on the same row!

Contention == locking == slow!
Ben @benjdd.com · 10d
Have a row contention problem? Try slotted counters!

This is a neat technique for spreading out increments across many rows to reduce contention.
Ben @benjdd.com · 11d
I love benchmarks like this.

Visualizing performance improvements over time, validating that Postgres just keeps getting better (mostly).

For all its flaws, it's great to see that the core team is dedicated to continual improvement.
Ben @benjdd.com · 12d
It's just their design choice! Postgres favors this design for the sake of MVCC and fast linear writes at the cost of having space reclamation happen as separate jobs / processes.

MySQL does things very differently, with clustered table indexes and the undo log.
Ben @benjdd.com · 12d
Relying solely on autovacuum can lead to fragmented data layout. It can still be useful to do semi-regular table rebuilds either with VACUUM FULL or with extensions like pg_squeeze and pg_repack.

Vacuuming is also needed to avoid xid wraparound and to update the table stats used by the query planner.
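For example, against a hypothetical events table (note VACUUM FULL takes an exclusive lock, so schedule rebuilds off-peak):

```sql
-- Rewrite the table into compact storage; blocks reads and writes:
VACUUM FULL events;

-- Refresh planner statistics without rewriting anything:
ANALYZE events;
```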
Ben @benjdd.com · 12d
Autovacuum is a background job that scans tables to determine which rows have been removed and reclaims this space. It doesn't rearrange or compact data, but makes the slots available for re-use by future inserts and updates.
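Autovacuum can also be made more aggressive per table for hot workloads; a sketch with illustrative numbers on a hypothetical events table:

```sql
-- Trigger a vacuum after ~2% of rows are dead (default is 20%):
ALTER TABLE events SET (
  autovacuum_vacuum_scale_factor = 0.02,
  autovacuum_vacuum_threshold    = 1000
);
```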
Ben @benjdd.com · 12d
Why does Postgres need autovacuum?

With all vacuuming disabled, Postgres never physically removes rows. When data is deleted, the rows are marked dead by setting their xmax (the id of the transaction that deleted them), but the space is not reclaimed. Without intervention, tables would grow indefinitely!
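You can watch this happen; a sketch against a hypothetical events table:

```sql
DELETE FROM events WHERE created_at < now() - interval '90 days';

-- The deleted rows are still on disk, just marked dead:
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'events';
```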