tempest98.bsky.social
@tempest98.bsky.social
Working on a cache for PostgreSQL: https://github.com/tempest98/pgcache
I've already fixed a couple of bugs where the cache was behaving in unexpected ways and the failover to origin was masking the problems.

I'm really excited to have much more robust tests going forward :-)
November 13, 2025 at 2:47 AM
Now, with the metrics added, the counts for cache hits and misses can be checked at the end of a test, and the test will fail if the counts are not as expected.

I've added checks to a couple of tests so far, and I'd love to say no bugs were found.

On the other hand, the metrics are working as expected.
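
Roughly what one of those checks looks like, in simplified form (the metric names and plumbing here are illustrative, not the actual pgcache code):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical metrics struct; the real metrics plumbing differs in the details.
#[derive(Default)]
struct CacheMetrics {
    hits: AtomicU64,
    misses: AtomicU64,
}

impl CacheMetrics {
    fn record_hit(&self) {
        self.hits.fetch_add(1, Ordering::Relaxed);
    }
    fn record_miss(&self) {
        self.misses.fetch_add(1, Ordering::Relaxed);
    }
}

#[test]
fn repeated_select_is_served_from_cache() {
    let metrics = CacheMetrics::default();

    // ... run the same query twice through the cache, recording metrics ...
    metrics.record_miss(); // first execution populates the cache
    metrics.record_hit(); // second execution should be a hit

    // The test now fails if the hit/miss counts are wrong, instead of passing
    // silently because failover to origin still returned correct rows.
    assert_eq!(metrics.misses.load(Ordering::Relaxed), 1);
    assert_eq!(metrics.hits.load(Ordering::Relaxed), 1);
}
```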
November 13, 2025 at 2:47 AM
The way the cache is designed, if any error occurs while processing a cache hit, the cache fails over by forwarding the query to origin. So if a test expects a cache hit and it turns out to be a miss, there is no way to tell just by looking at the query results.
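
Schematically, the failover path is just this (simplified, with stand-in names rather than the real API):

```rust
use std::error::Error;

// If the cache path fails for any reason, fall back to origin. The client
// still gets correct rows either way, which is exactly why an unexpected
// miss can't be detected from the query results alone.
fn handle_query<R>(
    serve_from_cache: impl FnOnce() -> Result<R, Box<dyn Error>>,
    forward_to_origin: impl FnOnce() -> R,
) -> R {
    match serve_from_cache() {
        Ok(rows) => rows,
        Err(_) => forward_to_origin(),
    }
}
```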
November 13, 2025 at 2:47 AM
And the most recent work has been to add basic support for the extended query protocol. This ended up being tricky as there are multiple phases of the protocol that have to be coordinated between the cache and origin.
The current code is basic and can be improved, but it works :-)
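
For reference, these are the main frontend messages in the extended query protocol, paraphrased as a Rust enum (the real wire messages carry more fields, like parameter type OIDs and format codes). Each one gets its own response from the server, and it's those per-phase responses that have to be coordinated between the cache and origin:

```rust
// Simplified view of the extended query protocol's frontend messages,
// roughly in the order a client sends them.
enum ExtendedQueryMessage {
    /// Parse a SQL string into a named (or unnamed) prepared statement.
    Parse { statement_name: String, query: String },
    /// Bind parameter values to a prepared statement, producing a portal.
    Bind { portal: String, statement_name: String, parameters: Vec<Option<Vec<u8>>> },
    /// Ask for a description of a portal's result columns.
    Describe { portal: String },
    /// Run a portal, optionally limiting the number of rows returned.
    Execute { portal: String, max_rows: i32 },
    /// End of the batch; the server replies with ReadyForQuery.
    Sync,
}
```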
November 5, 2025 at 10:29 PM
That set of changes eliminated almost the entire overhead of the proxy. To eliminate the rest I'm going to have to write my own postgres network protocol handling instead of relying on rust-postgres.
November 5, 2025 at 10:29 PM
After that, I did some performance optimization. The main culprit was that the naive code I initially wrote sent data back to the client for every row, which got expensive when handling hundreds of thousands of rows. Now the code accumulates rows into a 64 kB buffer and writes that out when it is full.
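
The buffering change is conceptually simple, something along these lines (the real code deals with protocol framing rather than plain byte blobs):

```rust
use std::io::{self, Write};

const FLUSH_THRESHOLD: usize = 64 * 1024; // 64 kB

// Accumulate encoded rows and only write to the client once the buffer is
// full, plus a final flush for whatever is left over.
fn stream_rows(mut client: impl Write, rows: impl Iterator<Item = Vec<u8>>) -> io::Result<()> {
    let mut buf: Vec<u8> = Vec::with_capacity(FLUSH_THRESHOLD);

    for row in rows {
        buf.extend_from_slice(&row);
        if buf.len() >= FLUSH_THRESHOLD {
            client.write_all(&buf)?;
            buf.clear();
        }
    }

    if !buf.is_empty() {
        client.write_all(&buf)?;
    }
    client.flush()
}
```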
November 5, 2025 at 10:29 PM
Next big piece of work was adding a resolution step to handle table and column aliases. The output is a ResolvedAST structure where all the naming is explicit. This enabled adding constant and constraint propagation to make cache invalidation more precise.
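
To give a flavor of what "explicit naming" means here, a toy version (not the actual ResolvedAST definition):

```rust
// Illustrative only: every column reference carries its resolved table and
// column, with query aliases already translated away.
struct ResolvedColumn {
    table: String,
    column: String,
}

enum ResolvedExpr {
    Column(ResolvedColumn),
    Constant(String),
    Eq(Box<ResolvedExpr>, Box<ResolvedExpr>),
}

// With naming explicit, "u.id = 5" and "users.id = 5" normalize to the same
// ResolvedExpr, which is what makes constant and constraint propagation (and
// so more precise invalidation) tractable.
```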
November 5, 2025 at 10:29 PM
Trying to embrace type-driven design, I added a CacheableQuery struct for, well, queries that can be cached. It helped centralize the logic for deciding what can be cached and allowed a bunch of code to be simplified. I'm generally happy with the outcome.
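
The gist of it, heavily simplified (the real cacheability checks are far more involved; the string matching below is only a placeholder):

```rust
// A CacheableQuery can only be constructed by passing the checks in try_new,
// so downstream code can accept it by type instead of re-validating.
struct CacheableQuery {
    sql: String,
}

#[derive(Debug)]
enum NotCacheable {
    NotASelect,
    UsesVolatileFunction,
}

impl CacheableQuery {
    fn try_new(sql: &str) -> Result<Self, NotCacheable> {
        let lowered = sql.trim_start().to_ascii_lowercase();
        if !lowered.starts_with("select") {
            return Err(NotCacheable::NotASelect);
        }
        if lowered.contains("random(") || lowered.contains("now(") {
            return Err(NotCacheable::UsesVolatileFunction);
        }
        Ok(Self { sql: sql.to_owned() })
    }
}
```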
November 5, 2025 at 10:29 PM
I also moved the creation of the cache thread into the proxy thread, which enables the proxy thread to recreate the cache if a failure is detected.
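
The supervision loop itself is small; roughly this shape (run_cache stands in for the real cache entry point):

```rust
use std::thread;

// Stand-in for the real cache event loop.
fn run_cache() {
    // ... serve cache requests until shutdown or failure ...
}

fn proxy_thread() {
    loop {
        let cache = thread::spawn(run_cache);

        // join() returns Err if the cache thread panicked; the proxy then
        // recreates the cache rather than giving up.
        match cache.join() {
            Ok(()) => break, // clean shutdown
            Err(_) => {
                eprintln!("cache thread failed, recreating");
                continue;
            }
        }
    }
}
```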
November 5, 2025 at 10:29 PM
Did a bunch of refactoring and cleanup after that, including working on tests so that I can run multiple integration tests in parallel. Key to that was randomizing port assignments so that the multiple tests wouldn't conflict with each other.

The pgtemp crate has been super useful for this.
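
The port randomization is the usual trick of letting the OS hand out a free port, roughly like this (simplified; the test harness then passes the chosen port to the proxy under test):

```rust
use std::net::TcpListener;

// Bind to port 0 so the OS assigns a free port; reading it back gives the
// test a unique address that won't collide with other tests running in
// parallel.
fn random_listen_addr() -> (TcpListener, u16) {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind to an OS-assigned port");
    let port = listener.local_addr().expect("local addr").port();
    (listener, port)
}
```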
November 5, 2025 at 10:29 PM
Oh hey, I actually finished the article. PG 18 came out while I was working on the article, so I retested with 18. The #postgres folks added another optimization from the original article, support for removing unneeded self joins.

medium.com/@tempest_619...
PostgreSQL query optimizations
I came across this article a while ago that looks at a number of query optimizations databases can make to optimize queries before…
November 5, 2025 at 9:39 PM
After accepting that fact, I got back to work, but haven't been posting about it. So now I have a bunch of posts to catch up on.
October 18, 2025 at 5:49 PM
Another one is being able to rename columns when using a table alias, e.g. "... FROM users AS blah(foo, bar) ..." will rename the first two columns of table users.
I've never used it and don't recall having come across it before.
August 28, 2025 at 12:54 AM
I've done this 3 or 4 times, and it has been nice. Previously, I might have done all the work in a single commit or put off the refactoring until the main work was done, or forever :-)

The simplicity of #jujutsu makes it easy to try out different workflows.
August 13, 2025 at 11:44 PM
The workflow I'm using is, whenever I decide to refactor something, to create a revision before the main one with jj new -B @, possibly squash some work I've already done into that revision, and then complete the changes. After returning to the main revision, there may be some conflicts to resolve.
August 13, 2025 at 11:44 PM
What I'm enjoying is how #jujutsu makes it easy to keep the main revision focused on the join support and to create separate revisions for refactors that are useful but not directly related to the main revision.
August 13, 2025 at 11:44 PM