tempest98.bsky.social
@tempest98.bsky.social
Working on a cache for PostgreSQL: https://github.com/tempest98/pgcache
I've already fixed a couple of bugs where the cache was behaving in unexpected ways and the failover to origin was masking the problems.

I'm really excited to have much more robust tests going forward :-)
November 13, 2025 at 2:47 AM
Now, with the metrics added, the counts for cache hits and misses can be checked at the end of a test, and the test will fail if the counts are not as expected.

I've added checks to a couple of tests so far, and I'd love to say no bugs were found.

On the other hand, the metrics are working as expected.
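
Roughly what one of those checks looks like, in simplified form (the metric names and plumbing here are illustrative, not the actual pgcache code):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical metrics struct; the real metrics plumbing differs in the details.
#[derive(Default)]
struct CacheMetrics {
    hits: AtomicU64,
    misses: AtomicU64,
}

impl CacheMetrics {
    fn record_hit(&self) {
        self.hits.fetch_add(1, Ordering::Relaxed);
    }
    fn record_miss(&self) {
        self.misses.fetch_add(1, Ordering::Relaxed);
    }
}

#[test]
fn repeated_select_is_served_from_cache() {
    let metrics = CacheMetrics::default();

    // ... run the same query twice through the cache, recording metrics ...
    metrics.record_miss(); // first execution populates the cache
    metrics.record_hit(); // second execution should be a hit

    // The test now fails if the hit/miss counts are wrong, instead of passing
    // silently because failover to origin still returned correct rows.
    assert_eq!(metrics.misses.load(Ordering::Relaxed), 1);
    assert_eq!(metrics.hits.load(Ordering::Relaxed), 1);
}
```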
November 13, 2025 at 2:47 AM
The way the cache is designed, if any error occurs while processing a cache hit, the cache fails over by forwarding the query to origin. So if a test expects a cache hit and it turns out to be a miss, there is no way to tell just by looking at the query results.
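
Schematically, the failover path is just this (simplified, with stand-in names rather than the real API):

```rust
use std::error::Error;

// If the cache path fails for any reason, fall back to origin. The client
// still gets correct rows either way, which is exactly why an unexpected
// miss can't be detected from the query results alone.
fn handle_query<R>(
    serve_from_cache: impl FnOnce() -> Result<R, Box<dyn Error>>,
    forward_to_origin: impl FnOnce() -> R,
) -> R {
    match serve_from_cache() {
        Ok(rows) => rows,
        Err(_) => forward_to_origin(),
    }
}
```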
November 13, 2025 at 2:47 AM
And the most recent work has been to add basic support for the extended query protocol. This ended up being tricky as there are multiple phases of the protocol that have to be coordinated between the cache and origin.
The current code is basic and can be improved, but it works :-)
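
For reference, these are the main frontend messages in the extended query protocol, paraphrased as a Rust enum (the real wire messages carry more fields, like parameter type OIDs and format codes). Each one gets its own response from the server, and it's those per-phase responses that have to be coordinated between the cache and origin:

```rust
// Simplified view of the extended query protocol's frontend messages,
// roughly in the order a client sends them.
enum ExtendedQueryMessage {
    /// Parse a SQL string into a named (or unnamed) prepared statement.
    Parse { statement_name: String, query: String },
    /// Bind parameter values to a prepared statement, producing a portal.
    Bind { portal: String, statement_name: String, parameters: Vec<Option<Vec<u8>>> },
    /// Ask for a description of a portal's result columns.
    Describe { portal: String },
    /// Run a portal, optionally limiting the number of rows returned.
    Execute { portal: String, max_rows: i32 },
    /// End of the batch; the server replies with ReadyForQuery.
    Sync,
}
```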
November 5, 2025 at 10:29 PM
That set of changes eliminated almost the entire overhead of the proxy. To eliminate the rest I'm going to have to write my own postgres network protocol handling instead of relying on rust-postgres.
November 5, 2025 at 10:29 PM
After that, I did some performance optimization. The main culprit was that the naive code I initially wrote sent data back to the client for every row, which got expensive when handling hundreds of thousands of rows. Now the code accumulates rows into a 64 kB buffer and writes that out when it is full.
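
The buffering change is conceptually simple, something along these lines (the real code deals with protocol framing rather than plain byte blobs):

```rust
use std::io::{self, Write};

const FLUSH_THRESHOLD: usize = 64 * 1024; // 64 kB

// Accumulate encoded rows and only write to the client once the buffer is
// full, plus a final flush for whatever is left over.
fn stream_rows(mut client: impl Write, rows: impl Iterator<Item = Vec<u8>>) -> io::Result<()> {
    let mut buf: Vec<u8> = Vec::with_capacity(FLUSH_THRESHOLD);

    for row in rows {
        buf.extend_from_slice(&row);
        if buf.len() >= FLUSH_THRESHOLD {
            client.write_all(&buf)?;
            buf.clear();
        }
    }

    if !buf.is_empty() {
        client.write_all(&buf)?;
    }
    client.flush()
}
```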
November 5, 2025 at 10:29 PM
Next big piece of work was adding a resolution step to handle table and column aliases. The output is a ResolvedAST structure where all the naming is explicit. This enabled adding constant and constraint propagation to make cache invalidation more precise.
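
To give a flavor of what "explicit naming" means here, a toy version (not the actual ResolvedAST definition):

```rust
// Illustrative only: every column reference carries its resolved table and
// column, with query aliases already translated away.
struct ResolvedColumn {
    table: String,
    column: String,
}

enum ResolvedExpr {
    Column(ResolvedColumn),
    Constant(String),
    Eq(Box<ResolvedExpr>, Box<ResolvedExpr>),
}

// With naming explicit, "u.id = 5" and "users.id = 5" normalize to the same
// ResolvedExpr, which is what makes constant and constraint propagation (and
// so more precise invalidation) tractable.
```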
November 5, 2025 at 10:29 PM
Trying to embrace type-driven design, I added a CacheableQuery struct for, well, queries that can be cached. It helped centralize the logic for deciding what can be cached and allowed a bunch of code to be simplified. I'm generally happy with the outcome.
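
The gist of it, heavily simplified (the real cacheability checks are far more involved; the string matching below is only a placeholder):

```rust
// A CacheableQuery can only be constructed by passing the checks in try_new,
// so downstream code can accept it by type instead of re-validating.
struct CacheableQuery {
    sql: String,
}

#[derive(Debug)]
enum NotCacheable {
    NotASelect,
    UsesVolatileFunction,
}

impl CacheableQuery {
    fn try_new(sql: &str) -> Result<Self, NotCacheable> {
        let lowered = sql.trim_start().to_ascii_lowercase();
        if !lowered.starts_with("select") {
            return Err(NotCacheable::NotASelect);
        }
        if lowered.contains("random(") || lowered.contains("now(") {
            return Err(NotCacheable::UsesVolatileFunction);
        }
        Ok(Self { sql: sql.to_owned() })
    }
}
```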
November 5, 2025 at 10:29 PM
I also moved the creation of the cache thread into the proxy thread, which enables the proxy thread to recreate the cache if a failure is detected.
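
The supervision loop itself is small; roughly this shape (run_cache stands in for the real cache entry point):

```rust
use std::thread;

// Stand-in for the real cache event loop.
fn run_cache() {
    // ... serve cache requests until shutdown or failure ...
}

fn proxy_thread() {
    loop {
        let cache = thread::spawn(run_cache);

        // join() returns Err if the cache thread panicked; the proxy then
        // recreates the cache rather than giving up.
        match cache.join() {
            Ok(()) => break, // clean shutdown
            Err(_) => {
                eprintln!("cache thread failed, recreating");
                continue;
            }
        }
    }
}
```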
November 5, 2025 at 10:29 PM
Did a bunch of refactoring and cleanup after that, including working on tests so that I can run multiple integration tests in parallel. Key to that was randomizing port assignments so that the multiple tests wouldn't conflict with each other.

The pgtemp crate has been super useful for this.
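
The port randomization is the usual trick of letting the OS hand out a free port, roughly like this (simplified; the test harness then passes the chosen port to the proxy under test):

```rust
use std::net::TcpListener;

// Bind to port 0 so the OS assigns a free port; reading it back gives the
// test a unique address that won't collide with other tests running in
// parallel.
fn random_listen_addr() -> (TcpListener, u16) {
    let listener = TcpListener::bind("127.0.0.1:0").expect("bind to an OS-assigned port");
    let port = listener.local_addr().expect("local addr").port();
    (listener, port)
}
```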
November 5, 2025 at 10:29 PM
Oh hey, I actually finished the article. PG 18 came out while I was working on the article, so I retested with 18. The #postgres folks added another optimization from the original article, support for removing unneeded self joins.

medium.com/@tempest_619...
PostgreSQL query optimizations
I came across this article a while ago that looks at a number of query optimizations databases can make to optimize queries before…
November 5, 2025 at 9:39 PM
After accepting that fact, I got back to work, but haven't been posting about it. So now I have a bunch of posts to catch up on.
October 18, 2025 at 5:49 PM
Another one is being able to rename columns when using a table alias, e.g. "... FROM users AS blah(foo, bar) ..." will rename the first two columns of table users.
I've never used it and don't recall having come across it before.
August 28, 2025 at 12:54 AM
I've done this 3 or 4 times, and it has been nice. Previously, I might have done all the work in a single commit or put off the refactoring until the main work was done, or forever :-)

The simplicity of #jujutsu makes it easy to try out different workflows.
August 13, 2025 at 11:44 PM
The workflow I'm using is, whenever I decide to refactor something, to create a revision before the main one with jj new -B @, possibly squash some work I've already done into that revision, and then complete the changes. After returning to the main revision, there may be some conflicts to resolve.
August 13, 2025 at 11:44 PM
What I'm enjoying is how #jujutsu makes it easy to keep the main revision focused on the join support and to create separate revisions for refactors that are useful but not directly related to the main revision.
August 13, 2025 at 11:44 PM