Lightnews — Scholar-powered news

Ben Trent

@benwtrent.bsky.social

140 followers 65 following 39 posts

Doer of things | Builder of things | software engineer
@elastic


Ben Trent

@benwtrent.bsky.social

Lucene will now intelligently merge HNSW graphs: elastic.co/search-labs/... Indexing and merging are now much cheaper, reducing the compute required and improving indexing throughput.

April 8, 2025 at 12:57 PM

Ben Trent

@benwtrent.bsky.social

Indexing and merging times are getting better for #Apache #Lucene vector search. Lucene has a read-only segment architecture. One of the drawbacks of this approach is throwing away previously completed work when merging HNSW graphs. Well, this got better :)

April 8, 2025 at 12:57 PM

Ben Trent

@benwtrent.bsky.social

With this new algorithm, we have seen 3-5x fewer vector operations to achieve the same recall at filter percentages that previously performed horribly.

February 28, 2025 at 3:39 PM

Ben Trent

@benwtrent.bsky.social

We have implemented a variation of ACORN-1: arxiv.org/abs/2403.04871 The key idea is to expand the HNSW neighborhood search while scoring only the candidates that match your filter criteria.

February 28, 2025 at 3:39 PM
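
To make the idea in the post above concrete, here is a minimal Python sketch of a filtered best-first search over a single graph layer. It is not Lucene's implementation and not the paper's exact algorithm: the function names (`sq_dist`, `filtered_search`), the parameters `k` and `ef`, the toy graph, and the choice to look exactly two hops out through non-matching neighbors are all assumptions made for illustration.

```python
import heapq


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(graph, vectors, query, entry, allowed, k=2, ef=4):
    """Best-first search that only scores nodes passing the filter.

    Hypothetical sketch: when a neighbor fails the filter, it is not
    scored; instead its own neighbors are inspected (a two-hop expansion).
    """
    visited = {entry}
    d0 = sq_dist(vectors[entry], query)
    candidates = [(d0, entry)]  # min-heap of nodes still to expand
    # Max-heap (negated distance) holding the best matching nodes so far.
    results = [(-d0, entry)] if entry in allowed else []

    while candidates:
        dist, node = heapq.heappop(candidates)
        # Stop once the closest open candidate is worse than the worst result.
        if len(results) >= ef and dist > -results[0][0]:
            break
        to_score = []
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            if nb in allowed:
                to_score.append(nb)
            else:
                # Skip scoring the non-matching neighbor; look one hop further.
                for nb2 in graph[nb]:
                    if nb2 not in visited and nb2 in allowed:
                        visited.add(nb2)
                        to_score.append(nb2)
        for nb in to_score:
            d = sq_dist(vectors[nb], query)  # the only vector operations
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(candidates, (d, nb))
                heapq.heappush(results, (-d, nb))
                if len(results) > ef:
                    heapq.heappop(results)

    return sorted((-neg_d, n) for neg_d, n in results)[:k]


# Toy data: five 1-D points on a chain; only even-numbered nodes pass the filter.
vectors = [[0], [1], [2], [3], [4]]
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
hits = filtered_search(graph, vectors, query=[4], entry=0, allowed={0, 2, 4})
```

In this toy run, nodes 1 and 3 are never scored, yet the search still reaches node 4 by stepping through them. A plain search that scored every neighbor and filtered afterwards would spend distance computations on nodes that can never be returned.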

Ben Trent

@benwtrent.bsky.social

Filtered vector search is crazy important. So we made HNSW filtered search in Apache Lucene better. At similar recall, it can be 3-5x faster!

February 28, 2025 at 3:39 PM

Ben Trent

@benwtrent.bsky.social

The number of improvements in Lucene here is crazy. Pretty much every count and boolean query gets a nice boost, and some of the count improvements are hilarious 🚀🚀🚀.

200 more queries per second for counting docs with two highly occurring terms

almost 2x better queries per second for disjunctions over highly occurring terms

almost 3x faster count disjunctions when considering many different terms.

January 15, 2025 at 6:28 PM

Ben Trent

@benwtrent.bsky.social

But wait, you want to push all those lower-level retrievers combined with RRF through a cross-encoder for semantic reranking? Well, here you go: [🧵 cont.]

November 14, 2024 at 4:24 PM

Ben Trent

@benwtrent.bsky.social

Want to combine any number of different query signals via RRF? We've got you covered: [🧵 cont.]

November 14, 2024 at 4:24 PM
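
For readers unfamiliar with RRF (reciprocal rank fusion), here is a minimal Python sketch of the general technique: each document's fused score is the sum of `1 / (k + rank)` over every ranked list it appears in. This is not the Elasticsearch API; the function name, the example document ids, and the constant `k=60` (a commonly used default) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first ranked lists of document ids into one."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical results from two different query signals.
lexical_hits = ["a", "b", "c"]
vector_hits = ["c", "a", "d"]
fused = reciprocal_rank_fusion([lexical_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because only ranks are used, the signals being combined do not need comparable score scales, which is what makes it practical to fuse any number of different retrievers this way.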
