Ben Trent
benwtrent.bsky.social
Doer of things | Builder of things | software engineer
@elastic
Lucene will now intelligently merge HNSW graphs: elastic.co/search-labs/... Now indexing and merging are much cheaper, reducing the compute required and improving indexing throughput.
April 8, 2025 at 12:57 PM
Indexing and merging times are getting better for #Apache #Lucene vector search. Lucene has a read-only segment architecture. One of the drawbacks of this approach is throwing away previously completed work when merging HNSW graphs. Well, this got better :)
April 8, 2025 at 12:57 PM
With this new algorithm, we have seen 3-5x fewer vector operations to achieve the same recall on filter percentages that previously performed horribly.
February 28, 2025 at 3:39 PM
We have implemented a variation of ACORN-1: arxiv.org/abs/2403.04871 The key idea is expanding your HNSW neighborhood search and only scoring candidates that match your filter criteria.
February 28, 2025 at 3:39 PM
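A rough sketch of that key idea, for illustration only. This is not Lucene's implementation: it assumes a single-layer neighbor graph (real HNSW is multi-layer), squared Euclidean distance, and a plain two-hop expansion through neighbors that fail the filter. The names `filtered_search`, `passes_filter`, and `matching_neighbors` are hypothetical.

```python
import heapq

def filtered_search(graph, vectors, query, passes_filter, entry, k=3, ef=10):
    """Best-first graph search that only scores nodes passing the filter."""
    def dist(node):
        return sum((a - b) ** 2 for a, b in zip(vectors[node], query))

    def matching_neighbors(node):
        # Neighbors that pass the filter are candidates. For a neighbor that
        # fails, look one hop further instead of scoring it (the "expanded
        # neighborhood"). Duplicates are filtered by the caller via `visited`.
        for nb in graph[node]:
            if passes_filter(nb):
                yield nb
            else:
                for nb2 in graph[nb]:
                    if passes_filter(nb2):
                        yield nb2

    visited = set()
    candidates = []  # min-heap of (distance, node): nodes still to expand
    results = []     # max-heap of (-distance, node): best `ef` found so far

    def offer(node):
        if node in visited:
            return
        visited.add(node)
        d = dist(node)  # a vector operation, spent only on matching nodes
        if len(results) < ef or d < -results[0][0]:
            heapq.heappush(candidates, (d, node))
            heapq.heappush(results, (-d, node))
            if len(results) > ef:
                heapq.heappop(results)

    seeds = [entry] if passes_filter(entry) else list(matching_neighbors(entry))
    for seed in seeds:
        offer(seed)

    while candidates:
        d, node = heapq.heappop(candidates)
        if len(results) >= ef and d > -results[0][0]:
            break  # nothing left that can improve the result set
        for nb in matching_neighbors(node):
            offer(nb)

    best = sorted((-neg_d, node) for neg_d, node in results)
    return [node for _, node in best[:k]]

# Toy data: ten 1-D points on a chain, with a filter that keeps even ids.
graph = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
vectors = {i: [float(i)] for i in range(10)}
print(filtered_search(graph, vectors, [6.2], lambda n: n % 2 == 0, entry=0))
# -> [6, 8, 4]
```

Nodes that fail the filter are never passed to `dist`, which is where the saved vector operations come from in this toy version.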
Filtered vector search is crazy important. So we made HNSW filtered search in Apache Lucene better. At similar recall, it can be 3-5x faster!
February 28, 2025 at 3:39 PM
The number of improvements in Lucene here is crazy. Pretty much every count and boolean query gets a nice boost, and some of the count improvements are hilarious 🚀🚀🚀.
January 15, 2025 at 6:28 PM
But wait, you want to push all those lower-level retrievers combined with RRF through a cross-encoder for semantic reranking? Well, here you go: [🧵 cont.]
November 14, 2024 at 4:24 PM
Want to combine any number of different query signals via RRF? We got you covered: [🧵 cont.]
November 14, 2024 at 4:24 PM
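For anyone new to RRF (reciprocal rank fusion), here is a minimal standalone sketch of the scoring rule itself, not the retriever API from the thread. Each document gets the sum of 1 / (k + rank) over every ranked list it appears in. The function name `rrf` and the sample document ids are made up for illustration, and k=60 is assumed as the commonly used constant.

```python
from collections import defaultdict

def rrf(rankings, k=60):
    """Fuse several ranked lists of document ids with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top hit of each list contributes 1 / (k + 1).
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    # Highest fused score first; ties keep first-seen order (stable sort).
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [doc for doc, _ in fused]

lexical_hits = ["a", "b", "c"]  # e.g. from a keyword query
vector_hits = ["c", "a", "d"]   # e.g. from a kNN query
print(rrf([lexical_hits, vector_hits]))
# -> ['a', 'c', 'b', 'd']
```

Because only ranks are used, the lists can come from signals whose raw scores are not comparable.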