Andrew Drozdov
@mrdrozdov.com
Research Scientist @ Mosaic x Databricks. Adaptive Methods for Retrieval, Generation, NLP, AI, LLMs https://mrdrozdov.github.io/
It was a real pleasure talking about effective IR approaches with Brooke and Denny on the Data Brew podcast.

Among other things, I'm excited about embedding finetuning and reranking as modular ways to improve RAG pipelines. Everyone should use these more!
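For the reranking piece, here's a minimal sketch of how a cross-encoder reranker can slot between a first-stage retriever and the generator. It assumes the sentence-transformers CrossEncoder API and an off-the-shelf MS MARCO checkpoint; the first-stage retriever is a hypothetical placeholder, not a specific pipeline from the episode.

```python
# Minimal sketch: rerank first-stage candidates with a cross-encoder before
# passing the top passages to the generator. Model name and top-k values are
# illustrative only.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, passages: list[str], top_k: int = 5) -> list[str]:
    """Score each (query, passage) pair jointly and keep the best top_k."""
    scores = reranker.predict([(query, p) for p in passages])
    ranked = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
    return [p for p, _ in ranked[:top_k]]

# first_stage_retrieve() is a hypothetical stand-in for your existing retriever
# (BM25, a finetuned embedding model, etc.); only the reranking step is new.
# candidates = first_stage_retrieve(query, k=50)
# context = rerank(query, candidates, top_k=5)
```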
February 26, 2025 at 12:53 AM
"All you need to build a strong reasoning model is the right data mix."

The pipeline that creates the data mix:
January 26, 2025 at 11:30 PM
Using 100+ tokens to answer 2 + 3 =
January 22, 2025 at 5:42 PM
Slides are up! I presented on "Presentation & Consumption in the context of REML"

The full deck is here. There are a lot of gems if you're interested in this space!

retrieval-enhanced-ml.github.io/sigir-ap2024...
December 9, 2024 at 7:14 AM
Today we'll be presenting the Tutorial on Retrieval-Enhanced Machine Learning (REML). Come by to learn about the emerging design patterns in this space and see how to use retrieval beyond RAG.

In collaboration w/ the amazing @841io.bsky.social @teknology.bsky.social Alireza Salemi and Hamed Zamani.
December 9, 2024 at 1:41 AM
Seen in NYC
December 6, 2024 at 1:04 PM
RAG still has a way to go. (this book doesn’t exist)
December 4, 2024 at 1:42 PM
Similar comment from a reliable source.

x.com/earnmyturns/...
December 2, 2024 at 6:00 PM
@thomlake.bsky.social I'm willing to be convinced. This would give me a whole new appreciation for the memory net line of work. Can you show that memory nets could process the whole sequence in parallel w/o a for-loop? IMO this is the key capability that self-attention enables.
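For context, this is the property I have in mind: a small numpy sketch (shapes and initialization are purely illustrative) where the self-attention output for every position falls out of a few matrix multiplies, with no loop over timesteps.

```python
# Single-head self-attention over a whole sequence with no for-loop over
# positions: all pairwise scores and all per-position outputs come from
# batched matmuls. Shapes and values are illustrative only.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (T, d) token states; Wq, Wk, Wv: (d, d) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv              # (T, d) each
    scores = Q @ K.T / np.sqrt(K.shape[-1])       # (T, T), all pairs at once
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)     # row-wise softmax
    return weights @ V                            # (T, d), every position in parallel

rng = np.random.default_rng(0)
T, d = 8, 16
X = rng.normal(size=(T, d))
out = self_attention(X, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (8, 16)
```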
December 2, 2024 at 4:13 PM
It's worth noting the authors of the decomposable attention paper all did very well :) One of them (Jakob Uszkoreit) is also a key co-author on AIAYN

> Jakob proposed replacing RNNs with self-attention and started the effort to evaluate this idea
December 2, 2024 at 4:18 AM
lol nice
November 27, 2024 at 2:39 PM
unless you're a hawk, then it's at least 20/5
November 26, 2024 at 7:55 PM
Fun fact: you used to be able to get a NeurIPS acceptance with only two references included. Two!
November 26, 2024 at 7:47 PM
AI researchers were already investigating scaling laws for training neural networks in 1993.

proceedings.neurips.cc/paper/1993/h...
November 26, 2024 at 7:41 PM
This is not necessarily a dunk. Blog posts are pretty great: open access, self-hosted (low cost), and they often have discussion built in.
November 25, 2024 at 1:37 PM
researcher: make sure you train on lots of data

AI model: I’m sorry, but I’m a large language model, and I don’t have access to the internet or any external information sources. I can only generate responses based on the text that I was trained on, which has a knowledge cutoff of 2021.
November 25, 2024 at 12:01 AM
Wait. I thought this was a release from last week that I missed. I guess it's from last year lol?
November 24, 2024 at 11:45 PM
IMO, query expansion can work but likely needs a different approach. For example, this fusion-like technique from Li et al., 2024: arxiv.org/abs/2311.09175
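To illustrate the general flavor (generic rank fusion over query rewrites, not the exact method from the paper), here's a small sketch where the retriever and the rewriter are hypothetical placeholders.

```python
# Sketch: retrieve with the original query plus several rewrites, then fuse the
# ranked lists with reciprocal rank fusion instead of stuffing expansion terms
# into a single query. Generic RRF, not the specific technique in Li et al., 2024.
from collections import defaultdict

def fuse_rankings(ranked_lists, k: int = 60, top_k: int = 10):
    """ranked_lists: one ranked list of doc ids per query variant."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# search() and expand_query() are hypothetical placeholders for a retriever and
# an LLM-based query rewriter.
# variants = [query] + expand_query(query)
# fused = fuse_rankings([search(q, k=100) for q in variants])
```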
November 24, 2024 at 11:29 PM
Research like this is so important. A lot of decision making when deploying RAG systems is based on vibes-based eval using dated models and datasets. Many techniques simply don't transfer to modern models like you'd hope. Query expansion seems to fall into this category.
November 24, 2024 at 11:29 PM
ChatGPT: delve
Dwarves:
November 22, 2024 at 12:06 PM
dogolutely incredible. Thank you!! 🐶🕊️
November 22, 2024 at 1:46 AM
what have i just seen
November 22, 2024 at 1:43 AM
approaching escape velocity
November 21, 2024 at 9:45 PM
nominating @mcarbin.bsky.social

also future bostonian @lateinteraction.bsky.social

almost said bostonite, but that's a type of rock!
November 21, 2024 at 5:36 PM