Sumit (@reachsumit.com)
Senior MLE at Meta. Trying to keep up with the Information Retrieval domain! Blog: https://blog.reachsumit.com/ Newsletter: https://recsys.substack.com/
Embedding-Based Context-Aware Reranker

Proposes a lightweight reranking framework that operates on passage embeddings with hybrid attention to capture cross-document and within-document relationships for improved retrieval.

📝 arxiv.org/abs/2510.13329
Retrieval-Augmented Generation (RAG) systems rely on retrieving relevant evidence from a corpus to support downstream generation. The common practice of splitting a long document into multiple shorter...
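For a rough intuition of the hybrid attention, here is a minimal PyTorch sketch; the module names, shapes, and fusion scheme are my assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class HybridAttentionReranker(nn.Module):
    """Toy context-aware reranker over precomputed passage embeddings."""

    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.within = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, passages, doc_ids, query):
        # passages: (N, dim) embeddings; doc_ids: (N,) source document of each
        x = passages.unsqueeze(0)                        # (1, N, dim)
        blocked = doc_ids[:, None] != doc_ids[None, :]   # True = may not attend
        w, _ = self.within(x, x, x, attn_mask=blocked)   # within-document only
        c, _ = self.cross(x, x, x)                       # across all candidates
        h = x + w + c                                    # fuse both contexts
        return self.score(h * query.view(1, 1, -1)).squeeze()  # (N,) scores
```

Scoring after both passes lets each passage's relevance reflect its neighbors within the same document as well as its competitors across documents.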
Retrieval-in-the-Chain: Bootstrapping Large Language Models for Generative Retrieval

Proposes a reasoning-augmented framework that converts chain-of-thought (CoT) reasoning into a structured format and iteratively refines it during retrieval, improving generative retrieval performance.

📝 arxiv.org/abs/2510.13095
Generative retrieval (GR) is an emerging paradigm that leverages large language models (LLMs) to autoregressively generate document identifiers (docids) relevant to a given query. Prior works have foc...
LLM-Guided Hierarchical Retrieval

@nileshgupta2797 et al. enable LLMs to navigate document collections through semantic tree structures with logarithmic search complexity for reasoning-intensive retrieval tasks.

📝 arxiv.org/abs/2510.13217
👨🏽‍💻 github.com/nilesh2797/l...
Modern IR systems are increasingly tasked with answering complex, multi-faceted queries that require deep reasoning rather than simple keyword or semantic matching. While LLM-based IR has shown great ...
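The logarithmic-complexity claim is easiest to see in code. A hedged sketch, with a toy word-overlap scorer standing in for the actual LLM prompt:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    summary: str                                # LLM-readable subtree description
    docs: list = field(default_factory=list)    # non-empty at leaves
    children: list = field(default_factory=list)

def llm_choose(query: str, summaries: list[str], k: int) -> list[int]:
    """Stand-in for an LLM call that ranks child summaries by relevance.
    Here: naive token overlap; the paper prompts an actual LLM."""
    overlap = lambda s: len(set(query.lower().split()) & set(s.lower().split()))
    return sorted(range(len(summaries)), key=lambda i: -overlap(summaries[i]))[:k]

def retrieve(query: str, root: Node, beam: int = 2) -> list:
    """Descend the semantic tree, expanding only `beam` children per level,
    so LLM calls grow with tree depth (~log N), not with corpus size."""
    frontier, results = [root], []
    while frontier:
        node = frontier.pop()
        if node.docs:                           # leaf: collect its documents
            results.extend(node.docs)
            continue
        picks = llm_choose(query, [c.summary for c in node.children], beam)
        frontier.extend(node.children[i] for i in picks)
    return results
```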
Improving Visual Recommendation on E-commerce Platforms Using Vision-Language Models

Mercari fine-tunes SigLIP on product image-title pairs, yielding a 9.1% offline improvement and a 50% CTR increase in production for visual-similarity-based recommendations.

📝 arxiv.org/abs/2510.13359
On large-scale e-commerce platforms with tens of millions of active monthly users, recommending visually similar products is essential for enabling users to efficiently discover items that align with ...
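The post doesn't include the fine-tuning recipe; as a rough sketch, contrastive tuning with SigLIP's pairwise sigmoid loss on (image, title) batches might look like this (checkpoint id and hyperparameters are placeholders, not Mercari's setup):

```python
import torch
from transformers import AutoModel, AutoProcessor

model = AutoModel.from_pretrained("google/siglip-base-patch16-224")
processor = AutoProcessor.from_pretrained("google/siglip-base-patch16-224")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def train_step(images, titles):
    batch = processor(text=titles, images=images,
                      padding="max_length", return_tensors="pt")
    out = model(**batch)
    logits = out.logits_per_image                 # (B, B) scaled pairwise logits
    b = logits.size(0)
    # SigLIP's sigmoid loss: +1 target for matched (diagonal) pairs, -1 otherwise
    targets = 2 * torch.eye(b, device=logits.device) - 1
    loss = -torch.nn.functional.logsigmoid(targets * logits).mean()
    loss.backward(); optimizer.step(); optimizer.zero_grad()
    return loss.item()
```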
HyMiRec: A Hybrid Multi-interest Learning Framework for LLM-based Sequential Recommendation

Combines lightweight and LLM-based recommenders to capture users' diverse long-term and short-term interests through disentangled multi-interest learning.

📝 arxiv.org/abs/2510.13738
Large language models (LLMs) have recently demonstrated strong potential for sequential recommendation. However, current LLM-based approaches face critical limitations in modeling users' long-term and...
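One common way to implement disentangled multi-interest extraction is with learnable interest queries, sketched below; this is purely illustrative scaffolding, and HyMiRec's hybrid of lightweight and LLM-based recommenders is more involved:

```python
import torch
import torch.nn as nn

class MultiInterestEncoder(nn.Module):
    """Toy disentangled multi-interest extractor: K learnable queries each
    attend over the behavior sequence to pick up a different interest."""

    def __init__(self, dim: int = 64, num_interests: int = 4):
        super().__init__()
        self.interest_queries = nn.Parameter(torch.randn(num_interests, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (B, T, dim) embeddings of the user's interacted items
        q = self.interest_queries.unsqueeze(0).expand(history.size(0), -1, -1)
        interests, _ = self.attn(q, history, history)   # (B, K, dim)
        return interests

def score(interests: torch.Tensor, item: torch.Tensor) -> torch.Tensor:
    # Route each candidate item to its best-matching interest (max over K)
    return torch.einsum("bkd,bd->bk", interests, item).amax(dim=-1)  # (B,)
```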
RAG-Anything: All-in-One RAG Framework

Introduces a unified framework enabling comprehensive knowledge retrieval across all modalities through dual-graph construction and cross-modal hybrid retrieval.

📝 arxiv.org/abs/2510.12323
👨🏽‍💻 github.com/HKUDS/RAG-An...
Retrieval-Augmented Generation (RAG) has emerged as a fundamental paradigm for expanding Large Language Models beyond their static training limitations. However, a critical misalignment exists between...
A Longitudinal Study on Different Annotator Feedback Loops in Complex RAG Tasks

Compares internal and external annotator groups over one year, finding that tighter feedback loops produce higher-quality data, though at the cost of quantity and diversity.

📝 arxiv.org/abs/2510.11897
Grounding conversations in existing passages, known as Retrieval-Augmented Generation (RAG), is an important aspect of Chat-Based Assistants powered by Large Language Models (LLMs) to ensure they are ...
Reinforced Preference Optimization for Recommendation

Introduces an RL framework for LLM-based recommenders that uses constrained beam search and ranking rewards to improve negative sampling and boost ranking performance.

📝 arxiv.org/abs/2510.12211
👨🏽‍💻 github.com/sober-clever...
Recent breakthroughs in large language models (LLMs) have fundamentally shifted recommender systems from discriminative to generative paradigms, where user behavior modeling is achieved by generating ...
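A sketch of the constrained-beam-search ingredient: a prefix trie over item token sequences guarantees every beam decodes a real catalog item, and high-scoring non-clicked beams make natural hard negatives. Helper names are hypothetical; the paper's decoding loop will differ:

```python
import torch

def build_trie(item_token_seqs):
    """Prefix trie over the token sequences of valid catalog items."""
    trie = {}
    for seq in item_token_seqs:
        node = trie
        for tok in seq:
            node = node.setdefault(tok, {})
    return trie

def constrained_step(logits, prefixes, trie):
    """Mask next-token logits so each beam can only extend to a valid item.

    logits: (num_beams, vocab); prefixes: list of token lists, one per beam.
    A fully decoded item has no children; real code would emit EOS there."""
    masked = torch.full_like(logits, float("-inf"))
    for b, prefix in enumerate(prefixes):
        node = trie
        for tok in prefix:
            node = node[tok]
        allowed = list(node.keys())
        masked[b, allowed] = logits[b, allowed]
    return masked
```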
Simple Projection Variants Improve ColBERT Performance

Shows that replacing ColBERT's single-layer linear projection with deeper FFNs featuring residual connections and upscaled intermediate projections improves retrieval performance.

📝 arxiv.org/abs/2510.12327
Multi-vector dense retrieval methods like ColBERT systematically use a single-layer linear projection to reduce the dimensionality of individual vectors. In this study, we explore the implications of ...
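The proposed change is small enough to show directly. A hedged sketch of a residual FFN head replacing the usual single nn.Linear projection; exact widths and activations in the paper may differ:

```python
import torch.nn as nn

class ResidualFFNProjection(nn.Module):
    """Drop-in replacement for ColBERT's single nn.Linear(hidden, out) head:
    a two-layer FFN with an upscaled intermediate and a residual connection."""

    def __init__(self, hidden: int = 768, out: int = 128, upscale: int = 4):
        super().__init__()
        self.ffn = nn.Sequential(
            nn.Linear(hidden, hidden * upscale),  # upscaled intermediate
            nn.GELU(),
            nn.Linear(hidden * upscale, hidden),
        )
        self.proj = nn.Linear(hidden, out)        # final down-projection

    def forward(self, token_embeddings):
        # Residual around the FFN, then project each token vector to `out` dims
        return self.proj(token_embeddings + self.ffn(token_embeddings))
```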
SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression

Alibaba proposes Sequential Matryoshka Embedding Compression to reduce gradient variance during training and adaptively select important dimensions.

📝 arxiv.org/abs/2510.12474
Large language models (LLMs) generate high-dimensional embeddings that capture rich semantic and syntactic information. However, high-dimensional embeddings exacerbate computational complexity and sto...
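For reference, the Matryoshka scaffolding being rethought: a contrastive loss evaluated at nested prefix dimensions. Vanilla MRL sums these losses jointly; SMEC's sequential variant optimizes them stage by stage to tame gradient variance (my reading; the dims and temperature below are illustrative):

```python
import torch
import torch.nn.functional as F

def matryoshka_losses(emb_a, emb_b, dims=(64, 128, 256, 768)):
    """InfoNCE loss at several nested prefix dimensions of paired embeddings.

    emb_a, emb_b: (B, D) with D >= max(dims). Sum the list for plain MRL;
    anneal/sequence the stages for SMEC-style training."""
    losses = []
    for d in dims:
        a = F.normalize(emb_a[:, :d], dim=-1)
        b = F.normalize(emb_b[:, :d], dim=-1)
        logits = a @ b.T / 0.05                    # temperature is illustrative
        labels = torch.arange(a.size(0))           # matched pairs on the diagonal
        losses.append(F.cross_entropy(logits, labels))
    return losses
```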
The Role of Parametric Injection: A Systematic Study of Parametric Retrieval-Augmented Generation

Finds that parametric representations capture only partial semantic information but can enhance document understanding when combined with textual context.

📝 arxiv.org/abs/2510.12668
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving external documents. As an emerging form of RAG, parametric retrieval-augmented generation (PRAG) encodes docume...
SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model

ByteDance presents a foundation model that supports multifaceted multimodal retrieval and classification by accommodating arbitrary modality inputs, including text, vision, and audio.

📝 arxiv.org/abs/2510.12709
Multimodal embedding models aim to yield informative unified representations that empower diverse cross-modal tasks. Despite promising developments in the evolution from CLIP-based dual-tower architec...
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search

Apple introduces a multimodal LLM capable of on-demand, multi-turn web search, dynamically crafting queries for image and text search tools.

📝 arxiv.org/abs/2510.12801
Multimodal Large Language Models (MLLMs) in real-world applications require access to external knowledge sources and must remain responsive to the dynamic and ever-changing real-world information in o...
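The control flow reads roughly like the sketch below, with stubs standing in for the MLLM policy and the search tools (none of this is Apple's API):

```python
# Hedged sketch of the on-demand, multi-turn loop: at each turn the model
# either answers or emits a search call with a query it writes itself.
def mllm_generate(context):          # stub: a real MLLM policy goes here
    return {"action": "answer", "text": "stub answer"}

def text_search(query):              # stub: real web text-search tool
    return [f"result for {query!r}"]

def image_search(image):             # stub: real reverse image-search tool
    return ["visually similar pages"]

def answer(question, image=None, max_turns=4):
    context = [{"role": "user", "question": question, "image": image}]
    for _ in range(max_turns):
        step = mllm_generate(context)
        if step["action"] == "answer":
            return step["text"]
        results = (image_search(image) if step["action"] == "image_search"
                   else text_search(step["query"]))
        context.append({"role": "tool", "results": results})
    return "ran out of search turns"
```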
HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks

Introduces a framework for evaluating human performance on text embedding benchmarks; humans average 77.6% across 16 MTEB tasks, which would rank 4th among the evaluated models.

📝 arxiv.org/abs/2510.10062
Comparing human and model performance offers a valuable perspective for understanding the strengths and limitations of embedding models, highlighting where they succeed and where they fail to capture ...
Domain-Specific Data Generation Framework for RAG Adaptation

Introduces a scalable framework for generating domain-grounded question-answer-context triples to enhance RAG system adaptation across diverse domains and components.

📝 arxiv.org/abs/2510.11217
Retrieval-Augmented Generation (RAG) combines the language understanding and reasoning power of large language models (LLMs) with external retrieval to enable domain-grounded responses. Effectively ad...
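A minimal sketch of the triple-generation loop, with a stub in place of the generator LLM and none of the paper's filtering or domain prompting:

```python
def generate(prompt: str) -> str:    # stub standing in for an LLM call
    return "Q: ...\nA: ..."

def make_triples(documents, chunk_size=512):
    """Chunk each document and ask the generator for a grounded Q/A pair,
    keeping (question, answer, context) triples for RAG tuning or eval."""
    triples = []
    for doc in documents:
        for i in range(0, len(doc), chunk_size):
            context = doc[i:i + chunk_size]
            qa = generate("Write a question answerable ONLY from this "
                          f"passage, then the answer.\n\n{context}")
            question, _, answer = qa.partition("\nA: ")
            triples.append({"question": question.removeprefix("Q: "),
                            "answer": answer, "context": context})
    return triples
```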
MTMD: A Multi-Task Multi-Domain Framework for Unified Ad Lightweight Ranking at Pinterest

Pinterest introduces a two-tower architecture that unifies multiple ad domains and optimization tasks using mixture-of-experts, replacing 9 production models.

📝 arxiv.org/abs/2510.09857
The lightweight ad ranking layer, living after the retrieval stage and before the fine ranker, plays a critical role in the success of a cascaded ad recommendation system. Due to the fact that there a...
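A toy version of the unified tower: shared experts with a softmax gate per (domain, task) pair, so one set of parameters serves what separate rankers did before. Sizes and gating granularity are illustrative, not Pinterest's config:

```python
import torch
import torch.nn as nn

class MoETower(nn.Module):
    """Toy unified tower: shared experts plus a gate per (domain, task)."""

    def __init__(self, dim=64, experts=4, gates=6):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(experts)])
        self.gates = nn.Parameter(torch.randn(gates, experts))  # one per task/domain

    def forward(self, x, gate_id):
        # x: (B, dim) features; gate_id selects the (domain, task) mixture weights
        w = torch.softmax(self.gates[gate_id], dim=-1)           # (experts,)
        out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, dim, E)
        return out @ w                                           # (B, dim)

# Two-tower scoring: user tower is task/domain-conditioned, item tower is shared
user_tower, item_tower = MoETower(), nn.Linear(64, 64)
user, item = torch.randn(8, 64), torch.randn(8, 64)
ctr_score = (user_tower(user, gate_id=0) * item_tower(item)).sum(-1)  # (B,)
```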
LinearRAG: Linear Graph Retrieval Augmented Generation on Large-Scale Corpora

Constructs relation-free hierarchical graphs using lightweight entity extraction, reducing indexing time by over 77% while outperforming existing GraphRAG methods.

📝 arxiv.org/abs/2510.10114
Retrieval-Augmented Generation (RAG) is widely used to mitigate hallucinations of Large Language Models (LLMs) by leveraging external knowledge. While effective for simple queries, traditional RAG sys...
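The relation-free idea can be sketched as a cheap entity-to-passage index with no LLM relation-extraction calls; the capitalized-word heuristic below is a stand-in for the paper's extractor:

```python
from collections import defaultdict

def extract_entities(text: str) -> set[str]:
    # Toy extractor: treat capitalized words as entity mentions
    return {w.strip(".,") for w in text.split() if w[:1].isupper()}

def build_index(passages: list[str]) -> dict:
    index = defaultdict(set)                 # entity -> passages mentioning it
    for pid, passage in enumerate(passages):
        for ent in extract_entities(passage):
            index[ent].add(pid)
    return index

def retrieve(query: str, passages: list[str], index) -> list[str]:
    hits = defaultdict(int)                  # passage -> #activated entities
    for ent in extract_entities(query):
        for pid in index.get(ent, ()):
            hits[pid] += 1
    ranked = sorted(hits, key=hits.get, reverse=True)
    return [passages[pid] for pid in ranked]
```

Because indexing only ever touches each passage once for entity extraction, it scales linearly with corpus size, which is where the reported 77%+ indexing-time reduction comes from.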