Sumit (@reachsumit.com)
Senior MLE at Meta. Trying to keep up with the Information Retrieval domain! Blog: https://blog.reachsumit.com/ Newsletter: https://recsys.substack.com/
Embedding-Based Context-Aware Reranker

Proposes a lightweight reranking framework that operates on passage embeddings with hybrid attention to capture cross-document and within-document relationships for improved retrieval.

📝 arxiv.org/abs/2510.13329
Retrieval-Augmented Generation (RAG) systems rely on retrieving relevant evidence from a corpus to support downstream generation. The common practice of splitting a long document into multiple shorter...
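For a rough intuition of the hybrid attention, here is a minimal PyTorch sketch; the module names, shapes, and fusion scheme are my assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class HybridAttentionReranker(nn.Module):
    """Toy context-aware reranker over precomputed passage embeddings."""

    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.within = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.cross = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.score = nn.Linear(dim, 1)

    def forward(self, passages, doc_ids, query):
        # passages: (N, dim) embeddings; doc_ids: (N,) source document of each
        x = passages.unsqueeze(0)                        # (1, N, dim)
        blocked = doc_ids[:, None] != doc_ids[None, :]   # True = may not attend
        w, _ = self.within(x, x, x, attn_mask=blocked)   # within-document only
        c, _ = self.cross(x, x, x)                       # across all candidates
        h = x + w + c                                    # fuse both contexts
        return self.score(h * query.view(1, 1, -1)).squeeze()  # (N,) scores
```

Scoring after both passes lets each passage's relevance reflect its neighbors within the same document as well as its competitors across documents.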
Retrieval-in-the-Chain: Bootstrapping Large Language Models for Generative Retrieval

Proposes a reasoning-augmented framework that converts chain-of-thought (CoT) reasoning into a structured format and iteratively refines it during retrieval, improving generative retrieval performance.

📝 arxiv.org/abs/2510.13095
Generative retrieval (GR) is an emerging paradigm that leverages large language models (LLMs) to autoregressively generate document identifiers (docids) relevant to a given query. Prior works have foc...
LLM-Guided Hierarchical Retrieval

@nileshgupta2797 et al. enable LLMs to navigate document collections through semantic tree structures with logarithmic search complexity for reasoning-intensive retrieval tasks.

📝 arxiv.org/abs/2510.13217
👨🏽‍💻 github.com/nilesh2797/l...
Modern IR systems are increasingly tasked with answering complex, multi-faceted queries that require deep reasoning rather than simple keyword or semantic matching. While LLM-based IR has shown great ...
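The logarithmic-complexity claim is easiest to see in code. A hedged sketch, with a toy word-overlap scorer standing in for the actual LLM prompt:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    summary: str                                # LLM-readable subtree description
    docs: list = field(default_factory=list)    # non-empty at leaves
    children: list = field(default_factory=list)

def llm_choose(query: str, summaries: list[str], k: int) -> list[int]:
    """Stand-in for an LLM call that ranks child summaries by relevance.
    Here: naive token overlap; the paper prompts an actual LLM."""
    overlap = lambda s: len(set(query.lower().split()) & set(s.lower().split()))
    return sorted(range(len(summaries)), key=lambda i: -overlap(summaries[i]))[:k]

def retrieve(query: str, root: Node, beam: int = 2) -> list:
    """Descend the semantic tree, expanding only `beam` children per level,
    so LLM calls grow with tree depth (~log N), not with corpus size."""
    frontier, results = [root], []
    while frontier:
        node = frontier.pop()
        if node.docs:                           # leaf: collect its documents
            results.extend(node.docs)
            continue
        picks = llm_choose(query, [c.summary for c in node.children], beam)
        frontier.extend(node.children[i] for i in picks)
    return results
```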
Improving Visual Recommendation on E-commerce Platforms Using Vision-Language Models

Mercari fine-tunes SigLIP on product image-title pairs, yielding a 9.1% offline improvement and a 50% CTR increase in production for visual-similarity-based recommendations.

📝 arxiv.org/abs/2510.13359
On large-scale e-commerce platforms with tens of millions of active monthly users, recommending visually similar products is essential for enabling users to efficiently discover items that align with ...
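The post doesn't include the fine-tuning recipe; as a rough sketch, contrastive tuning with SigLIP's pairwise sigmoid loss on (image, title) batches might look like this (checkpoint id and hyperparameters are placeholders, not Mercari's setup):

```python
import torch
from transformers import AutoModel, AutoProcessor

model = AutoModel.from_pretrained("google/siglip-base-patch16-224")
processor = AutoProcessor.from_pretrained("google/siglip-base-patch16-224")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def train_step(images, titles):
    batch = processor(text=titles, images=images,
                      padding="max_length", return_tensors="pt")
    out = model(**batch)
    logits = out.logits_per_image                 # (B, B) scaled pairwise logits
    b = logits.size(0)
    # SigLIP's sigmoid loss: +1 target for matched (diagonal) pairs, -1 otherwise
    targets = 2 * torch.eye(b, device=logits.device) - 1
    loss = -torch.nn.functional.logsigmoid(targets * logits).mean()
    loss.backward(); optimizer.step(); optimizer.zero_grad()
    return loss.item()
```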
HyMiRec: A Hybrid Multi-interest Learning Framework for LLM-based Sequential Recommendation

Combines lightweight and LLM-based recommenders to capture users' diverse long-term and short-term interests through disentangled multi-interest learning.

📝 arxiv.org/abs/2510.13738
Large language models (LLMs) have recently demonstrated strong potential for sequential recommendation. However, current LLM-based approaches face critical limitations in modeling users' long-term and...
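One common way to implement disentangled multi-interest extraction is with learnable interest queries, sketched below; this is purely illustrative scaffolding, and HyMiRec's hybrid of lightweight and LLM-based recommenders is more involved:

```python
import torch
import torch.nn as nn

class MultiInterestEncoder(nn.Module):
    """Toy disentangled multi-interest extractor: K learnable queries each
    attend over the behavior sequence to pick up a different interest."""

    def __init__(self, dim: int = 64, num_interests: int = 4):
        super().__init__()
        self.interest_queries = nn.Parameter(torch.randn(num_interests, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)

    def forward(self, history: torch.Tensor) -> torch.Tensor:
        # history: (B, T, dim) embeddings of the user's interacted items
        q = self.interest_queries.unsqueeze(0).expand(history.size(0), -1, -1)
        interests, _ = self.attn(q, history, history)   # (B, K, dim)
        return interests

def score(interests: torch.Tensor, item: torch.Tensor) -> torch.Tensor:
    # Route each candidate item to its best-matching interest (max over K)
    return torch.einsum("bkd,bd->bk", interests, item).amax(dim=-1)  # (B,)
```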
RAG-Anything: All-in-One RAG Framework

Introduces a unified framework enabling comprehensive knowledge retrieval across all modalities through dual-graph construction and cross-modal hybrid retrieval.

📝 arxiv.org/abs/2510.12323
👨🏽‍💻 github.com/HKUDS/RAG-An...
Retrieval-Augmented Generation (RAG) has emerged as a fundamental paradigm for expanding Large Language Models beyond their static training limitations. However, a critical misalignment exists between...
A Longitudinal Study on Different Annotator Feedback Loops in Complex RAG Tasks

Compares internal and external annotator groups over one year, finding that tighter feedback loops produce higher-quality data, though at the cost of quantity and diversity.

📝 arxiv.org/abs/2510.11897
Grounding conversations in existing passages, known as Retrieval-Augmented Generation (RAG), is an important aspect of Chat-Based Assistants powered by Large Language Models (LLMs) to ensure they are ...
Reinforced Preference Optimization for Recommendation

Introduces an RL framework for LLM-based recommenders that uses constrained beam search and ranking rewards to improve negative sampling and boost ranking performance.

📝 arxiv.org/abs/2510.12211
👨🏽‍💻 github.com/sober-clever...
Recent breakthroughs in large language models (LLMs) have fundamentally shifted recommender systems from discriminative to generative paradigms, where user behavior modeling is achieved by generating ...
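A sketch of the constrained-beam-search ingredient: a prefix trie over item token sequences guarantees every beam decodes a real catalog item, and high-scoring non-clicked beams make natural hard negatives. Helper names are hypothetical; the paper's decoding loop will differ:

```python
import torch

def build_trie(item_token_seqs):
    """Prefix trie over the token sequences of valid catalog items."""
    trie = {}
    for seq in item_token_seqs:
        node = trie
        for tok in seq:
            node = node.setdefault(tok, {})
    return trie

def constrained_step(logits, prefixes, trie):
    """Mask next-token logits so each beam can only extend to a valid item.

    logits: (num_beams, vocab); prefixes: list of token lists, one per beam.
    A fully decoded item has no children; real code would emit EOS there."""
    masked = torch.full_like(logits, float("-inf"))
    for b, prefix in enumerate(prefixes):
        node = trie
        for tok in prefix:
            node = node[tok]
        allowed = list(node.keys())
        masked[b, allowed] = logits[b, allowed]
    return masked
```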
Simple Projection Variants Improve ColBERT Performance

Shows that replacing ColBERT's single-layer linear projection with deeper FFNs featuring residual connections and upscaled intermediate projections improves retrieval performance.

📝 arxiv.org/abs/2510.12327
Multi-vector dense retrieval methods like ColBERT systematically use a single-layer linear projection to reduce the dimensionality of individual vectors. In this study, we explore the implications of ...
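The proposed change is small enough to show directly. A hedged sketch of a residual FFN head replacing the usual single nn.Linear projection; exact widths and activations in the paper may differ:

```python
import torch.nn as nn

class ResidualFFNProjection(nn.Module):
    """Drop-in replacement for ColBERT's single nn.Linear(hidden, out) head:
    a two-layer FFN with an upscaled intermediate and a residual connection."""

    def __init__(self, hidden: int = 768, out: int = 128, upscale: int = 4):
        super().__init__()
        self.ffn = nn.Sequential(
            nn.Linear(hidden, hidden * upscale),  # upscaled intermediate
            nn.GELU(),
            nn.Linear(hidden * upscale, hidden),
        )
        self.proj = nn.Linear(hidden, out)        # final down-projection

    def forward(self, token_embeddings):
        # Residual around the FFN, then project each token vector to `out` dims
        return self.proj(token_embeddings + self.ffn(token_embeddings))
```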
SMEC: Rethinking Matryoshka Representation Learning for Retrieval Embedding Compression

Alibaba proposes Sequential Matryoshka Embedding Compression to reduce gradient variance during training and adaptively select important dimensions.

📝 arxiv.org/abs/2510.12474
Large language models (LLMs) generate high-dimensional embeddings that capture rich semantic and syntactic information. However, high-dimensional embeddings exacerbate computational complexity and sto...
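For reference, the Matryoshka scaffolding being rethought: a contrastive loss evaluated at nested prefix dimensions. Vanilla MRL sums these losses jointly; SMEC's sequential variant optimizes them stage by stage to tame gradient variance (my reading; the dims and temperature below are illustrative):

```python
import torch
import torch.nn.functional as F

def matryoshka_losses(emb_a, emb_b, dims=(64, 128, 256, 768)):
    """InfoNCE loss at several nested prefix dimensions of paired embeddings.

    emb_a, emb_b: (B, D) with D >= max(dims). Sum the list for plain MRL;
    anneal/sequence the stages for SMEC-style training."""
    losses = []
    for d in dims:
        a = F.normalize(emb_a[:, :d], dim=-1)
        b = F.normalize(emb_b[:, :d], dim=-1)
        logits = a @ b.T / 0.05                    # temperature is illustrative
        labels = torch.arange(a.size(0))           # matched pairs on the diagonal
        losses.append(F.cross_entropy(logits, labels))
    return losses
```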
The Role of Parametric Injection: A Systematic Study of Parametric Retrieval-Augmented Generation

Finds that parametric representations capture only partial semantic information but can enhance document understanding when combined with textual context.

📝 arxiv.org/abs/2510.12668
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving external documents. As an emerging form of RAG, parametric retrieval-augmented generation (PRAG) encodes docume...
SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model

ByteDance presents a foundation model that supports multifaceted multimodal retrieval and classification by accommodating arbitrary modality inputs, including text, vision, and audio.

📝 arxiv.org/abs/2510.12709
Multimodal embedding models aim to yield informative unified representations that empower diverse cross-modal tasks. Despite promising developments in the evolution from CLIP-based dual-tower architec...
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search

Apple introduces a multimodal LLM capable of on-demand, multi-turn web search, dynamically crafting queries for image and text search tools.

📝 arxiv.org/abs/2510.12801
Multimodal Large Language Models (MLLMs) in real-world applications require access to external knowledge sources and must remain responsive to the dynamic and ever-changing real-world information in o...
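The control flow reads roughly like the sketch below, with stubs standing in for the MLLM policy and the search tools (none of this is Apple's API):

```python
# Hedged sketch of the on-demand, multi-turn loop: at each turn the model
# either answers or emits a search call with a query it writes itself.
def mllm_generate(context):          # stub: a real MLLM policy goes here
    return {"action": "answer", "text": "stub answer"}

def text_search(query):              # stub: real web text-search tool
    return [f"result for {query!r}"]

def image_search(image):             # stub: real reverse image-search tool
    return ["visually similar pages"]

def answer(question, image=None, max_turns=4):
    context = [{"role": "user", "question": question, "image": image}]
    for _ in range(max_turns):
        step = mllm_generate(context)
        if step["action"] == "answer":
            return step["text"]
        results = (image_search(image) if step["action"] == "image_search"
                   else text_search(step["query"]))
        context.append({"role": "tool", "results": results})
    return "ran out of search turns"
```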
HUME: Measuring the Human-Model Performance Gap in Text Embedding Tasks

Introduces a framework for evaluating human performance on text embedding benchmarks; humans average 77.6% across 16 MTEB tasks, which would rank 4th among the evaluated models.

📝 arxiv.org/abs/2510.10062
Comparing human and model performance offers a valuable perspective for understanding the strengths and limitations of embedding models, highlighting where they succeed and where they fail to capture ...
Domain-Specific Data Generation Framework for RAG Adaptation

Introduces a scalable framework for generating domain-grounded question-answer-context triples to enhance RAG system adaptation across diverse domains and components.

📝 arxiv.org/abs/2510.11217
Retrieval-Augmented Generation (RAG) combines the language understanding and reasoning power of large language models (LLMs) with external retrieval to enable domain-grounded responses. Effectively ad...
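A minimal sketch of the triple-generation loop, with a stub in place of the generator LLM and none of the paper's filtering or domain prompting:

```python
def generate(prompt: str) -> str:    # stub standing in for an LLM call
    return "Q: ...\nA: ..."

def make_triples(documents, chunk_size=512):
    """Chunk each document and ask the generator for a grounded Q/A pair,
    keeping (question, answer, context) triples for RAG tuning or eval."""
    triples = []
    for doc in documents:
        for i in range(0, len(doc), chunk_size):
            context = doc[i:i + chunk_size]
            qa = generate("Write a question answerable ONLY from this "
                          f"passage, then the answer.\n\n{context}")
            question, _, answer = qa.partition("\nA: ")
            triples.append({"question": question.removeprefix("Q: "),
                            "answer": answer, "context": context})
    return triples
```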
MTMD: A Multi-Task Multi-Domain Framework for Unified Ad Lightweight Ranking at Pinterest

Pinterest introduces a two-tower architecture that unifies multiple ad domains and optimization tasks using mixture-of-experts, replacing 9 production models.

📝 arxiv.org/abs/2510.09857
The lightweight ad ranking layer, living after the retrieval stage and before the fine ranker, plays a critical role in the success of a cascaded ad recommendation system. Due to the fact that there a...
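A toy version of the unified tower: shared experts with a softmax gate per (domain, task) pair, so one set of parameters serves what separate rankers did before. Sizes and gating granularity are illustrative, not Pinterest's config:

```python
import torch
import torch.nn as nn

class MoETower(nn.Module):
    """Toy unified tower: shared experts plus a gate per (domain, task)."""

    def __init__(self, dim=64, experts=4, gates=6):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(experts)])
        self.gates = nn.Parameter(torch.randn(gates, experts))  # one per task/domain

    def forward(self, x, gate_id):
        # x: (B, dim) features; gate_id selects the (domain, task) mixture weights
        w = torch.softmax(self.gates[gate_id], dim=-1)           # (experts,)
        out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, dim, E)
        return out @ w                                           # (B, dim)

# Two-tower scoring: user tower is task/domain-conditioned, item tower is shared
user_tower, item_tower = MoETower(), nn.Linear(64, 64)
user, item = torch.randn(8, 64), torch.randn(8, 64)
ctr_score = (user_tower(user, gate_id=0) * item_tower(item)).sum(-1)  # (B,)
```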
LinearRAG: Linear Graph Retrieval Augmented Generation on Large-Scale Corpora

Constructs relation-free hierarchical graphs using lightweight entity extraction, reducing indexing time by over 77% while outperforming existing GraphRAG methods.

📝 arxiv.org/abs/2510.10114
Retrieval-Augmented Generation (RAG) is widely used to mitigate hallucinations of Large Language Models (LLMs) by leveraging external knowledge. While effective for simple queries, traditional RAG sys...
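The relation-free idea can be sketched as a cheap entity-to-passage index with no LLM relation-extraction calls; the capitalized-word heuristic below is a stand-in for the paper's extractor:

```python
from collections import defaultdict

def extract_entities(text: str) -> set[str]:
    # Toy extractor: treat capitalized words as entity mentions
    return {w.strip(".,") for w in text.split() if w[:1].isupper()}

def build_index(passages: list[str]) -> dict:
    index = defaultdict(set)                 # entity -> passages mentioning it
    for pid, passage in enumerate(passages):
        for ent in extract_entities(passage):
            index[ent].add(pid)
    return index

def retrieve(query: str, passages: list[str], index) -> list[str]:
    hits = defaultdict(int)                  # passage -> #activated entities
    for ent in extract_entities(query):
        for pid in index.get(ent, ()):
            hits[pid] += 1
    ranked = sorted(hits, key=hits.get, reverse=True)
    return [passages[pid] for pid in ranked]
```

Because indexing only ever touches each passage once for entity extraction, it scales linearly with corpus size, which is where the reported 77%+ indexing-time reduction comes from.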