@momergul.bsky.social
CS PhD Student @Cornell
Was a challenge getting everything to fit 🙈
October 2, 2025 at 8:02 PM
Work done with my great advisors Claire Cardie & Tanya Goyal.

Paper: arxiv.org/abs/2510.01152
Github link for code and checkpoints: github.com/momergul/mash
Pay-Per-Search Models are Abstention Models
LLMs cannot reliably recognize their parametric knowledge boundaries and often hallucinate answers to outside-of-boundary questions. In contrast, humans recognize their limitations and can either seek...
October 2, 2025 at 7:40 PM
Tons of other insights in the paper. We show that the strength of the search tool is a key consideration: replacing our retriever with an oracle makes all models converge to always seeking help. The noisiness of the retriever is a feature, not a bug!
October 2, 2025 at 7:40 PM
Baseline RL implementations often converge to sub-optimal policies that always or never search. MASH uses a lightweight warm-start data generation & SFT pipeline that induces better search behaviors. MASH models learn to use a mix of 0/1/2 searches as needed, while baselines fail to.
October 2, 2025 at 7:40 PM
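A minimal sketch of a warm-start pipeline in the spirit described above: seed the SFT data with traces that use 0, 1, or 2 searches depending on whether the model already knows the answer, so RL does not collapse into always- or never-searching. The function and field names here are illustrative assumptions, not the paper's actual implementation.

```python
import random

def make_warmstart_trace(question: str, gold: str, model_knows: bool) -> dict:
    """Build one SFT example whose search count depends on model knowledge.

    Hypothetical sketch: if the base model can already answer from
    parametric memory, the trace answers directly (0 searches); otherwise
    it demonstrates one or two search calls (single- vs multi-hop).
    """
    if model_knows:
        num_searches = 0
    else:
        num_searches = random.choice([1, 2])
    return {"question": question, "target": gold, "num_searches": num_searches}

random.seed(0)
trace = make_warmstart_trace("Who wrote Hamlet?", "Shakespeare", model_knows=True)
print(trace["num_searches"])  # 0
```

The point of the mix is exploration: RL can then prune or extend search usage per question instead of inheriting a degenerate always/never policy.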
For (ii), MASH shows strong abstention behavior off-the-shelf! Its performance is comparable to abstention baselines that require pre-determining knowledge boundaries and model-specific training data. It beats SFT approaches and is competitive with DPO!
October 2, 2025 at 7:40 PM
We evaluate MASH under 2 settings: (i) w/ access to search, (ii) w/o search as an abstention model.

For (i), MASH outperforms efficient search baselines, esp. for multi-hop datasets (7.6% accuracy boost), even matching search baselines w/o any search penalties!
October 2, 2025 at 7:40 PM
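Setting (ii) can be sketched in a few lines: run the search-trained model with the tool removed, and read any attempted search call as an abstention. The `<search>` tag format is an assumption for illustration; the actual tool-call syntax may differ.

```python
def answer_or_abstain(generation: str) -> str:
    """Map a raw model generation to an answer or an abstention signal.

    If the model tries to invoke a search it no longer has access to,
    interpret that as "I can't answer this from parametric memory."
    """
    if "<search>" in generation:
        return "ABSTAIN"
    return generation.strip()

print(answer_or_abstain("The capital of France is Paris."))
print(answer_or_abstain("<search>capital of Freedonia</search>"))
```

No extra training or threshold tuning is needed for this mapping, which is what "off-the-shelf abstention" refers to above.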
💡Key idea: Reward accuracy but penalize searches during training. Under the right optimization pressure, LLMs learn to invoke search when their parametric knowledge is lacking. At inference, we simply remove this search access and treat any search invocation as a proxy for abstention!
October 2, 2025 at 7:40 PM
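The pay-per-search reward above can be sketched as accuracy minus a per-search charge. The exact penalty value and function names below are illustrative assumptions, not the paper's settings.

```python
def pay_per_search_reward(answer: str, gold: str, num_searches: int,
                          search_penalty: float = 0.2) -> float:
    """Reward a correct answer, but charge a fixed cost per search call."""
    accuracy = 1.0 if answer.strip().lower() == gold.strip().lower() else 0.0
    return accuracy - search_penalty * num_searches

# A correct answer found without search earns the full reward;
# the same answer reached via two searches earns less.
print(pay_per_search_reward("Paris", "Paris", 0))  # 1.0
print(pay_per_search_reward("Paris", "Paris", 2))  # 0.6
```

Under this pressure, searching only pays off when the expected accuracy gain exceeds the search cost, which is exactly when parametric knowledge is lacking.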