Lightnews — Scholar-powered news

Sara Rosenthal

@seirasto.bsky.social

📣📣Presenting our platform used to build MTRAG!!

RAGAPHENE: A RAG Annotation Platform with Human ENhancements and Edits

Arxiv: arxiv.org/abs/2508.19272
MTRAG GitHub: github.com/IBM/mt-rag-b...
Join our MTRAGEval Task: ibm.github.io/mt-rag-bench...

RAGAPHENE: A RAG Annotation Platform with Human Enhancements and Edits

Retrieval Augmented Generation (RAG) is an important aspect of conversing with Large Language Models (LLMs) when factually correct information is important. LLMs may provide answers that appear correc...

arxiv.org

August 28, 2025 at 12:47 PM

Sara Rosenthal

@seirasto.bsky.social

🚀Excited to announce our MTRAGEval task at SemEval 2026!

Arxiv: arxiv.org/abs/2501.03468
Github: github.com/IBM/mt-rag-b... (please 🌟!)
MTRAGEval: ibm.github.io/mt-rag-bench...

MTRAG: A Multi-Turn Conversational Benchmark for Evaluating Retrieval-Augmented Generation Systems

Retrieval-augmented generation (RAG) has recently become a very popular task for Large Language Models (LLMs). Evaluating them on multi-turn RAG conversations, where the system is asked to generate a ...

arxiv.org

August 4, 2025 at 6:33 AM

Sara Rosenthal

@seirasto.bsky.social

Working on RAG? Come check out our InspectorRAGet DEMO, presented by Siva Sankalp Patel on May 2 (Friday), 11-12:30, at Demo Session 8 in Hall 3! Looking forward to attending ACL in a few months! #NAACL2025 @naaclmeeting.bsky.social

paper: arxiv.org/abs/2404.17347
github: github.com/IBM/Inspecto...

InspectorRAGet: An Introspection Platform for RAG Evaluation

Large Language Models (LLM) have become a popular approach for implementing Retrieval Augmented Generation (RAG) systems, and a significant amount of effort has been spent on building good models and ...

arxiv.org

May 1, 2025 at 1:24 AM

Sara Rosenthal

@seirasto.bsky.social

Excited about this collab! Come check out FeeL and help advance multilingual generation in your language! huggingface.co/spaces/feel-...

March 26, 2025 at 1:59 PM

Sara Rosenthal

@seirasto.bsky.social

🌟Want to know more about our MTRAG benchmark? Check out the IBM blog highlighting our work! research.ibm.com/blog/convers...

How well can your RAG agent carry out a conversation?

IBM’s new benchmark evaluates LLMs on interactive question-answering tasks using

research.ibm.com

February 4, 2025 at 8:41 PM

Sara Rosenthal

@seirasto.bsky.social

🌟 New Benchmark! 🌟

Do you work on RAG? Are you interested in Multi-Turn conversations? Very excited to share the new MTRAG benchmark we have released!

Data: github.com/ibm/mt-rag-b...
Paper: arxiv.org/abs/2501.03468

GitHub - IBM/mt-rag-benchmark: Multi-Turn RAG Benchmark

github.com

January 8, 2025 at 8:08 PM

Sara Rosenthal

@seirasto.bsky.social

Anyone else feel like Google Scholar is missing citations lately? I have a recent paper that has 8 citations on Semantic Scholar and only 3 on Google Scholar… and I have two papers that are both cited in the same paper, but only one shows the citation 🤔

November 27, 2024 at 1:47 AM

Reposted by Sara Rosenthal

Ramon Astudillo

@ramon-astudillo.bsky.social

I did a starter pack of people in New York (City) working on ML/AI. Please distribute and feel free to self nominate!

go.bsky.app/BoEtagz

November 19, 2024 at 1:38 AM

Sara Rosenthal

@seirasto.bsky.social

If you work on RAG, check out InspectorRAGet, an awesome tool for RAG evaluation. Available on HuggingFace! We provide the interface, you provide the experiments and metrics. Want to know more? Just reach out!
github.com/IBM/Inspecto...
huggingface.co/spaces/kpfad...
arxiv.org/abs/2404.17347