Robin Jia
@robinjia.bsky.social
Assistant Professor in Computer Science at USC | NLP, ML
Paper link: arxiv.org/abs/2501.14883
Verify with Caution: The Pitfalls of Relying on Imperfect Factuality Metrics
Improvements in large language models have led to increasing optimism that they can serve as reliable evaluators of natural language generation outputs. In this paper, we challenge this optimism by th...
arxiv.org
July 30, 2025 at 8:16 AM
Sounds like arxiv.org/abs/2102.07033
PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them
Open-domain Question Answering models which directly leverage question-answer (QA) pairs, such as closed-book QA (CBQA) models and QA-pair retrievers, show promise in terms of speed and memory compare...
arxiv.org
February 17, 2025 at 6:07 PM
Links & presentation times:
1. Fourier Features: arxiv.org/abs/2406.03445 Thu, 4:30pm
2. TF + ICL: arxiv.org/abs/2310.17086 Fri, 11am
3. Backdoor detection: arxiv.org/abs/2409.00399 Sat, 1:44pm at AdvML Frontiers
4. LLMs + PDDL: arxiv.org/abs/2406.02791 Sun, 2:30pm at OWA workshop
Pre-trained Large Language Models Use Fourier Features to Compute Addition
Pre-trained large language models (LLMs) exhibit impressive mathematical reasoning capabilities, yet how they compute basic arithmetic, such as addition, remains unclear. This paper shows that pre-tra...
arxiv.org
December 9, 2024 at 10:21 PM