Mingxuan (Aldous) Li
itea1001.bsky.social
https://itea1001.github.io/
Rising third-year undergrad at the University of Chicago, working on LLM tool use, evaluation, and hypothesis generation.
#ACL2025 Poster Session 1 tomorrow 11:00-12:30 Hall 4/5!
July 27, 2025 at 7:27 PM
7/n 💪 What’s robust?
✅ Works across out-of-distribution (OOD) tasks
✅ Generated hypotheses can be transferred to different LLMs (e.g., GPT-4o-mini ↔ LLAMA-3.3-70B)
✅ Reduces sensitivity to prompt variations compared to direct scoring
May 12, 2025 at 7:25 PM
4/n These hypotheses break down a complex evaluation rubric (e.g., “Is this summary comprehensive?”) into sub-dimensions that an LLM can score clearly. ✅✅✅
May 12, 2025 at 7:24 PM
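To make the decomposition idea concrete, here is a minimal sketch of scoring a text per sub-dimension and then combining the scores. Everything in it is an assumption for illustration: the `SubDimension` class, the three example questions, the `stub_scorer` stand-in for an LLM call, and the plain average used to combine scores are hypothetical and are not taken from HypoEval itself.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass(frozen=True)
class SubDimension:
    """One decomposed rubric item (hypothetical wording, not from HypoEval)."""
    name: str
    question: str


# Hypothetical decomposition of "Is this summary comprehensive?"
COMPREHENSIVENESS = (
    SubDimension("main_points", "Does the summary cover the source's main points?"),
    SubDimension("key_details", "Does it keep the details needed to follow them?"),
    SubDimension("no_omissions", "Does it avoid leaving out a major section?"),
)

# A scorer takes one sub-dimension and a text and returns a numeric score.
Scorer = Callable[[SubDimension, str], float]


def score_text(text: str, dims: Sequence[SubDimension], scorer: Scorer) -> dict:
    """Score each sub-dimension separately, then average them.

    The plain average is an assumed aggregation chosen for simplicity.
    """
    per_dim = {d.name: scorer(d, text) for d in dims}
    return {"per_dimension": per_dim, "overall": mean(list(per_dim.values()))}


def stub_scorer(dim: SubDimension, text: str) -> float:
    """Stand-in for an LLM call: returns a fixed 1-5 score per sub-dimension."""
    fixed = {"main_points": 4.0, "key_details": 3.0, "no_omissions": 5.0}
    return fixed[dim.name]


result = score_text("An example summary.", COMPREHENSIVENESS, stub_scorer)
print(result["per_dimension"])
print(result["overall"])
```

In practice `stub_scorer` would be swapped for a function that prompts an LLM with `dim.question` and the text, keeping each question narrow enough to score on its own.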
3/n 🌟 Our solution: HypoEval
Building upon SOTA hypothesis generation methods, we generate hypotheses — decomposed rubrics (similar to checklists, but more systematic and explainable) — from existing literature and just 30 human annotations (scores) of texts.
May 12, 2025 at 7:24 PM