aakriti1kumar.bsky.social
How do we reliably judge if AI companions are performing well on subjective, context-dependent, and deeply human tasks? 🤖

Excited to share the first paper from my postdoc (!!) investigating when LLMs are reliable judges, with empathic communication as a case study 🧐

🧵👇
June 17, 2025 at 3:14 PM
Super cool opportunity to work with brilliant scientists and fantastic mentors @mattgroh.bsky.social and Dashun Wang 🌟🌟

Feel free to reach out!
📣 📣 Postdoc Opportunity at Northwestern

Dashun Wang and I are seeking a creative, technical, interdisciplinary researcher for a joint postdoc fellowship between our labs.

If you're passionate about Human-AI Collaboration and Science of Science, this may be for you! 🚀

Please share widely!
April 2, 2025 at 2:35 PM
Reposted
Our paper: Decision-Point Guided Safe Policy Improvement
We show that a simple approach to learning safe RL policies can outperform most offline RL methods. (+theoretical guarantees!)

How? Just allow the state-actions that have been seen enough times! 🤯

arxiv.org/abs/2410.09361
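The gating idea in the post ("allow the state-actions that have been seen enough times") can be sketched as follows. This is a hypothetical minimal illustration, not the paper's actual algorithm or API: the function name, the dictionary-based policies, and the `n_min` threshold are all assumptions made for the sake of the sketch.

```python
from collections import Counter

def improved_policy(dataset, q_values, behavior_policy, n_min=10):
    """Count-based safe improvement sketch (illustrative, not the paper's code).

    dataset: list of (state, action) pairs logged by the behavior policy.
    q_values: dict mapping (state, action) -> estimated return.
    behavior_policy: dict mapping state -> the behavior policy's action.
    Returns a dict mapping state -> chosen action.
    """
    counts = Counter(dataset)
    policy = dict(behavior_policy)
    for state, default_action in behavior_policy.items():
        # Only actions observed at least n_min times at this state are
        # trusted enough to deviate to.
        candidates = [a for (s, a) in counts
                      if s == state and counts[(s, a)] >= n_min]
        if candidates:
            best = max(candidates,
                       key=lambda a: q_values.get((state, a), float("-inf")))
            # Deviate from the behavior policy only when the well-estimated
            # action strictly beats its default action.
            if (q_values.get((state, best), float("-inf"))
                    > q_values.get((state, default_action), float("-inf"))):
                policy[state] = best
    return policy
```

With a high threshold, rarely seen actions are ignored even if their value estimates look good, which is the intuition behind the safety guarantee: the learned policy can only differ from the behavior policy where the data supports the change.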
January 23, 2025 at 6:23 PM