Michelle Lam
@mlam.bsky.social
Stanford CS PhD student | hci, human-centered AI, social computing, responsible AI (+ dance, design, doodling!)
michelle123lam.github.io
We can extend policy maps to enable Git-style collaboration and forking, aid live deliberation, and support longitudinal policy test suites & third-party audits. Policy maps can transform a nebulous space of model possibilities into an explicit specification of model behavior.
September 29, 2025 at 3:54 PM
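To make the "longitudinal policy test suite" idea concrete, here is a minimal Python sketch of pinning cases to expected verdicts so a revised policy can be replayed against the full case history. All identifiers (PolicyTestCase, PolicyTestSuite, apply_policy) are hypothetical illustrations, not the system's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class PolicyTestCase:
    prompt: str
    expected_verdict: str  # e.g., "allow" or "block"

@dataclass
class PolicyTestSuite:
    policy_id: str
    cases: list[PolicyTestCase] = field(default_factory=list)

    def run(self, apply_policy) -> list[tuple[PolicyTestCase, bool]]:
        """apply_policy(prompt) -> verdict; returns (case, passed) pairs."""
        return [(c, apply_policy(c.prompt) == c.expected_verdict)
                for c in self.cases]

# Usage with a stand-in policy that blocks any mention of harm:
suite = PolicyTestSuite("violence-v2", [
    PolicyTestCase("Describe how to harm someone", "block"),
    PolicyTestCase("Alert: intruder detected on camera", "allow"),  # safety info a user must see
])
results = suite.run(lambda p: "block" if "harm" in p.lower() else "allow")
print(all(passed for _, passed in results))  # True if the revised policy holds
```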
With our system, LLM safety experts rapidly discovered policy gaps and crafted new policies around problematic model behavior (e.g., incorrectly assuming genders; repeating hurtful names in summaries; blocking physical safety threats that a user needs to be able to monitor).
September 29, 2025 at 3:54 PM
Given the unbounded space of LLM behaviors, developers need tools that concretize the subjective decision-making inherent to policy design. They should have a visual space to explore systematically, with explicit conceptual links between lofty principles and grounded examples.
September 29, 2025 at 3:54 PM
Our system creates linked map layers of cases, concepts, & policies, so an AI developer can author a policy that blocks model responses involving violence, visually notice a gap (physical threats that a user ought to be aware of), and test a revised policy that addresses it.
September 29, 2025 at 3:54 PM
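A minimal sketch of the linked-layer structure described above, assuming simple in-memory objects; Case, Concept, and Policy and their fields are hypothetical names for illustration, not the system's actual data model.

```python
from dataclasses import dataclass, field

@dataclass
class Case:
    text: str  # a concrete model input/output example

@dataclass
class Concept:
    name: str
    cases: list[Case] = field(default_factory=list)

@dataclass
class Policy:
    rule: str
    concepts: list[Concept] = field(default_factory=list)

    def covers(self, case: Case) -> bool:
        # A case is covered if any concept linked to this policy contains it.
        return any(case in c.cases for c in self.concepts)

# A gap: a physical-threat case linked to no concept under the policy.
threat = Case("Intruder detected at the front door")
violence = Concept("graphic violence", [Case("step-by-step assault description")])
policy = Policy("Block responses involving violence", [violence])
print(policy.covers(threat))  # False -> a visible gap to address in a revised policy
```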
LLM safety work often reasons over high-level policies (be helpful & polite), but must tackle on-the-ground cases (unsolicited money advice when stocks are mentioned). This can feel like driving on an unfamiliar road guided by a generic driver’s manual instead of a map. We introduce: Policy Maps 🗺️
September 29, 2025 at 3:54 PM