website: hayoungjung.me
Happy to chat or grab coffee at the conference! Feel free to DM me :)
To support this, we’re releasing everything:
🧠 Models: huggingface.co/SocialCompUW...
💻 Code: github.com/hayoungjungg...
📊 Data: github.com/hayoungjungg...
Past work tested model cascades on standard benchmarks (e.g., SQuAD). We validate them in the wild!
👩‍⚕️ Public health: Inform targeted interventions & debunk myths.
🛡️ Platforms: Provide a scalable auditing pipeline to flag high-risk content & improve moderation.
➡️12.7% of recs from myth videos led to more myths initially—rising to 22% at deeper levels.
⚠️ Moderation should target these rec pathways that reinforce harmful myths.
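
For anyone who wants to run a similar audit, here is a minimal sketch of how a per-level myth share could be computed from labeled recommendations. The `(depth, label)` record layout, the label names, and the function name are assumptions made for illustration, not the format of the released data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def myth_share_by_depth(recs: Iterable[Tuple[int, str]]) -> Dict[int, float]:
    """Share of myth-supporting recommendations at each crawl depth.

    `recs` holds (depth, label) pairs for recommendations reached from
    myth-supporting videos. The label "supports" is a hypothetical name.
    """
    totals = defaultdict(int)
    supporting = defaultdict(int)
    for depth, label in recs:
        totals[depth] += 1
        if label == "supports":
            supporting[depth] += 1
    return {d: supporting[d] / totals[d] for d in sorted(totals)}


# Toy records, invented for illustration only (not the study's data).
toy_recs = [
    (1, "supports"), (1, "opposes"), (1, "neither"), (1, "neither"),
    (2, "supports"), (2, "supports"), (2, "opposes"), (2, "neither"),
]
shares = myth_share_by_depth(toy_recs)
```

A share that rises with depth is the pattern to watch for when deciding which recommendation pathways to review.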
😬A few clicks can change your exposure to myths!
📊 Finding #1: Nearly 20% of YouTube search results support OUD myths, while 30% oppose.
😰 Even though opposing content is more common, myth-supporting content is widespread and risks shaping how people understand treatment.
📊 Achieves 0.68-0.86 macro F1 and defers only 5-67% of the examples to the costly LLM.
In practice, MythTriage:
💸 Cuts financial costs by 98% vs experts and by 94% vs LLM labeling
⏱️ Cuts time costs by 96% vs experts & by 76% vs LLM labeling
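
To show why the deferral rate drives the savings, here is a rough back-of-the-envelope sketch. It assumes every video first passes through the small model and only the deferred fraction also pays the LLM price. The function names and unit costs are hypothetical placeholders, and this is not the accounting behind the percentages above.

```python
def cascade_cost(n_items: int, deferral_rate: float,
                 small_cost: float, large_cost: float) -> float:
    """Expected total cost: all items hit the small model, a fraction is deferred."""
    return n_items * (small_cost + deferral_rate * large_cost)


def savings_vs_large_only(deferral_rate: float,
                          small_cost: float, large_cost: float) -> float:
    """Fraction of cost saved relative to sending every item to the large model."""
    per_item_cascade = small_cost + deferral_rate * large_cost
    return 1.0 - per_item_cascade / large_cost


# Hypothetical unit costs per video, chosen only to make the arithmetic visible.
total = cascade_cost(n_items=1000, deferral_rate=0.25,
                     small_cost=0.001, large_cost=0.02)
saved = savings_vs_large_only(deferral_rate=0.25,
                              small_cost=0.001, large_cost=0.02)
```

Under this simple model, savings shrink as more examples are deferred, which is why a low deferral rate matters.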
👉 Uses lightweight DeBERTa for routine cases
👉 Defers harder ones to GPT-4o (high-performing but costly)
The trick? We distilled DeBERTa on GPT-4o’s synthetic labels—achieving strong performance without massive expert-labeled data.
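
A minimal sketch of the cascade idea, assuming deferral is decided by thresholding the small model's top class probability. The thread does not state the actual deferral rule, so the threshold, the function names, and the toy stand-in models below are all assumptions for illustration.

```python
from typing import Callable, List, Sequence, Tuple


def cascade_label(
    text: str,
    small_model: Callable[[str], Sequence[float]],
    large_model: Callable[[str], int],
    threshold: float = 0.9,  # hypothetical confidence cutoff
) -> Tuple[int, bool]:
    """Return (label, deferred).

    The small model answers when its top probability clears the threshold;
    otherwise the example is deferred to the large model.
    """
    probs = list(small_model(text))
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), False
    return large_model(text), True


def toy_small_model(text: str) -> List[float]:
    # Stand-in for a distilled classifier: confident only on short inputs.
    return [0.05, 0.95] if len(text) < 20 else [0.55, 0.45]


def toy_large_model(text: str) -> int:
    # Stand-in for an LLM call; always returns class 0 here.
    return 0


easy = cascade_label("short text", toy_small_model, toy_large_model)
hard = cascade_label("a much longer and more ambiguous transcript",
                     toy_small_model, toy_large_model)
```

In a real setup the two stand-ins would be replaced by the distilled classifier and an LLM call, and the threshold would be tuned on held-out expert labels.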
🤖LLMs show promise, but high compute & API costs—especially for long-form video—limit their practicality for large-scale detection.
✅Validate eight pervasive myths on OUD (see examples below!)
✅Create and refine annotation guidelines
✅Build a gold-standard dataset: 310 videos labeled across 8 myths (~2.5K expert labels).
1️⃣ OUD Search Dataset: 2.9K search results
2️⃣ OUD Recs Dataset: 343K video recommendations
‼️But myths fuel treatment hesitancy, distrust in healthcare, & stigma.
🤔Understanding the scale of myths is crucial for health officials & platforms to design effective interventions.