Looking forward to connecting and sharing our work on spin with the CHIL community!
Karen Y.C. Zhang, @ramezkouzy.bsky.social, @ijmarshall.bsky.social, @jessyjli.bsky.social, & @byron.bsky.social [7/7]
Check out our full findings here: arxiv.org/abs/2502.07963
Good news: prompts that encouraged reasoning reduced their tendency to overstate trial results! 🛠️
Careful design is key to improving evidence synthesis for clinical decisions. [6/7]
Meaning: LLMs believed spun abstracts presented more favorable results! 😬 [4/7]
However, things got interesting when we asked LLMs to interpret the results… [3/7]
🔽
Spin refers to reporting strategies that make experimental treatments appear more beneficial than they actually are—often distracting from nonsignificant results.
Example:
❌ “The treatment shows a promising trend toward significance…”
✅ “No significant difference was found.”
[2/7]