Martin Saveski
msaveski.bsky.social
Assistant Professor @ UW iSchool. Interested in computational social science, social networks & causal inference.
http://martinsaveski.com
Re SUTVA: My experience doing empirical and methodological work on interference (e.g., doi.org/10.1145/3097...) has kept me humble when trying to predict total treatment effects.
December 2, 2025 at 7:38 PM
(Also, that's where we got the idea to contextualize with historical change.)
December 2, 2025 at 7:38 PM
We actually did a thorough lit review when doing the power analysis and, if you look closely, there aren’t many experiments using the same outcome that we could compare with. My best reference point is the excellent paper by Santoro & @dbroockman.bsky.social: doi.org/10.1126/scia...
The promise and pitfalls of cross-partisan conversations for reducing affective polarization: Evidence from randomized experiments
Cross-partisan conversations can reduce affective polarization, but effects do not persist long-term or spill over.
December 2, 2025 at 7:38 PM
That’s why we tried to contextualize the results in terms of historical change in the metric.
December 2, 2025 at 7:38 PM
Well, they ask whether a 2-degree change is a small effect, and I think it’s a reasonable question. I’ve discussed this with quite a few people who have done extensive empirical work in this area and whose opinions I value. For some, 2 degrees is small; for others, it’s huge ...
December 2, 2025 at 7:38 PM
Thank you! I have been meaning to send all of you a note for a while, and I can't overstate how helpful your Green Lab SOP was in analyzing the data!
December 2, 2025 at 7:15 PM
Thanks for the shoutout! There are obviously many possible reasons for the differences, but my best guess is (i) content- vs. user-level intervention (i.e., reranking content likely to polarize) and (ii) much higher prevalence of political content on X (32% on X vs. 13.4% on FB). Curious to hear your thoughts.
December 1, 2025 at 5:16 PM
Finally, in this work, we focused on affective polarization, but our framework for LLM-based feed ranking is general and can be applied to other outcomes, including well-being, mental health, and civic engagement.

/fin
December 1, 2025 at 7:59 AM
We hope that other researchers will use our methodology to run experiments that are longer, span multiple platforms, and extend beyond the US.

/13
December 1, 2025 at 7:59 AM
Important limitations to keep in mind: (i) this was a 10-day experiment, (ii) run on a single platform, and (iii) during a politically charged time.

/12
December 1, 2025 at 7:59 AM
Increasing exposure to AAPA didn’t lead to any detectable effects on engagement, likely because we reranked far fewer posts.

/11
December 1, 2025 at 7:59 AM
Reducing exposure to AAPA led to a decrease in engagement in absolute terms: less time spent and fewer posts viewed and liked. However, participants liked the posts they did view at a significantly higher rate.

/10
December 1, 2025 at 7:59 AM
Decreasing exposure to AAPA made participants less angry and sad in the moment, while increasing exposure had the opposite effect. The reranking didn’t have any effect on calm and excitement.

/9
December 1, 2025 at 7:59 AM
While the effects are symmetric, it’s worth noting that we upranked only a few AAPA posts but downranked all AAPA posts in the corresponding conditions.

/8
December 1, 2025 at 7:59 AM
In a field experiment with 1,256 consenting participants, we found that downranking AAPA posts leads to a decrease and upranking to an increase in affective polarization of 2 degrees on the 0-100 out-party feeling thermometer.

/7
December 1, 2025 at 7:59 AM
There are many reasonable ways to define “polarizing content.” We focused on antidemocratic attitudes and partisan animosity (AAPA), drawing on the eight factors defined in the excellent study by Voelkel et al.

doi.org/10.1126/scie...

/6
December 1, 2025 at 7:59 AM
In contrast to previous work that intervened at the level of users (e.g., downranking in-party content) or platform affordances (e.g., switching to a chronological feed), we intervened at the content level, exploiting recent advances in NLP.

/5
December 1, 2025 at 7:59 AM
We used this method to test how reranking content that is likely to polarize affects participants’ affective polarization and emotions.

/4
December 1, 2025 at 7:59 AM
The extension intercepts the user’s feed, uses an LLM to score and rerank the posts, and displays the updated feed.

Making this process fast took a lot of @tiziano.bsky.social’s engineering wizardry.

Source Code: github.com/StanfordHCI/...
Technical report: arxiv.org/abs/2406.19571

/3
December 1, 2025 at 7:59 AM
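The score-and-rerank step described in the /3 post above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the `Post` type, `keyword_score`, `rerank_feed`, the 0.5 threshold, and the "move flagged posts to the end or the front" policy are all assumptions made for the example. In particular, `keyword_score` is a toy stand-in for the LLM scoring call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Post:
    post_id: str
    text: str


def keyword_score(text: str) -> float:
    """Hypothetical stand-in for the LLM call: 1.0 if a flagged word appears, else 0.0."""
    flagged_words = {"traitors", "enemies"}
    words = {w.strip(".,!?").lower() for w in text.split()}
    return 1.0 if words & flagged_words else 0.0


def rerank_feed(
    posts: List[Post],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the study
    downrank: bool = True,
) -> List[Post]:
    """Score each post once, then move flagged posts to the end (downrank)
    or to the front (uprank), preserving relative order within each group."""
    scored = [(post, score_fn(post.text)) for post in posts]
    flagged = [post for post, score in scored if score >= threshold]
    rest = [post for post, score in scored if score < threshold]
    return rest + flagged if downrank else flagged + rest


feed = [
    Post("a", "They are traitors!"),
    Post("b", "Nice weather today."),
    Post("c", "New paper out."),
]
print([p.post_id for p in rerank_feed(feed, keyword_score)])  # ['b', 'c', 'a']
```

Passing the scorer in as a function keeps the reranking logic independent of whichever model produces the scores, so the toy scorer here could be swapped for a real classifier without touching `rerank_feed`.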
Feed algorithms impact our lives, but until now only the platforms could run experiments testing the causal effects of different design choices. We propose a possible solution to this challenge by deploying a field experiment on X using a browser extension.

/2
December 1, 2025 at 7:59 AM
The Louisiana museum is really nice. Short train ride but worth it.
November 30, 2025 at 11:30 PM