Ian Hussey
@ianhussey.mmmdata.io

Meta-scientist and psychologist. Senior lecturer @unibe.ch. Chief recommender @error.reviews. "Jumped up punk who hasn't earned his stripes." All views a product of my learning history.

Pinned
ianhussey.mmmdata.io
Lego Science is research driven by modular convenience.

It happens when researchers combine methods or concepts to create publishable units, more out of convenience than any deep curiosity about the resulting research question.

"What role does {my favourite construct} play in {task}?"

ianhussey.mmmdata.io
The issue here is that the version of Cohen’s d used in the original isn’t known: the original authors can’t confirm it and don’t have the data, and the reported value doesn’t reproduce under any of the plausible options. So precise replicability against the original study is unknowable at this point.

ianhussey.mmmdata.io
There’s a debate about this. Depends whether you define successful replication as both being significant (common but low bar) or the compatibility of the effect sizes. Even if the latter, Morey and others argue that replication prediction intervals are more appropriate than CI compatibility.
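
To make the difference between those criteria concrete, here is a rough sketch under a normal approximation. The function name, the reading of "CI compatibility" as "replication estimate falls inside the original 95% CI", and the prediction-interval form (original estimate plus or minus z times the combined standard error) are assumptions for illustration, and the numbers are hypothetical placeholders rather than values from either study.

```python
import math

def replication_checks(d_orig, se_orig, d_rep, se_rep, z=1.96):
    """Compare three replication criteria under a normal approximation."""
    # Criterion 1: both estimates differ significantly from zero.
    both_significant = abs(d_orig) / se_orig > z and abs(d_rep) / se_rep > z

    # Criterion 2 (one possible reading of "CI compatibility"):
    # the replication estimate falls inside the original 95% CI.
    ci_low, ci_high = d_orig - z * se_orig, d_orig + z * se_orig
    rep_in_original_ci = ci_low <= d_rep <= ci_high

    # Criterion 3: prediction interval, which also accounts for the
    # sampling error of the replication itself (assumed form).
    half_width = z * math.sqrt(se_orig**2 + se_rep**2)
    rep_in_prediction_interval = (
        d_orig - half_width <= d_rep <= d_orig + half_width
    )

    return {
        "both_significant": both_significant,
        "rep_in_original_ci": rep_in_original_ci,
        "rep_in_prediction_interval": rep_in_prediction_interval,
    }

# Hypothetical inputs, chosen only to show the criteria can disagree.
result = replication_checks(d_orig=0.50, se_orig=0.20, d_rep=0.95, se_rep=0.20)
print(result)
```

With these made-up inputs the replication estimate sits outside the original CI but inside the wider prediction interval, which is the kind of disagreement the debate is about.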

ianhussey.mmmdata.io
The original study did use naturalistic stimuli! But the effort-to-reward ratio of running it this way wasn’t worth it.

ianhussey.mmmdata.io
We have something like Jake Westfall’s blog about variations in Cohen’s d cooking 😉

Reposted by Ian Hussey

jamiecummins.bsky.social
These results make it worth reiterating the title of @ianhussey.mmmdata.io's recent blog post: if researchers find Cohen's d = 8, no they didn't

mmmdata.io/posts/2025/0...

ianhussey.mmmdata.io
Separately, it’s to help build intuitions for standardized effect sizes and the maximum plausible magnitude. If your therapy is apparently more effective than the preference for chocolate over poop, someone dun screwed up.

ianhussey.mmmdata.io
In part to illustrate this issue with “standardized” effect sizes. This is very much a pedagogical piece for us - if there are this many important choices to be made for a simple preference, imagine how it impacts the quantification of something subtler.

ianhussey.mmmdata.io
If this became the new Scaramucci I would be so happy

ianhussey.mmmdata.io
Not quite yet ... effect size magnitude heavily depends on which outcome measure and which version of Cohen's d is used

ianhussey.mmmdata.io
[measurement and analytic flexibility has entered the chat]
mbeisen.bsky.social
Finally, someone has solved a real problem with AI! No more having to take a paper in the format for a journal that rejected you, and reformat it for a new journal. Well done!! formatmypaper.com

ianhussey.mmmdata.io
And boys do wear skirts, especially in Scotland (kilts), where 70% of the sample is from. The statements are normative, not inviolable truths.

ianhussey.mmmdata.io
"Cats have legs" vs "Fish have legs": Cohen's d_rm = 11.95

Evaluations of "love" vs "murder": Cohen's d_rm = 11.87

Evaluations of "honest" vs "dishonest": Cohen's d_rm = 6.91

ianhussey.mmmdata.io
Other things we included:

Using a pair of extreme attention checks as the comparison, "Please respond with 1" vs "please respond with 7" gives Cohen's d_rm = 51.92. This comes from one of 486 participants not following the instructions and responding with something other than 1 and 7.
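
A small sketch of why one non-compliant participant is enough to produce an effect size this large: when almost everyone gives identical answers, the standard deviation in the denominator is close to zero. This uses d_z (mean difference over the SD of the differences) rather than d_rm, and the deviating responses are hypothetical, so it will not reproduce the 51.92 above.

```python
import statistics

def d_z(first, second):
    """Cohen's d_z: mean of the paired differences over their SD."""
    diffs = [b - a for a, b in zip(first, second)]
    # If every participant complied, the SD would be 0 and this would
    # raise ZeroDivisionError: the effect size is undefined in that case.
    return statistics.fmean(diffs) / statistics.stdev(diffs)

n = 486  # sample size from the post

# "Please respond with 1": assume everyone complies.
respond_1 = [1] * n

# "Please respond with 7": one hypothetical participant answers 4 instead.
respond_7_off_by_3 = [7] * (n - 1) + [4]
# Same, but the hypothetical participant answers 6 instead.
respond_7_off_by_1 = [7] * (n - 1) + [6]

d_off_by_3 = d_z(respond_1, respond_7_off_by_3)
d_off_by_1 = d_z(respond_1, respond_7_off_by_1)
print(round(d_off_by_3, 2), round(d_off_by_1, 2))
```

The smaller the single deviation, the smaller the SD and the larger the estimate, so the exact value says more about one person's slip than about the comparison.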

ianhussey.mmmdata.io
Our study has three outcome measures: the original study's scale, a single item desirability measure, and a 3 item evaluation measure.

Across all measures, and using either Cohen's d_rm or d_z, the estimate varies from d = 3.25 to d = 6.20. We plan to do a multiverse analysis to model more choices.
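
For readers unfamiliar with the two variants, a minimal sketch of how they differ, assuming the common definitions: d_z divides the mean difference by the SD of the difference scores, and d_rm rescales that by sqrt(2 * (1 - r)), where r is the correlation between the two measures. The function name and the toy ratings are made up for illustration and are not the study data or analysis code.

```python
import math
import statistics

def paired_effect_sizes(x, y):
    """Return (d_z, d_rm, r) for paired scores x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    mean_diff = statistics.fmean(diffs)
    sd_diff = statistics.stdev(diffs)

    # Pearson correlation between the two sets of scores.
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    r = cov / (statistics.stdev(x) * statistics.stdev(y))

    d_z = mean_diff / sd_diff
    # d_rm corrects d_z for the correlation between measures.
    d_rm = d_z * math.sqrt(2 * (1 - r))
    return d_z, d_rm, r

# Toy 1-7 ratings from five hypothetical participants.
chocolate = [7, 6, 7, 5, 6]
poop = [1, 2, 1, 2, 3]

d_z, d_rm, r = paired_effect_sizes(chocolate, poop)
print(round(d_z, 2), round(d_rm, 2), round(r, 2))
```

The two estimates coincide only when r = 0.5, so the same data can yield noticeably different "standardized" effect sizes depending on which variant is reported.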

ianhussey.mmmdata.io
Of the 21 predictions we received here and on our various Slacks, almost everyone underestimated it. Aaron Friedli, an RA in our lab, predicted it perfectly. @sabrinanorwood.bsky.social came a close second with 6. Only @eikofried.bsky.social overestimated it at 8.
ianhussey.mmmdata.io
Results of the replication are in!

Chocolate is more desirable than poop:

Cohen's d_rm = 6.20, 95%CI [5.63, 6.78]

N = 486, two single item 1-7 Likert scales of desirability.

w/ @jamiecummins.bsky.social
ianhussey.mmmdata.io
Make an effect size prediction!

@jamiecummins.bsky.social and I are replicating Balcetis & Dunning's (2010) "chocolate is more desirable than poop" (Cohen's d = 4.52)

Let us know in the replies what effect size you think we'll find. Details of the study in the thread below.

ianhussey.mmmdata.io
For this prediction, we’ll use Cohen’s d_rm. Would you like to update your prediction?

ianhussey.mmmdata.io
The image specifies it’s a bowl of Lindt chocolates. Care to update?

ianhussey.mmmdata.io
Nice. Will definitely look into this.
steamtraen.eu
Next time an institution tells you how seriously it takes research misconduct, ask them if it's *this* seriously. www.bmj.com/content/297/...
In 1916 the BMJ published an article about the work done by James Shearer, an American physician working in the British Army as a sergeant (because he had no British qualification). He had described a "delineator" which was better than x rays for portraying gunshot wounds. This caused a sensation and a lot of interest, but on investigation the work was found to have been invented. The BMJ published a retraction, but Shearer was tried by court martial and sentenced to death by firing squad.

ianhussey.mmmdata.io
True. Replication cancelled, everyone go home.

ianhussey.mmmdata.io
Bear in mind the original study was the pretest of stimuli for a main study, so might not be biased this way.

ianhussey.mmmdata.io
The probability that it’s exactly half is actually quite low. If we consider the binomial distribution,
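
The post is cut off here, so what follows is only a sketch of the kind of binomial calculation it seems to point at. The choice of p = 0.5 and the sample sizes are assumptions, and `prob_exactly_half` is a made-up helper name.

```python
import math

def prob_exactly_half(n):
    """P(X = n/2) for X ~ Binomial(n, 0.5); n must be even."""
    if n % 2 != 0:
        raise ValueError("n must be even for 'exactly half' to be possible")
    # Number of ways to pick n/2 successes, over all 2**n equally likely outcomes.
    return math.comb(n, n // 2) / 2**n

# Hypothetical sample sizes: the probability shrinks as n grows.
for n in (10, 100, 486):
    print(n, round(prob_exactly_half(n), 4))
```

Even when the true proportion is exactly 0.5, landing on exactly half becomes less likely as the sample gets larger.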
psych.peercommunityin.org
PCI Psychology is open for submissions! Did you know that you can easily submit your recommended preprint to any of the 20+ PCI Psych friendly journals? See all friendly journals here: psych.peercommunityin.org/about/pci_fr...
#PsychSciSky #SciPub

ianhussey.mmmdata.io
The outcomes I’m asking for a prediction for are two single items, 1-7 scale:

bsky.app/profile/ianh...
ianhussey.mmmdata.io
Our questions are:
"How desirable is human poop in a toilet bowl?" vs "How desirable is a bowl of chocolates?", with pictures of both, from "Very undesirable" to "Very desirable"
We also have other qs, but this is the one I want predictions for.

ianhussey.mmmdata.io
We have bot checks, attention checks, and a self-exclude item. We use Prolific’s filters for age 18-100, fluent English, and resident in the UK or US.