Vilgot Huhn
@vilgothuhn.bsky.social
Confused PhD student in psychology at Karolinska Institutet, Stockholm. GAD, ICBT, mechanisms of change. Organizing the ReproducibiliTea JC at KI.
Website: https://vilgot-huhn.github.io/mywebsite/
Personal blog at unconfusion.substack.com
Yes, that’s why I made it even. I just added the mean out of dumb curiosity and thought it looked quirky.

bsky.app/profile/vilg...
November 26, 2025 at 5:40 AM
It's not exact so there's probably nothing actually interesting here though.
November 25, 2025 at 7:57 PM
The point of the simulation was to look at how the median p is equal to the alpha level (the red and blue overlapping lines) at 50% power. I just added the mean out of curiosity.
November 25, 2025 at 7:57 PM
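A minimal sketch of a simulation like the one described above, assuming a two-sample t-test; the per-group n and effect size are illustrative choices that give roughly 50% power, not the original settings. At 50% power, half of all p-values land below alpha by definition, so the median p equals alpha:

```python
# A sketch, not the original simulation: all parameters are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n, sims = 0.05, 50, 20000
d = 0.394  # standardized effect giving ~50% power at n = 50 per group

x = rng.normal(0.0, 1.0, (sims, n))  # group 1: centered on zero
y = rng.normal(d, 1.0, (sims, n))    # group 2: shifted by d
pvals = stats.ttest_ind(x, y, axis=1).pvalue

print(np.mean(pvals < alpha))  # ~0.5: power is about 50%
print(np.median(pvals))        # ~0.05: the median p equals alpha
```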
By all means eyeball the scatterplot. Never skip eyeballing the scatterplot. A lot of scatterplots are visibly weird in ways that mean OLS is unsuitable. That’s not what I’m talking about here.
November 25, 2025 at 5:59 PM
this one's for you @bureaucracynow.bsky.social
November 25, 2025 at 2:27 PM
Thanks for engaging. I am, however, landing on the conclusion that the paradox is not itself a justification for interpreting p=0.048 as evidence *for* the null, at least not under normal circumstances. Relatedly, I dug up a response from Lakens re: that idea on the other site:

x.com/lakens/statu...
November 25, 2025 at 12:30 PM
My interpretation of your statement here is that even in a well-designed study with decent power, I should interpret p=0.048 as evidence that there is nothing there. Is that a correct reading of what you mean?

bsky.app/profile/stea...
But with decent power, p=0.048 is evidence for the null regardless of how well it's designed. (Also in this case it's a 3-way interaction...)
November 23, 2025 at 7:28 PM
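A rough sketch of the mechanism behind the quoted claim, under loud assumptions: a two-sample t-test, a specific H1 of d = 0.5, and n = 200 per group (very high power). Under H0 the p-value is uniform, so its density in any thin bin is 1; under a sufficiently well-powered specific H1 the density near p = 0.048 can fall well below 1, making such a p more probable under the null than under that alternative:

```python
# Illustrative sketch; the specific H1 (d = 0.5) and n = 200/group are
# assumptions, chosen so the test has very high (~99.9%) power.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, d, sims = 200, 0.5, 50000
x = rng.normal(0.0, 1.0, (sims, n))
y = rng.normal(d, 1.0, (sims, n))
p_h1 = stats.ttest_ind(x, y, axis=1).pvalue

lo, hi = 0.04, 0.05  # a thin bin around p = 0.048
dens_h1 = np.mean((p_h1 > lo) & (p_h1 < hi)) / (hi - lo)
dens_h0 = 1.0        # p is uniform under H0, so its density is 1 everywhere
print(dens_h1 / dens_h0)  # well below 1 here: the bin favors H0 over this H1
```

Note that with these settings the ratio dips below 1 only because power is very high; at more moderate power the density near 0.048 is still above 1, which is part of what the thread is debating.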
But when one levels this critique, that seems to entail specifying an H1 larger than the observed effect, right? (I’m thinking the observed effect implies the observed p-value.) So then there has to be some basis for doing that.
November 23, 2025 at 7:25 PM
In that statement, does H1 refer to an assumed true effect, with an associated power (conditional on n, study design, etc)?
Or does H1 refer to "true effect does not equal zero"?
If it's the second one, I'm learning something new and important, because that is not how I've thought of the paradox.
November 23, 2025 at 6:06 PM
Yes, as I wrote earlier in the thread, ideally you justify your threshold (but this is usually not done in a thought-through way). But the thing I'm stuck on here is interpreting p=0.048 as evidence *for* H = 0. I'm not sure I see how those two issues connect.
November 23, 2025 at 5:11 PM
Thanks! I'm trying to be humble as I'm just a PhD student, but honestly I find it a bit hard to reconcile "being interested in estimating a precise meta-analytic SMD" with "not reacting when one of the SMDs is impossible". (Maybe they outsourced the data entry to someone else and then didn't check.)
November 23, 2025 at 1:21 PM
Maybe I should again clarify that I do agree that the case you present here is suspicious, mostly since there are several p-values in that close-to-threshold area at once.
November 23, 2025 at 12:32 PM
I think it makes sense to ask what sort of effects we care about and whether the test is overpowered for the smallest effect of interest; in that case it makes sense to me to react to a "close to threshold" p-value. Also, your point about the p picking up on model violations is 👍👍 6/6
November 23, 2025 at 12:32 PM
that's different. The scenario @esolomon.bsky.social stipulated was "a well-motivated, well-controlled study". In that scenario I don't think one should circle the p-value for being too close to the threshold. That's removing its role as a threshold. Am I missing something here? 5/6
November 23, 2025 at 12:32 PM
The second thing is what power means. AFAIK "power" only exists for a given effect. The paradox happens when we look at a thin bin and the test has high power (e.g. the true effect is large, or n is large, or combos of these). But if we don't know the true effect and ask simply whether H ≠ 0, 4/6
November 23, 2025 at 12:32 PM
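To make the "power only exists for a given effect" point above concrete, a small sketch (illustrative n and effect sizes) computing the power of the same two-sided two-sample t-test under different assumed true effects:

```python
# A small sketch: power is a function of an assumed true effect, not a
# property of the test alone. n and the effect sizes are illustrative.
import numpy as np
from scipy import stats

def ttest_power(d, n, alpha=0.05):
    """Power of a two-sided two-sample t-test (n per group) when the
    true standardized effect is d."""
    df = 2 * n - 2
    tcrit = stats.t.ppf(1 - alpha / 2, df)
    ncp = d * np.sqrt(n / 2)  # noncentrality parameter
    return (1 - stats.nct.cdf(tcrit, df, ncp)
            + stats.nct.cdf(-tcrit, df, ncp))

for d in (0.1, 0.2, 0.5, 0.8):
    print(f"d = {d}: power = {ttest_power(d, n=100):.3f}")
# Same test, same n = 100/group; power runs from ~0.11 to ~1.0 with d.
```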
As far as I've understood, N-P frequentism is ideally about justifying a threshold (which, to be fair, is usually not done) and then treating that threshold as a threshold. 3/6
November 23, 2025 at 12:32 PM
I seem to have drawn different conclusions about the practical implications.

I think a few things are happening here that might be the junctures where our perspectives diverge.
The first one is how to categorize a p=0.048. Does it belong to the bin 0.04-0.05 or to the bin 0-0.05? 2/6
November 23, 2025 at 12:32 PM
Thank you for your reply!
I think this is a very conceptually important debate, so I hope it's ok if I'm being a bit persistent about it. I've read Daniel's writings on Lindley's paradox before and while I found it very interesting, 1/6
November 23, 2025 at 12:32 PM
To be clear, I'm not talking about this particular oxytocin claim, which I would wager is not a thing.
November 22, 2025 at 11:22 PM
Isn't this pushing it too far? With a good design, if the null happens to be true, you would see values below the threshold rarely. If there were an effect, they'd be more common. How does that turn into evidence for the null (as opposed to non-null, not as opposed to a specific alternative effect)?
November 22, 2025 at 11:21 PM