Adrian Olszewski
adrianolsz.bsky.social
Biostatistician and statistical R programmer in Clinical Trials ⦿ Having fun in a 100% R-based CRO ⦿ Frequentist framework ⦿ 𝘭𝘰𝘨𝘪𝘴𝘵𝘪𝘤 𝘳𝘦𝘨𝘳𝘦𝘴𝘴𝘪𝘰𝘯 𝘩𝘢𝘴 𝘣𝘦𝘦𝘯 𝘢 𝘳𝘦𝘨𝘳𝘦𝘴𝘴𝘪𝘰𝘯 𝘴𝘪𝘯𝘤𝘦 𝘪𝘵𝘴 𝘣𝘪𝘳𝘵𝘩
#stats Sometimes, at work, I feel like I'm the only person who cares about the success of the study...
March 1, 2025 at 4:09 PM
That's what it sometimes looks like when #stats (here: in clinical #biostatistics) meets #machinelearning with a little 𝘪𝘮𝘱𝘦𝘥𝘢𝘯𝘤𝘦 𝘮𝘪𝘴𝘮𝘢𝘵𝘤𝘩 on a topic common to both fields 🙃 (happened to me twice so far).
January 29, 2025 at 2:34 AM
- Adrian, logistic regression is NOT a regression, it's a classifier!
- O'rly? I've never used it for classification while using it daily at work to test hypotheses and explore treatment effects😬 I bet #stats guys who invented it didn't know this... revelation too 🙄
#ml #machinelearning #datascience
December 30, 2024 at 2:04 PM
That moment when you feel the nonsense of 𝙋𝙊𝙎𝙏-𝙃𝙊𝘾 𝙋𝙊𝙒𝙀𝙍, and yet the same concept is essential for 👉 𝙙𝙚𝙨𝙞𝙜𝙣𝙞𝙣𝙜 𝙨𝙩𝙪𝙙𝙞𝙚𝙨, namely for calculating the sample size:
1) to determine the safety range for dropouts
2) when there's no closed-form formula (like for the Fisher exact test)
#statistics #stats #rstats
December 18, 2024 at 4:51 PM
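A minimal sketch of the second case, power by simulation for the Fisher exact test, using base R only. The helper name `sim_power_fisher`, the response rates, the arm sizes and the 15% dropout rate are all hypothetical choices for illustration, not values from any real study.

```r
# Simulation-based power for Fisher's exact test (all inputs are hypothetical).
sim_power_fisher <- function(n_per_arm, p_ctrl, p_trt,
                             alpha = 0.05, n_sim = 2000) {
  rejections <- replicate(n_sim, {
    ctrl <- rbinom(1, n_per_arm, p_ctrl)  # responders in the control arm
    trt  <- rbinom(1, n_per_arm, p_trt)   # responders in the treatment arm
    tab  <- matrix(c(trt, n_per_arm - trt,
                     ctrl, n_per_arm - ctrl), nrow = 2)
    fisher.test(tab)$p.value < alpha
  })
  mean(rejections)  # share of simulated trials that reject H0
}

set.seed(2024)
power_small <- sim_power_fisher(n_per_arm = 20, p_ctrl = 0.3, p_trt = 0.6)
power_large <- sim_power_fisher(n_per_arm = 60, p_ctrl = 0.3, p_trt = 0.6)

# Dropouts: rerun with the arm size left after an assumed 15% loss
power_after_dropout <- sim_power_fisher(n_per_arm = floor(60 * 0.85),
                                        p_ctrl = 0.3, p_trt = 0.6)
```

Repeating the last call over a grid of dropout rates gives a rough "safety range": the largest loss for which the simulated power stays above the target.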
#statistics #stats Reading QQ plots is not easy at first. I can relate! But don't give up just because it's hard. Everything is hard until you know it and suddenly it becomes your daily tool. Practising and mastering QQ plots (and CDF, BTW) isn't a waste of time! Below I show a few examples.
December 14, 2024 at 12:35 PM
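For practice, a small base R sketch along these lines may help: it draws one right-skewed and one normal sample (both simulated, purely illustrative) and extracts the QQ coordinates that `qqnorm()` would plot. The correlation of those coordinates is only a crude numeric companion to the picture, not a replacement for looking at it.

```r
set.seed(1)
skewed <- rexp(200)    # right-skewed toy sample
normal <- rnorm(200)   # reference sample that really is normal

# Theoretical (x) vs sample (y) quantiles, without drawing anything
qq_skewed <- qqnorm(skewed, plot.it = FALSE)
qq_normal <- qqnorm(normal, plot.it = FALSE)

# Points close to a straight line give a correlation near 1
cor_skewed <- cor(qq_skewed$x, qq_skewed$y)
cor_normal <- cor(qq_normal$x, qq_normal$y)

# To see the plots themselves:
# qqnorm(skewed); qqline(skewed)   # curved, convex pattern
# qqnorm(normal); qqline(normal)   # roughly straight
```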
#rstats #ggplot2 I must say I really *like* the gghalves package and the idea of the #raincloud plots 🥰
In my toolkit (#clinicaltrials) for 8 years. Below - various examples (from toy experiments and real reports). Luckily for me, my collaborators love them too (and even demand them!).
December 10, 2024 at 1:31 PM
Anscombe's quartet, Datasaurus dozen... I wanted to make my own set with the same (basic) descriptive summaries. Let me introduce Olszewski's quartet 😎
Yeah, I could easily make both kurtosis and skewness equal too, but I like the asymmetry of the nose and eyes 😂
github.com/adrianolszew...
#rstats
December 10, 2024 at 3:24 AM
For ~15yrs, on a ~weekly basis at work, you've been using logistic regression for exploring main & simple effects + interactions & testing hypotheses, and suddenly a #ML guru tells you that "nope, logistic regression is not a regression, it gives 0/1😎". And pufff! - you disappear like in Disney's💥🤦‍♂️
December 5, 2024 at 7:34 AM
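A minimal sketch of that everyday use, in base R on simulated data. The variable names, sample size and coefficients are made up for illustration. The fitted values are probabilities, and the model output is estimates and tests; a 0/1 label only appears if someone adds a cutoff afterwards.

```r
set.seed(42)
n     <- 400
treat <- rbinom(n, 1, 0.5)             # hypothetical treatment indicator
age   <- rnorm(n, mean = 60, sd = 10)  # hypothetical covariate

# True model on the log-odds scale (coefficients chosen arbitrarily)
prob <- plogis(-0.5 + 1.0 * treat + 0.03 * (age - 60))
y    <- rbinom(n, 1, prob)

fit     <- glm(y ~ treat * age, family = binomial)
reduced <- update(fit, . ~ treat + age)

coefs <- summary(fit)$coefficients         # estimates, SEs, Wald z tests
lrt   <- anova(reduced, fit, test = "Chisq")  # LR test of the interaction
pred  <- predict(fit, type = "response")   # fitted probabilities, not labels
```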
PS: I thought you might find this orange book interesting. It was a big eye-opener to me years ago. The cover in the figure is from my LinkedIn post, where I have a series on rank-based methods, like MWW, KW (both special cases of ordinal logistic regression), Brunner-Munzel, ATS/WTS, ART.
December 5, 2024 at 1:40 AM
Testing for stochastic superiority means that they will be sensitive to anything that "induces" the superiority. Effectively it translates also to dispersions (only in some patterns, as you suggested) and shapes. But far too many people (ab)use them for comparing medians naively (only holds for IID)
December 5, 2024 at 12:40 AM
I'm totally (and happily!) surprised. I thought I was the only one in the whole Universe who's been doing
𝗱𝗮𝘁𝗮 %>%
𝗰𝗵𝗮𝗶𝗻_𝗼𝗳_𝗰𝗮𝗹𝗹𝘀() %>%
𝗺𝗼𝗿𝗲_𝗰𝗮𝗹𝗹𝘀() -> 𝗿𝗲𝘀𝘂𝗹𝘁
for over a decade. Yet most of the time, reactions of those who saw it varied between "you weirdo" and 🤮. Thanks for making my day!
December 4, 2024 at 3:12 PM
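A runnable version of the pattern above, as a sketch. Base R's native pipe `|>` (R >= 4.1) stands in for `%>%` here only so that no package is needed; the right-assignment arrow behaves the same way with either pipe, because `->` binds more loosely than the pipe and therefore captures the whole chain.

```r
# The pipeline reads left to right, and so does the assignment at the end.
c(3, 1, 2) |>
  sort() |>
  rev() -> result

result
```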
Worth remembering: rank-based tests like Kruskal-Wallis, Mann-Whitney (-Wilcoxon), ATS/WTS - do NOT compare medians unless IID holds (=same shape & dispersion). Otherwise, stochastic superiority is assessed. Use quantile regression (+ Wald's testing) or Brown-Mood instead.
#statistics #datascience
December 4, 2024 at 10:21 AM
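A small base R illustration on simulated data (the distributions are picked for convenience, not taken from any study): two samples whose true medians are both 0 but whose shapes mirror each other. In theory P(X < Y) is about 0.40 for this pair, so Mann-Whitney (-Wilcoxon) should flag a difference even though the medians agree.

```r
set.seed(7)
n <- 2000
x <- rexp(n) - log(2)   # right-skewed, true median 0
y <- log(2) - rexp(n)   # mirror image: left-skewed, true median 0

sample_medians <- c(x = median(x), y = median(y))  # both close to 0

mww <- wilcox.test(x, y)  # Mann-Whitney (-Wilcoxon) test

# Estimated P(X < Y); 0.5 would mean no stochastic superiority
p_x_less_y <- mean(outer(x, y, "<"))
```

A small p-value in `mww` here reflects the stochastic superiority, not a difference in medians.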
Me too, since I found this quote. PS: but I'm guilty of another crime - for more than a decade I've been using the right arrow with pipes 😅 (data %>% chain_of_verbs %>% more_verbs -> result).
December 4, 2024 at 5:18 AM
A very important remark! This is essentially the idea of combining statistical discernibility and practical relevance, like in non-inferiority, equivalence and superiority (aka "minimum meaningful magnitude") studies (common in #clinicaltrials)
December 4, 2024 at 3:38 AM
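A rough sketch of the non-inferiority case with base R's `t.test()`, on simulated data. The margin of 1, the sample sizes and the assumption that higher values are better are all hypothetical. The point is that the null hypothesis is shifted to the margin, so a "significant" result already carries the practical relevance.

```r
set.seed(11)
margin    <- 1                                # hypothetical NI margin
reference <- rnorm(200, mean = 10, sd = 2)    # simulated reference arm
new_trt   <- rnorm(200, mean = 10, sd = 2)    # truly equal to reference

# H0: mean(new) - mean(reference) <= -margin
# H1: mean(new) - mean(reference) >  -margin
ni <- t.test(new_trt, reference, mu = -margin, alternative = "greater")

lower_bound  <- ni$conf.int[1]        # one-sided 95% lower confidence limit
non_inferior <- lower_bound > -margin # same decision as ni$p.value < 0.05
```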
This package is so underrated, while exposing a rich collection of awesome "hacks"! For example, I can control "breaks" for Y axis in each panel independently. Multiple colour scales, point-paths, advanced guides + axis features (eg nesting), stat_theodensity() working with faceting and many more!
December 4, 2024 at 3:24 AM