Donald Szlosek
@dszlosek.bsky.social
860 followers 4.3K following 180 posts
Biostatistician @IDEXX formerly at harvardmed, @BIDMChealth, @nasa. Big data, clinical trials, and medical diagnostics. Mainer. Opinions are my own. he/him
dszlosek.bsky.social
Still one of my favorite blog posts: simplystatistics.org/posts/2019-0...

#datascience #StatsSky #statistics #Academics
dszlosek.bsky.social
I feel like plotting the histogram of the error, squared error, and absolute error is so much more informative than viewing the MSE or the RMSE.
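A minimal Python sketch of the point above, assuming two hypothetical sets of residuals (the numbers are made up for illustration and do not come from the post). Both sets share the same RMSE, yet simple text histograms of the error, squared error, and absolute error show very different shapes. The helper names `rmse` and `text_histogram` are illustrative, not from any library.

```python
import math
from collections import Counter


def rmse(errors):
    """Root mean squared error of a list of residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def text_histogram(values):
    """Return a text histogram with one row per distinct value."""
    counts = Counter(values)
    return "\n".join(f"{v:>5}: {'#' * counts[v]}" for v in sorted(counts))


# Hypothetical residuals (made up for illustration): same RMSE, different shapes.
errors_a = [1, -1, 1, -1]  # every prediction is off by one unit
errors_b = [0, 0, 0, 2]    # three perfect predictions and one large miss

# The three views mentioned in the post.
views = {
    "error": lambda e: e,
    "squared error": lambda e: e * e,
    "absolute error": abs,
}

for name, errs in [("A", errors_a), ("B", errors_b)]:
    print(f"model {name}: RMSE = {rmse(errs):.2f}")
    for label, transform in views.items():
        print(f"  {label}")
        print(text_histogram([transform(e) for e in errs]))
```

The single summary number cannot tell the two models apart, while the histograms separate "always slightly off" from "usually exact, occasionally far off".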
dszlosek.bsky.social
I completely agree! Actually, a few days ago there was a debate between Frank Harrell and Judea Pearl on study design issues in causal inference that was interesting. But part of the anti-causal community on there is....interesting
dszlosek.bsky.social
Yeah, that makes sense. Just because you "can" in this very simple scenario/use case doesn't mean you should/would. You lose generalizability and simplicity!
dszlosek.bsky.social
At the first #rstats conference I ever went to (Nor'EastR), I sat next to three friendly strangers who opened my eyes to the wonderful R community. Those people were @jdlong.cerebralmastication.com, Kirk Mettler, and David Smith! Shame to see Revolution Analytics down!
ivelasq3.bsky.social
Looks like the Revolution Analytics blog is down 😔 Used to be on my regular bookmarks rotation in the early 2010s. End of an #RStats era.

blog.revolutionanalytics.com
Typepad - Closed for Business
dszlosek.bsky.social
Thinking about joint probability regions (like 0 < y < x − 1) and how we compute their probabilities via double integrals.

In principle, Green’s theorem could turn that into a line integral around the boundary — so why don’t we ever do that? Loss of probabilistic meaning, or just unnecessary machinery?
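A worked sketch of the question above, under stated assumptions: f is a joint density that is smooth enough for Green's theorem, and the unbounded wedge 0 < y < x − 1 is truncated at x < R so that the region is bounded. The vector field (L, M) is one convenient choice, not the only one, and none of this notation comes from the post.

```latex
% Truncated region (assumption): D_R = \{ (x, y) : 0 < y < x - 1,\ x < R \},
% a triangle with vertices (1, 0), (R, 0), (R, R - 1), traversed counterclockwise.
\[
  \oint_{\partial D_R} L\,dx + M\,dy
  = \iint_{D_R} \left( \frac{\partial M}{\partial x}
      - \frac{\partial L}{\partial y} \right) dA,
  \qquad
  L = 0, \quad M(x, y) = \int_{-\infty}^{x} f(t, y)\,dt .
\]
% Since \partial M / \partial x = f, the right-hand side is P((X, Y) \in D_R).
% On the bottom edge dy = 0, so only the right edge x = R and the
% hypotenuse x = y + 1 contribute:
\[
  P\big((X, Y) \in D_R\big)
  = \int_{0}^{R-1} \big[ M(R, y) - M(y + 1, y) \big]\,dy
  = \int_{0}^{R-1} \int_{y+1}^{R} f(t, y)\,dt\,dy .
\]
```

Letting R grow recovers the probability of the full wedge. With this choice of field, building M already requires integrating the density in x, and the line integral collapses back into the usual iterated integral, which points toward "unnecessary machinery" rather than any loss of probabilistic meaning.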
Reposted by Donald Szlosek
statsepi.bsky.social
I read and write, I explore and I question, I design and script and analyse, I interpret and communicate. I do this to train my mind in the hopes of one day generating new knowledge. New knowledge that might even be useful, and that no algorithm can yet be trained on.
hormiga.bsky.social
Y'all. I just got ChatGPT to do everything in R for this manuscript. I mean EVERYTHING. And it's all legit and reproducible. I'm shook.

How are we mentoring our trainees in statistics now? Who needs to learn coding in R line by line, and who doesn't?

scienceforeveryone.science/statistics-i...
Statistics in the era of AI
How do we mentor, teach, and do stats when AI can do so much of the work?
scienceforeveryone.science
Reposted by Donald Szlosek
datavisfriendly.bsky.social
#rstats #dataviz
A "line-up" test has been proposed as a human significance test: Can an observer spot a difference that rejects a null hypothesis?

Here are glyphs for 20 penguins, representing the main variables with visual features.

There are THREE multivariate outliers here. CAN YOU FIND THEM?
Reposted by Donald Szlosek
rconsortium.bsky.social
Coming up next month, register now!

R+AI 2025 - Nov 12-13

Keynote: Joe Cheng, CTO @ Posit

Talk: “Keeping LLMs in Their Lane: Focused AI for Data Science and Research”

Register now!
rconsortium.github.io/RplusAI_webs...

#rstats #AI #DataScience
@posit.co @jcheng5.bsky.social
Reposted by Donald Szlosek
pwgtennant.bsky.social
Just because an LLM can produce a report with various figures & charts doesn't mean it is good at statistics.

Because good statistics is not about producing code.

It's about deep knowledge of study design & conduct. In my opinion, 95% of all data science problems come from poor questions & design.
Reposted by Donald Szlosek
tslumley.bsky.social
Oh look. X is strongly correlated with rank(X)
#AxesOfEvil
Reposted by Donald Szlosek
jessicahullman.bsky.social
Think of how much better off we'd be if every established researcher got in the habit of writing papers entitled "Second thoughts on [thing I'm famous for]"
dszlosek.bsky.social
Excellent piece by Miryam Naddaf discussing a surge in papers that likely use LLMs on open data. This is concerning since I work with some of these datasets (NHANES, CDC WONDER, BRFSS). www.nature.com/articles/d41...
Reposted by Donald Szlosek
statsepi.bsky.social
An 8 year-old blog post on causal thinking in epidemiology that I'm sharing for no particular reason (ICYMI).

darrendahly.github.io/post/2017-02...
Cause vs. Consequence |
Principal Statistician | Senior Lecturer
darrendahly.github.io
dszlosek.bsky.social
#academics #AcademicSky
dszlosek.bsky.social
Excellent advice on paper review:

1. Peer reviewers are volunteers.

2. Map all comments to actions.

3. Address all comments.

4. Focus on improving your paper, instead of arguing.

5. Rarely, and only with strong defense, say no.

6. Don’t take things personally.

7. Avoid recreational revisions.
dszlosek.bsky.social
S-Values are much more interpretable than P-values, yet adoption seems near impossible. I wonder what it would take to make the leap? #statssky #episky #rstats #statistics
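For readers who have not met S-values: the usual definition is the Shannon surprisal of the p-value, s = −log2(p), measured in bits. A minimal Python sketch, where the function name `s_value` is illustrative rather than taken from any package:

```python
import math


def s_value(p):
    """Shannon surprisal of a p-value in bits: s = -log2(p)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


# A few common p-values and the bits of information against the
# test hypothesis that each one carries.
for p in (0.5, 0.05, 0.005):
    print(f"p = {p}: s = {s_value(p):.2f} bits")
```

Read this way, p = 0.05 carries about 4.3 bits, roughly as surprising as seeing four heads in a row from a coin assumed to be fair.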
Reposted by Donald Szlosek
tylermw.com
"Man, I really wish RStudio respected hierarchy in code-folded section headers... I wonder how easy it would be to..."

(inner voice: DON'T DO IT! IT'S NOT WORTH IT! JUST GET BACK TO WORK! THE YAK IS BEST LEFT UNSHORN!)

"... I'm gonna do it."

#RStats #RStudio
Reposted by Donald Szlosek
georgiatomova.bsky.social
We should do a study on how much of the funded applied research suffers from problems that the unfunded methods research could have helped prevent or resolve
dszlosek.bsky.social
My personal favorite seed setup is set.seed(666) # \m/ rock on #rstats #databs Any others out there?
swampthingpaul.bsky.social
While digging through some code from a manuscript I recently read (yes, that rabbit hole), I came across this line, and I think I just found my new favorite set.seed(...) 🤣

set.seed(i+42) # Don’t Panic. “What is the meaning of life, the universe, and everything?”

#Rstats