Lightnews — Scholar-powered news

Donald Szlosek @dszlosek.bsky.social · 11h

Still one of my favorite blog posts: simplystatistics.org/posts/2019-0...

#datascience #StatsSky #statistics #Academics

Donald Szlosek @dszlosek.bsky.social · 14h

I feel like plotting histograms of the error, squared error, and absolute error is so much more informative than viewing the MSE or the RMSE.
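
A small sketch of the point above, in Python with only the standard library. The two residual vectors are made up for illustration (an assumption, not real data): they share the same RMSE, yet binning the raw, absolute, and squared errors shows two very different models. The helper names rmse and binned_counts are hypothetical.

```python
import math
from collections import Counter


def rmse(errors):
    """Root mean squared error of a list of residuals."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def binned_counts(values, bin_width=1.0):
    """Histogram as a dict: left bin edge -> count, for bins [edge, edge + bin_width)."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    return {k * bin_width: counts[k] for k in sorted(counts)}


# Made-up residuals (assumption, not real data): identical RMSE, different shape.
errors_a = [1.0, -1.0] * 8      # every prediction is off by exactly 1
errors_b = [0.0] * 15 + [4.0]   # 15 perfect predictions and one large miss

for name, errors in [("A", errors_a), ("B", errors_b)]:
    print(f"model {name}: RMSE = {rmse(errors):.2f}")
    print("  error         :", binned_counts(errors))
    print("  absolute error:", binned_counts([abs(e) for e in errors]))
    print("  squared error :", binned_counts([e * e for e in errors]))
```

Both models report an RMSE of 1, but only the binned counts reveal that model B is exact almost everywhere and badly wrong once.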

Donald Szlosek @dszlosek.bsky.social · 14h

I completely agree! A few days ago there was an interesting debate between Frank Harrell and Judea Pearl on study design issues in causal inference. But part of the anti-causal community on there is... interesting.

Donald Szlosek @dszlosek.bsky.social · 15h

Thank you!

Donald Szlosek @dszlosek.bsky.social · 1d

Yeah, that makes sense. Just because you "can" in this very simple scenario/use case doesn't mean you should/would. You lose generalizability and simplicity!

Donald Szlosek @dszlosek.bsky.social · 1d

At the first #rstats conference I ever went to (Nor'EastR), I sat next to three friendly strangers who opened my eyes to the wonderful R community. Those people were @jdlong.cerebralmastication.com, Kirk Mettler, and David Smith! Shame to see Revolution Analytics down!

Isabella Velásquez @ivelasq3.bsky.social · 1d

Looks like the Revolution Analytics blog is down 😔 It used to be in my regular bookmarks rotation in the early 2010s. End of an #RStats era.

blog.revolutionanalytics.com

Typepad - Closed for Business

Donald Szlosek @dszlosek.bsky.social · 3d

Thinking about joint probability regions (like 0<y<x−1) and how we compute their probabilities via double integrals.

In principle, Green’s theorem could turn that into a line integral around the boundary — so why don’t we ever do that? Loss of probabilistic meaning, or just unnecessary machinery?
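
One way to see the idea concretely: for a uniform density, Green's theorem with P = -y/2 and Q = x/2 turns the double integral into the shoelace formula over the boundary. The sketch below assumes (X, Y) is uniform on [0, 2] x [0, 1], which the post does not state, so the region 0 < y < x - 1 becomes the triangle with vertices (1, 0), (2, 0), (2, 1). The function name polygon_area is hypothetical.

```python
def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula.

    This is Green's theorem with P = -y/2, Q = x/2:
    area = 1/2 * (closed line integral of x dy - y dx).
    Vertices must be listed counter-clockwise for a positive result.
    """
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


# ASSUMPTION (not from the post): (X, Y) uniform on the rectangle [0, 2] x [0, 1].
support = [(0, 0), (2, 0), (2, 1), (0, 1)]
# The event 0 < y < x - 1, clipped to that support, is a triangle.
region = [(1, 0), (2, 0), (2, 1)]

# For a uniform density, probability = area of region / area of support.
prob = polygon_area(region) / polygon_area(support)
print(prob)
```

For a non-uniform density f, the boundary form needs a field with dQ/dx - dP/dy = f, for example Q(x, y) equal to the integral of f over its first argument up to x. One of the two integrations is still being done, which may be part of why the line-integral route is rarely used.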

Reposted by Donald Szlosek

Darren Dahly @statsepi.bsky.social · 3d

I read and write, I explore and I question, I design and script and analyse, I interpret and communicate. I do this to train my mind in the hopes of one day generating new knowledge. New knowledge that might even be useful, and that no algorithm can yet be trained on.

Terry McGlynn @hormiga.bsky.social · 7d

Y'all. I just got ChatGPT to do everything in R for this manuscript. I mean EVERYTHING. And it's all legit and reproducible. I'm shook.

How are we mentoring our trainees in statistics now? Who needs to learn coding in R line by line, and who doesn't?

scienceforeveryone.science/statistics-i...

Statistics in the era of AI

How do we mentor, teach, and do stats when AI can do so much of the work?

Reposted by Donald Szlosek

Michael Friendly @datavisfriendly.bsky.social · 3d

#rstats #dataviz
A "line-up" test has been proposed as a human significance test: Can an observer spot a difference that rejects a null hypothesis?

Here are glyphs for 20 penguins, representing the main variables with visual features.

There are THREE multivariate outliers here. CAN YOU FIND THEM?

Reposted by Donald Szlosek

R Consortium @rconsortium.bsky.social · 3d

Coming up next month, register now!

R+AI 2025 - Nov 12-13

Keynote: Joe Cheng, CTO @ Posit

Talk: “Keeping LLMs in Their Lane: Focused AI for Data Science and Research”

Register now!
rconsortium.github.io/RplusAI_webs...

#rstats #AI #DataScience
@posit.co @jcheng5.bsky.social

Reposted by Donald Szlosek

Peter Tennant @pwgtennant.bsky.social · 6d

Just because an LLM can produce a report with various figures & charts doesn't mean it is good at statistics.

Because good statistics is not about producing code.

It's about deep knowledge of study design & conduct. In my opinion, 95% of all data science problems come from poor questions & design.

Reposted by Donald Szlosek

Michael Gicheru @michaelgicheru.bsky.social · 5d

Got a 10 minute video for you. The dots really clicked for me after watching this.

youtu.be/oIMFZf5dUFA?...

Demystifying . . . (dots): R package dev fundamentals

YouTube video by Josiah Parry

Reposted by Donald Szlosek

Thomas Lumley @tslumley.bsky.social · 5d

Oh look. X is strongly correlated with rank(X)
#AxesOfEvil

Reposted by Donald Szlosek

econmaett @econmaett.github.io · 8d

@stephenjwild.bsky.social I want more silly #stats words

arxiv.org/abs/2503.22333

Donald Szlosek @dszlosek.bsky.social · 4d

This comment reminds me of an excellent post by @thosenerdygirls.bsky.social thosenerdygirls.substack.com/p/dont-get-f...

Reposted by Donald Szlosek

Jessica Hullman @jessicahullman.bsky.social · 5d

Think of how much better off we'd be if every established researcher got in the habit of writing papers entitled "Second thoughts on [thing I'm famous for]"

Donald Szlosek @dszlosek.bsky.social · 5d

Excellent piece by Miryam Naddaf discussing a surge in papers that likely use LLMs on open data. This is concerning since I work with some of these datasets (NHANES, CDC WONDER, BRFSS). www.nature.com/articles/d41...

Reposted by Donald Szlosek

Darren Dahly @statsepi.bsky.social · 8d

An 8 year-old blog post on causal thinking in epidemiology that I'm sharing for no particular reason (ICYMI).

darrendahly.github.io/post/2017-02...

Cause vs. Consequence |

Principal Statistician | Senior Lecturer

Donald Szlosek @dszlosek.bsky.social · 5d

#academics #AcademicSky

Donald Szlosek @dszlosek.bsky.social · 5d

Excellent advice on paper review:

1. Peer reviewers are volunteers.

2. Map all comments to actions.

3. Address all comments.

4. Focus on improving your paper, instead of arguing.

5. Rarely, and only with strong defense, say no.

6. Don’t take things personally.

7. Avoid recreational revisions.

Donald Szlosek @dszlosek.bsky.social · 5d

S-values are much more interpretable than P-values, yet adoption seems nearly impossible. I wonder what it would take to make the leap? #statssky #episky #rstats #statistics
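
For reference, the S-value (surprisal) is the negative base-2 log of the p-value, s = -log2(p), read as bits of information against the test hypothesis. A minimal Python sketch; the function name s_value is only for illustration.

```python
import math


def s_value(p):
    """Shannon surprisal of a p-value in bits: s = -log2(p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return -math.log2(p)


# A few conventional thresholds expressed as bits of surprise.
for p in (0.5, 0.25, 0.05, 0.005):
    print(f"p = {p:<5} -> S = {s_value(p):.2f} bits")
```

So p = 0.05 carries about 4.3 bits, roughly as surprising as seeing four heads in a row from a fair coin.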

Reposted by Donald Szlosek

Tyler Morgan-Wall @tylermw.com · 5d

"Man, I really wish RStudio respected hierarchy in code-folded section headers... I wonder how easy it would be to..."

(inner voice: DON'T DO IT! IT'S NOT WORTH IT! JUST GET BACK TO WORK! THE YAK IS BEST LEFT UNSHORN!)

"... I'm gonna do it."

#RStats #RStudio

Reposted by Donald Szlosek

Georgia Tomova @georgiatomova.bsky.social · 6d

We should do a study on how much of the funded applied research suffers from problems that the unfunded methods research could have helped prevent or resolve

Donald Szlosek @dszlosek.bsky.social · 6d

My personal favorite seed setup:

set.seed(666) # \m/ rock on

#rstats #databs

Any others out there?

Paul Julian @swampthingpaul.bsky.social · 6d

While digging through some code from a manuscript I recently read (yes, that rabbit hole), I came across this line, and I think I just found my new favorite set.seed(...) 🤣

set.seed(i+42) # Don’t Panic. “What is the meaning of life, the universe, and everything?”

#Rstats
