Lightnews — Scholar-powered news

Todd Johnson

@johnsontoddr.bsky.social

When papers are published, even in major medical journals, they are almost never published with the code used to retrieve or analyze the data, and you can rarely even get the data. I know from reviewing student work that it can take months to vet an ad hoc pipeline.

November 10, 2025 at 1:52 PM

Todd Johnson

@johnsontoddr.bsky.social

I am struggling to assemble a systematic pipeline for observational studies using routinely collected clinical data. The data we start with is a mess and I can only find general suggestions from data science to begin to vet and clean the data. Cleaning decisions affect later analysis and results.

November 10, 2025 at 1:51 PM

Todd Johnson

@johnsontoddr.bsky.social

Yikes. That's bad.

November 8, 2025 at 2:38 PM

Todd Johnson

@johnsontoddr.bsky.social

www.youtube.com/watch?v=gk9-...

The McCormick's Don't Have A Can Opener

YouTube video by Yours Daily

www.youtube.com

November 8, 2025 at 2:30 PM

Todd Johnson

@johnsontoddr.bsky.social

My parents are on Social Security and get food from a local food bank. And yes, they voted for this. They have been going further and further down the Fox News rabbit hole for the past 20 years.

November 8, 2025 at 2:28 PM

Todd Johnson

@johnsontoddr.bsky.social

Evidence that 4-year-olds are better than Fox News viewers ;-)

November 6, 2025 at 5:37 AM

Todd Johnson

@johnsontoddr.bsky.social

We have to validate all of the output, but that is easier than filling in the tables ourselves, since most of the extraction was correct.

October 27, 2025 at 9:50 PM

Todd Johnson

@johnsontoddr.bsky.social

We gave it a blank table with column and row names and a paper or link to the registry and had it fill the table in. In one case, the paper was a simulation, not an RCT/TTE so Claude changed the table to fit a simulation study and extracted related elements.

October 27, 2025 at 9:50 PM
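
A minimal sketch of the blank-table setup and the follow-up validation described in these posts, offered only as an illustration. The row and column names, the `blank_table` and `cells_to_review` helpers, and the idea of comparing against a human-verified copy are all assumptions; the posts do not say how the tables were built or checked.

```python
from typing import Dict, List, Tuple

# A table is a mapping: row name -> {column name -> cell text}.
Table = Dict[str, Dict[str, str]]


def blank_table(rows: List[str], cols: List[str]) -> Table:
    """Build an empty table with one empty cell per (row, column) pair."""
    return {r: {c: "" for c in cols} for r in rows}


def cells_to_review(filled: Table, verified: Table) -> List[Tuple[str, str]]:
    """Return (row, column) pairs that are empty or differ from the verified value."""
    flagged = []
    for row, cells in verified.items():
        for col, expected in cells.items():
            got = filled.get(row, {}).get(col, "").strip()
            if not got or got != expected.strip():
                flagged.append((row, col))
    return flagged


# Hypothetical row and column names, chosen only for illustration.
rows = ["Eligibility criteria", "Start of follow-up"]
cols = ["RCT", "TTE"]
template = blank_table(rows, cols)

# Hypothetical model output and human-verified values for the same table.
filled = {
    "Eligibility criteria": {"RCT": "Adults 18+", "TTE": "Adults 18+"},
    "Start of follow-up": {"RCT": "Randomization", "TTE": ""},
}
verified = {
    "Eligibility criteria": {"RCT": "Adults 18+", "TTE": "Adults 18+"},
    "Start of follow-up": {"RCT": "Randomization", "TTE": "First prescription"},
}
flagged = cells_to_review(filled, verified)
```

Here `flagged` holds only the cells a reviewer still needs to look at, which mirrors the point that checking mostly correct output is less work than filling in every cell by hand.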

Todd Johnson

@johnsontoddr.bsky.social

A PhD student and I are currently finishing up an analysis in which we had Claude 3.7 extract elements of RCTs and matching TTEs from papers and the clinical trial registry. Claude was nearly perfect except for some edge conditions and confusion over dates. It was zero-shot learning.

October 27, 2025 at 9:48 PM

Todd Johnson

@johnsontoddr.bsky.social

Until Claude 3.5 (12/24 for me) I saw similar results. LLMs were largely useless. In many cases Claude is still better than GPT, even the more expensive models. Copilot was very stripped down. I would not have expected much from it.

October 27, 2025 at 9:46 PM

Todd Johnson

@johnsontoddr.bsky.social

I found the same thing a year ago. That changed with Claude 3.5. 4.5 is even better. What kinds of questions are you asking? It might be domain-specific.

October 27, 2025 at 1:46 PM

Todd Johnson

@johnsontoddr.bsky.social

I have been learning about causal inference and doing informatics work in it for several years. The chat above came after all of that and then hours reading primary sources. Note the use of critical thinking (on my part) throughout the chat.

October 27, 2025 at 4:59 AM

Todd Johnson

@johnsontoddr.bsky.social

The point of showing these chats and LLM output is to demonstrate how LLMs can help humans better understand complex topics and also encourage critical thinking. There are reasons to be concerned about AI, but they don't have to destroy education or critical thinking.

October 27, 2025 at 4:57 AM

Todd Johnson

@johnsontoddr.bsky.social

It is going to take quite some time and reading of other primary sources to turn this into a full-blown teaching tutorial. But the LLM really helped me understand issues beyond what I read from authoritative sources.

October 27, 2025 at 4:54 AM
