Jamie Cummins
jamiecummins.bsky.social
Currently a visiting researcher at Uni of Oxford. Normally at Uni of Bern.
Meta-scientist building tools to help other scientists. NLP, simulation, & LLMs.
Creator and developer of RegCheck (https://regcheck.app).
1/4 of @error.reviews.
🇮🇪
Pinned
Introducing RegCheck: a tool which uses Large Language Models to automatically compare preregistered protocols with their corresponding published papers and highlight deviations.

@malte.the100.ci @ianhussey.bsky.social @ruben.the100.ci @bjoernhommel.bsky.social

regcheck.app
RegCheck is an AI tool to compare preregistrations with papers instantly.
Reposted by Jamie Cummins
Congratulations to @simine.com for winning the Einstein Foundation Individual Award! 🎉

A well-deserved recognition for her seminal efforts to improve scientific rigor, which includes instituting detailed checks for errors and computational reproducibility at Psychological Science.
November 24, 2025 at 1:21 PM
Reposted by Jamie Cummins
🏆 Individual: @simine.com, psychologist at @unimelb.bsky.social & editor-in-chief of Psychological Science, is recognized for pioneering methodological rigor, reproducibility & collaborative research, driving initiatives such as @improvingpsych.org & the journal Collabra @ucpress.bsky.social. (2/5)
November 24, 2025 at 10:00 AM
Reposted by Jamie Cummins
Next: Jack Wilkinson @jdwilko.bsky.social with 'Problematic clinical trials and the threat to evidence synthesis'
Systematic reviews are considered the cornerstone of medicine. But some of the eligible trials may be problematic, and problematic trials can still end up included.
#IRICSydney
November 17, 2025 at 10:30 PM
Reposted by Jamie Cummins
I think this is an overly pessimistic take from the @bmj.com.

Sharing data does not inherently increase trust; rather, it enables verification, which allows for trust calibration.

This example is a win: serious issues were rapidly detected that would not have been detected without mandatory data sharing.
November 14, 2025 at 8:18 PM
With every LLM since GPT-4, I've tried a game: ask it to commit its 20 Questions guess to a cipher, play 20 Questions, and then check whether what it claims was its original choice is consistent with the cipher.

ChatGPT-5.1 Thinking is the first model to do this successfully!
November 14, 2025 at 3:44 PM
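The game above is essentially a commit-and-reveal protocol. A minimal sketch of the idea, assuming a SHA-256 hash commitment (the post does not specify which cipher the model used; the function names and the example words here are illustrative):

```python
import hashlib
import secrets

def commit(answer: str) -> tuple[str, str]:
    """Commit to an answer: publish the digest, keep the nonce secret."""
    nonce = secrets.token_hex(16)
    digest = hashlib.sha256((nonce + answer).encode()).hexdigest()
    return digest, nonce

def verify(digest: str, nonce: str, claimed_answer: str) -> bool:
    """Check whether a revealed answer matches the earlier commitment."""
    return hashlib.sha256((nonce + claimed_answer).encode()).hexdigest() == digest

# Before the game: the guesser commits to a secret word.
digest, nonce = commit("giraffe")

# After the game: anyone can check that the claimed word is the committed one.
assert verify(digest, nonce, "giraffe")
assert not verify(digest, nonce, "elephant")
```

The nonce prevents the verifier from simply brute-forcing candidate words against the digest before the reveal, which matters when the answer space (20 Questions targets) is small.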
Reposted by Jamie Cummins
Synchronous Robustness Reports could explore the implications of different analytical choices, but they could still suffer from bias. Hardwicke argues that preregistration is crucial to prevent such bias.

@tomhardwicke.bsky.social
Risk of bias in robustness reports: https://osf.io/wj26e
November 14, 2025 at 2:54 PM
Reposted by Jamie Cummins
Are methodological and causal inference errors creating a false impression that the gut microbiome causes autism? In this strong analysis, Mitchell, Dahly, and Bishop question the evidence.

They show that triangulation in science requires multiple robust lines of research.
The link between the gut #microbiome and autism is not backed by science, researchers say.

Read the full opinion piece in @cp-neuron.bsky.social: spkl.io/63322AbxpA

@wiringthebrain.bsky.social, @statsepi.bsky.social, & @deevybee.bsky.social
November 14, 2025 at 12:49 PM
Reposted by Jamie Cummins
Yes, like a Netflix documentary included IN EVERY SOCIAL PSYCHOLOGY TEXTBOOK
I think many people don't realise that "When Prophecy Fails" is not an experimental study but a work of history. It's like finding out that a popular Netflix documentary is fake: bad, but it does not change the science. Social psychology is not based on this book in any way.
There’s growing evidence that something was going seriously wrong in the classic early work on cognitive dissonance

Latest revelation: The story in When Prophecy Fails seems to have been fabricated in the most egregious way

But this is not the only one…

onlinelibrary.wiley.com/doi/abs/10.1...
November 13, 2025 at 4:11 PM
Reposted by Jamie Cummins
There is a lot of fuss today over whether chatbots can replace human participants in social sciences research when the solution is obvious: ask chatbots to simulate the views of social scientists and survey them on attitudes towards chatbots as substitutes for human subjects.
November 10, 2025 at 10:45 PM
Reposted by Jamie Cummins
Delighted to support MU Psych Soc's invited lecture on Forensic Metascience by departmental alum, Dr Jamie Cummins @jamiecummins.bsky.social whose work in this area seeks to enhance rigour & accuracy in scientific reporting.

Sincere thanks to Dr Cummins. #MUPsychologyAt25
November 7, 2025 at 12:36 PM
Reposted by Jamie Cummins
LLMs are now widely used in social science as stand-ins for humans—assuming they can produce realistic, human-like text

But... can they? We don’t actually know.

In our new study, we develop a Computational Turing Test.

And our findings are striking:
LLMs may be far less human-like than we think.🧵
Computational Turing Test Reveals Systematic Differences Between Human and AI Language
Large language models (LLMs) are increasingly used in the social sciences to simulate human behavior, based on the assumption that they can generate realistic, human-like text. Yet this assumption rem...
arxiv.org
November 7, 2025 at 11:13 AM
It was such an honour and privilege to be back at my alma mater 9 years (!!!) after finishing my undergraduate degree to give a talk as part of the psych department's 25-year anniversary!
November 7, 2025 at 10:58 AM
Reposted by Jamie Cummins
Lovely to welcome back Dr @jamiecummins.bsky.social for tonight's @mupsychology.bsky.social talk as part of our #MUpsychologyAt25 events @maynoothuniversity.ie
November 6, 2025 at 6:48 PM
My master's thesis file name on my old university's thesis archive site still makes me chuckle.
October 30, 2025 at 12:21 PM
Reposted by Jamie Cummins
This year Demis Hassabis predicted AI could cure all disease in a decade.

But other scientists, like Claus Wilke & Derek Lowe, say biology is far more complex and that progress will be limited by clinical trials & economics.

In a new 4hr podcast episode of *Hard Drugs*, we answer: Will AI solve medicine?
Will AI solve medicine?
spotify.link
October 29, 2025 at 2:11 PM
Reposted by Jamie Cummins
I built a DAG diagram with garden hoses for teaching.
Pictured: a collider bias diagram, inspired by a blocked pipe situation I experienced (which I credit with giving me the intuition though it also ruined my belongings in the flooded cellar).
October 28, 2025 at 5:50 PM
Reposted by Jamie Cummins
October 27, 2025 at 3:15 PM
Reposted by Jamie Cummins
The 2011 Presidential Debate where Sean Gallagher loses the election
part 1 #aras25
October 22, 2025 at 11:32 AM
Reposted by Jamie Cummins
Can AI simulations of human research participants advance cognitive science? In @cp-trendscognsci.bsky.social, @lmesseri.bsky.social & I analyze this vision. We show how “AI Surrogates” entrench practices that limit the generalizability of cognitive science while aspiring to do the opposite. 1/
AI Surrogates and illusions of generalizability in cognitive science
Recent advances in artificial intelligence (AI) have generated enthusiasm for using AI simulations of human research participants to generate new know…
www.sciencedirect.com
October 21, 2025 at 8:24 PM
Reposted by Jamie Cummins
New hobby:

Remaking article abstracts as movie trailers to expose hype and fearmongering.
October 20, 2025 at 10:22 AM
Reposted by Jamie Cummins
"Silicon samples" - using LLMs to generate fake survey responses instead of recruiting humans. Sounds efficient until you realize small model tweaks completely flip your results. Shortcuts in research usually aren't.
The threat of analytic flexibility in using large language models to simulate human data: A call to attention
Social scientists are now using large language models to create "silicon samples" - synthetic datasets intended to stand in for human respondents, aimed at revolutionising human subjects research.…
arxiv.org
October 9, 2025 at 1:08 PM
Reposted by Jamie Cummins
Psychologists running empirical studies to rediscover engineering design choices is such a strange genre of papers. By all means, run studies on LLM judgments -- but what else than lexical co-occurrence and statistical priors would they be based on??
Evidence that even when LLMs produce similar results to humans, they “rely on lexical associations and statistical priors rather than contextual reasoning or normative criteria. We term this divergence epistemia: the illusion of knowledge emerging when surface plausibility replaces verification”
PNAS
Proceedings of the National Academy of Sciences (PNAS), a peer reviewed journal of the National Academy of Sciences (NAS) - an authoritative source of high-impact, original research that broadly spans...
www.pnas.org
October 17, 2025 at 10:59 AM
Reposted by Jamie Cummins
New episode of Hard Drugs!

What if you could design a protein never seen in nature?

Scientists are using new AI tools like RFDiffusion, AlphaFold & ProteinMPNN to hallucinate novel proteins to solve problems nature hasn't.

@jacobtref.bsky.social & I talk about the art of protein design 🧑‍🎨
The art of protein design with AI
YouTube video by Works in Progress
www.youtube.com
October 15, 2025 at 3:08 PM
Reposted by Jamie Cummins
Major win for our field: finally a large, replicable effect.
Results of the replication are in!

Chocolate is more desirable than poop:

Cohen's d_rm = 6.20, 95%CI [5.63, 6.78]

N = 486, two single-item 1-7 Likert scales of desirability.

w/
@jamiecummins.bsky.social
Make an effect size prediction!

@jamiecummins.bsky.social and I are replicating Balcetis & Dunning's (2010) "chocolate is more desirable than poop" (Cohen's d = 4.52)

Let us know in the replies what effect size you think we'll find. Details of the study in the thread below.
October 15, 2025 at 11:29 AM