Josef Woldense
@woldense.bsky.social
josefwoldense.com
Just listened to an episode about this:

What Next: TBD | Tech, power, and the future: How Meta Profits Off Fraud

Episode webpage: slate.com/podcasts/wha...
November 16, 2025 at 5:01 PM
Aditya published a new article...🧐...let me see what new creative research design he's implementing... Never fails

Congrats!
November 8, 2025 at 10:07 PM
Giving coaches these crazy contracts, but then complaining about paying players....🤔
November 2, 2025 at 8:40 PM
Awesome! Makes me think about how many other distributions could be represented this way.

🤔...I guess the binomial could be shown with the Galton board by treating balls that land on either side of the mean as yes/no outcomes
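
A minimal Python sketch of that idea (illustrative only, not from the original post, and assuming a simple Bernoulli model of the board): each ball bounces right at every peg with probability 0.5, so the bin it lands in follows Binomial(n_pegs, 0.5), and counting balls that land right of center gives the single yes/no reading mentioned above.

```python
# Illustrative Galton board sketch: each ball hits n_pegs pegs and bounces
# right with probability 0.5, so the bin it lands in (its number of right
# bounces) follows Binomial(n_pegs, 0.5).
import random
from collections import Counter

def drop_ball(n_pegs: int = 10) -> int:
    """Return the bin index for one ball: its count of right bounces."""
    return sum(random.random() < 0.5 for _ in range(n_pegs))

random.seed(42)
bins = Counter(drop_ball() for _ in range(10_000))
for k in range(11):
    print(f"bin {k:2d}: {'#' * (bins[k] // 100)}")

# The yes/no reading from the post: a ball strictly right of the center
# bin counts as a "yes", so the yes-count across N balls is also binomial.
yes = sum(n for k, n in bins.items() if k > 5)
print(f"balls right of center: {yes} of 10000")
```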
October 30, 2025 at 12:10 PM
You just got them another customer. Time to purchase that sweet board.

Do you use it when teaching?
October 30, 2025 at 12:02 PM
An entire emotional arc captured in one moment. By far one of my favorite memes
October 24, 2025 at 8:41 PM
Just out of curiosity, what problems do you see with Strong Towns?
October 16, 2025 at 3:03 PM
Cool paper. I'm going to shamelessly plug my own work here, which also deals with LLMs for research

bsky.app/profile/wold...
Paper alert 📣

Rapid advances in AI have some believing that LLM agents can replace real participants in human-subject research. If true, this would be huge!

Following a growing body of research, we delve deeper into this topic and examine the merits of this claim.

🧵...

arxiv.org/abs/2509.03736
Are LLM Agents Behaviorally Coherent? Latent Profiles for Social Simulation
The impressive capabilities of Large Language Models (LLMs) have fueled the notion that synthetic agents can serve as substitutes for real participants in human-subject research. In an effort to evalu...
arxiv.org
October 8, 2025 at 11:22 AM
Congrats 🎉... looking forward to the coming research
September 18, 2025 at 11:54 AM
Congrats!
September 18, 2025 at 11:50 AM
Congrats!
September 14, 2025 at 12:45 AM
dongyeopkang.bsky.social
September 8, 2025 at 9:29 PM
There is more in the paper, but broadly speaking, our results identify a deceptive problem: surface-level plausibility masking deeper failure modes. Agents appear internally consistent while concealing systematic incoherence.

Be careful when using LLMs as human substitutes. They might fool you.
September 8, 2025 at 7:03 PM
Take pairs where one of the agents has a preference of 1. Next, take pairs where one of the agents has a preference of 5. Now compare them: pairs anchored at 1 show lower agreement scores than pairs anchored at 5. This pattern holds consistently across preference gaps.
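
A hypothetical Python sketch of that comparison; the (pref_a, pref_b, agreement) tuple layout and the random toy data are assumptions for illustration, not the paper's actual data or pipeline.

```python
# Sketch of the comparison above: group agent pairs by the preference level
# one agent holds, then compare mean agreement scores across anchors.
# The tuple layout and the toy data below are illustrative assumptions.
from statistics import mean
import random

random.seed(0)
# Toy stand-in data: each pair is (preference_a, preference_b, agreement in [0, 1]).
pairs = [(random.randint(1, 5), random.randint(1, 5), random.random())
         for _ in range(1_000)]

def mean_agreement(pairs, anchor_pref):
    """Mean agreement over pairs where at least one agent holds anchor_pref."""
    scores = [s for a, b, s in pairs if anchor_pref in (a, b)]
    return mean(scores)

print(f"anchor=1: {mean_agreement(pairs, 1):.3f}")
print(f"anchor=5: {mean_agreement(pairs, 5):.3f}")
```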
September 8, 2025 at 7:03 PM
Let me give you another one.

If we both equally dislike soda, our common ground should lead to high agreement. Not so with our agents.
September 8, 2025 at 7:03 PM