@prakhargodara.bsky.social
Physics PhD, now exploring questions involving learning and decision-making. Postdoc at NYU. Curious and open to chats.
Maybe my personal message didn't reach you.
With a quick numerical test, I recover confirmation bias in both halves when fitting to Bayes-optimal behavior. I therefore do not think the schematic is accurate.
I can share my code.
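Since the thread can't attach files, here is a minimal sketch of the kind of test I mean (my own simplification, not the actual code behind the result): simulate a Bayes-optimal learner on a two-armed Bernoulli bandit, then fit a constant-rate asymmetric Q-learning model separately to each half of the trials. The task parameters, the soft-greedy choice rule, and the fitting details are all my assumptions.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
P = np.array([0.7, 0.3])   # reward probabilities (my choice, not the actual task)
T = 400

# Bayes-optimal learner: Beta(1,1) posteriors, soft-greedy on posterior means
a, b = np.ones(2), np.ones(2)
choices = np.empty(T, dtype=int)
rewards = np.empty(T)
for t in range(T):
    m = a / (a + b)
    p0 = 1.0 / (1.0 + np.exp(-5.0 * (m[0] - m[1])))  # softness only for choice variability
    c = 0 if rng.random() < p0 else 1
    r = float(rng.random() < P[c])
    a[c] += r
    b[c] += 1.0 - r
    choices[t], rewards[t] = c, r

# Asymmetric Q-learning: alpha+ for positive prediction errors, alpha- for negative
def nll(params, ch, rw):
    ap, an, beta = params
    Q = np.full(2, 0.5)
    ll = 0.0
    for c, r in zip(ch, rw):
        p = 1.0 / (1.0 + np.exp(-beta * (Q[c] - Q[1 - c])))
        ll += np.log(max(p, 1e-12))
        pe = r - Q[c]
        Q[c] += (ap if pe > 0 else an) * pe
    return -ll

def fit(ch, rw):
    best = None
    for x0 in rng.uniform([0.05, 0.05, 1.0], [0.95, 0.95, 10.0], size=(5, 3)):
        res = minimize(nll, x0, args=(ch, rw), method="L-BFGS-B",
                       bounds=[(1e-3, 1.0), (1e-3, 1.0), (1e-2, 50.0)])
        if best is None or res.fun < best.fun:
            best = res
    return best.x

fits = {}
for name, sl in [("first half", slice(0, T // 2)), ("second half", slice(T // 2, T))]:
    ap, an, _ = fit(choices[sl], rewards[sl])
    fits[name] = (ap, an)
    print(f"{name}: alpha+ = {ap:.2f}, alpha- = {an:.2f}")
```

Whether an asymmetry appears, and in which direction, depends on the task settings; the point is only that the fitted constant rates need not be symmetric even though the generating process has no bias.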
October 20, 2025 at 5:59 PM
I am not sure this eliminates the possibility of temporal variation in learning rates.
Are you saying that had the learning rates been decaying with time, we would not have observed this effect?
October 20, 2025 at 3:27 PM
Generally, I am sympathetic to the idea that humans are biased. My only concern is that the arguments, as they stand, are not in their most robust form. I've responded below to highlight some of my concerns.
October 20, 2025 at 3:19 PM
From what I recall of our chat at RLDM, the confidence estimates were self-reported. Correct?
If so, it is unclear to me what dynamical features self-reporting introduces. I think it would require a detailed analysis.
October 20, 2025 at 3:16 PM
It's not true that, normatively, learning rates should not decay in volatile tasks (see Eq. 10 in [1]).
Also, what's normative depends on the assumed model class (e.g. change-points vs. random walk). In change-point models, rates spike at change-points and decay otherwise.
[1] papers.nips.cc/paper_files/...
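A quick way to see the change-point behavior (a toy discretized Bayesian observer of my own construction, not the model in [1]): track a Bernoulli reward probability that resets with hazard rate h, and read off the effective learning rate as the posterior-mean update divided by the prediction error.

```python
import numpy as np

rng = np.random.default_rng(1)
h = 0.02                               # hazard rate (assumed)
grid = np.linspace(0.01, 0.99, 99)     # discretized reward probability
post = np.full(grid.size, 1.0 / grid.size)

T = 200
p_true = np.where(np.arange(T) < T // 2, 0.8, 0.2)  # one change-point half-way

alphas = np.empty(T)
for t in range(T):
    r = float(rng.random() < p_true[t])
    m = post @ grid                           # prediction before the outcome
    prior = (1.0 - h) * post + h / grid.size  # change-point mixing
    post = prior * (grid if r else 1.0 - grid)
    post /= post.sum()
    alphas[t] = (post @ grid - m) / (r - m)   # effective learning rate

spike = alphas[T // 2 : T // 2 + 10].mean()
stable = alphas[T // 2 - 10 : T // 2].mean()
print(f"mean effective LR just after the change: {spike:.2f}")
print(f"mean effective LR late in the stable block: {stable:.2f}")
```

With these (assumed) settings, the effective rate jumps after the change and shrinks during stable stretches, which is the spike-and-decay profile I mean.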
Demystifying excessively volatile human learning: A Bayesian persistent prior and a neural approximation
October 20, 2025 at 3:12 PM
Technical remark:
In this study I use master equations (commonly used in statistical physics) to derive analytical expressions for key observables. This approach could be very useful for studying the learning dynamics of RL algorithms without having to run costly simulations.
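As a minimal flavor of what I mean (a toy of my own, far simpler than the analysis in the paper): for fixed-rate Q-learning on a single Bernoulli arm, the same bookkeeping that yields a master equation for the Q-value distribution gives exact moment recursions, which you can check against Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, p, Q0, T = 0.1, 0.6, 0.0, 300   # toy parameters (my choice)

# Update rule: Q' = (1 - alpha) * Q + alpha * r, with r ~ Bernoulli(p).
# Taking expectations of the update gives exact moment recursions.
mean = p + (Q0 - p) * (1.0 - alpha) ** np.arange(T + 1)
var = np.zeros(T + 1)
for t in range(T):
    var[t + 1] = (1.0 - alpha) ** 2 * var[t] + alpha ** 2 * p * (1.0 - p)

# Monte Carlo check of the same observables
N = 20000
Q = np.full(N, Q0)
for t in range(T):
    r = (rng.random(N) < p).astype(float)
    Q += alpha * (r - Q)

print(f"analytic:    mean = {mean[-1]:.3f}, var = {var[-1]:.4f}")
print(f"Monte Carlo: mean = {Q.mean():.3f}, var = {Q.var():.4f}")
```

The analytic recursions cost a few hundred arithmetic operations; the Monte Carlo check needs millions of samples to match them, which is the cost saving I'm pointing at.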
October 18, 2025 at 9:21 PM
Conclusions: We need a more robust methodology to estimate the temporal variations in learning rates (I provide a suggestion). Without modelling the temporal dynamics of the learning rates, making claims about bias would be problematic.
Full paper: www.pnas.org/doi/10.1073/...
Apparent learning biases emerge from optimal inference: Insights from master equation analysis | PNAS
October 18, 2025 at 9:21 PM
The culprit? A fundamental model misspecification. Optimal learning has decreasing rates (for vanilla bandit tasks); vanilla Q-learning assumes fixed ones. Decreasing rates tend to (though not always) decrease action switching. The only way to get that with constant rates is through a bias.
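A toy illustration of the switching point (my own construction: full feedback on both arms for simplicity, and alpha = 0.2 is arbitrary): with a decaying 1/t rate, the estimate of which arm is best stops reversing; with a constant rate, the estimates keep fluctuating and reversals persist.

```python
import numpy as np

rng = np.random.default_rng(3)
P, T = np.array([0.6, 0.4]), 1000   # assumed task

def reversal_rates(decay):
    """Count reversals of which arm currently looks best (full feedback)."""
    Q = np.full(2, 0.5)
    best = np.empty(T, dtype=int)
    for t in range(T):
        r = (rng.random(2) < P).astype(float)
        a = 1.0 / (t + 2) if decay else 0.2
        Q += a * (r - Q)
        best[t] = int(np.argmax(Q))
    flips = np.abs(np.diff(best))
    return flips[: T // 2].mean(), flips[T // 2 :].mean()

results = {}
for name, decay in [("decaying 1/t", True), ("fixed 0.2", False)]:
    early, late = reversal_rates(decay)
    results[name] = (early, late)
    print(f"{name}: reversal rate early = {early:.3f}, late = {late:.3f}")
```

The decaying-rate agent's estimates converge, so its reversals die out; the fixed-rate agent's estimates have persistent variance, so reversals continue. A constant-rate model can only reproduce the dying-out pattern by tilting its updates, i.e. a bias.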
October 18, 2025 at 9:21 PM
"Asymmetry is apparent, only if we assume people are Bayesian"
This is not quite accurate. In the paper I show that there is a large class of temporal profiles of the learning rate, none of them Bayes-optimal, that can produce the appearance of bias.
October 14, 2025 at 3:43 PM