Reposted by Mirco Musolesi
We replace the entropy bonus in PPO with a *complexity* bonus, encouraging structured and stochastic policies that are robust to different scaling factors and can work in environments with variable exploration needs.
Read more:
arxiv.org/abs/2509.20509
w/ @mircomusolesi.bsky.social
Reposted by Mirco Musolesi
Link to the paper: dl.acm.org/doi/pdf/10.1...
#KDD2025
Reposted by Mirco Musolesi
Information here: ucl.ac.uk/security-cri...
Closing date: 15 April 2025.
Reposted by Mirco Musolesi
Session details: Spot 151 in Session 2 (Sun, May 4).
Link to the paper: openreview.net/attachment?i...
Reposted by Mirco Musolesi
Time: 2pm ET
Room: Richard
Link to the paper: arxiv.org/abs/2504.12777
@aamasconf.bsky.social
Reposted by Mirco Musolesi
www.ucl.ac.uk/security-cri...
Closing date: 15 June 2025.