we could use AI to comprehensively upgrade the alignment of the proxy metrics we're optimizing?
aligned AIs as the meta-optimizers?
but it also makes it easier to define better proxies
it's worse the more powerful your optimizer is
prob every metric is its own little instance of the alignment problem, it's just that most aren't being optimized hard enough to be problematic
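a rough toy sketch of the "worse the more powerful your optimizer is" point, under assumptions that aren't from the chat: the proxy is the true value plus independent Gaussian noise, and "optimizer power" is just n, the number of candidates you pick the best-looking one from. the function name `best_of_n_gap` and all the parameters are made up for illustration.

```python
import random


def best_of_n_gap(n, trials=500, seed=0):
    """Average (proxy - true value) for the candidate that wins on the proxy.

    Assumed toy model: proxy = true_value + noise, both standard normal.
    n stands in for optimization power (how many candidates we search over).
    """
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(trials):
        best_proxy, best_true = float("-inf"), 0.0
        for _ in range(n):
            true_value = rng.gauss(0.0, 1.0)  # what we actually care about
            noise = rng.gauss(0.0, 1.0)       # error in the proxy measurement
            proxy = true_value + noise
            if proxy > best_proxy:
                # The optimizer only ever sees the proxy score.
                best_proxy, best_true = proxy, true_value
        total_gap += best_proxy - best_true
    return total_gap / trials


if __name__ == "__main__":
    for n in (1, 4, 16, 64, 256):
        print(f"n={n:4d}  avg proxy overestimate = {best_of_n_gap(n):.2f}")
```

in this toy setup the winner's proxy score overstates its true value by more as n grows, since harder selection also selects harder for favorable noise. with n=1 nothing is being optimized and the gap averages out to roughly zero, which matches the point that most metrics aren't pushed hard enough for it to matter.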