Great post! I highly recommend looking into the literature on assistance games, a specific proposal for getting AI to infer our intentions rather than optimize a prespecified reward. See e.g. https://arxiv.org/abs/1606.03137
I think this area will become very relevant very soon, not just from a safety perspective, but also for expanding the set of tasks that AI can take on. As you mention, reward design is not a great strategy.
Thanks, and thank you for the pointer. Looking into it.