AXRP - the AI X-risk Research Podcast

3 - Negotiable Reinforcement Learning with Andrew Critch

Dec 11, 2020

In this episode, I talk with Andrew Critch about negotiable reinforcement learning: what happens when two people (or organizations, or what have you) who have different beliefs and preferences jointly build some agent that will take actions in the real world. In the paper we discuss, it's proven that the only way to...

2 - Learning Human Biases with Rohin Shah

Dec 11, 2020

One approach to creating useful AI systems is to watch humans doing a task, infer what they're trying to do, and then try to do that well. The simplest way to infer what the humans are trying to do is to assume there's one goal that they share, and that they're optimally achieving the goal. This has the problem that...

1 - Adversarial Policies with Adam Gleave

Dec 11, 2020

In this episode, Adam Gleave and I talk about adversarial policies. Basically, in current reinforcement learning, people train agents that act in some kind of environment, sometimes an environment that contains other agents. For instance, you might train agents that play sumo with each other, with the objective...