Constrained Policy Optimization
The need to limit an agent's actions often arises in reinforcement learning, e.g. for safety reasons when dealing with human-AI interaction.
At the seminar, we will discuss a recent paper, "Constrained Policy Optimization", which adapts trust region optimization to constrained MDPs in order to guarantee constraint satisfaction at every step of the training process.
Speaker: Ilya Kaysin.
Presentation language: Russian.
Date and Time: April 16th, 18:30-20:00.
Place: Times, room 204.
Videos from previous seminars are available at http://bit.ly/MLJBSeminars
18 May 2020: The AI Economist
11 May 2020: Self-Tuning Deep Reinforcement Learning
27 April 2020: Sample Efficiency in RL
20 April 2020: Silly rules improve the capacity of agents to learn stable enforcement and compliance behaviors
13 April 2020: AlphaGo to MuZero. The computer's victory over humans in intellectual games.