Constrained Policy Optimization
The need to limit an agent's actions often arises in reinforcement learning, e.g. for safety reasons when dealing with human-AI interaction.
At the seminar, we will discuss the recent paper "Constrained Policy Optimization", which adapts trust region optimization to constrained MDPs in order to guarantee that the constraints hold at every step of the training process.
Speaker: Ilya Kaysin.
Presentation language: Russian.
Date and Time: April 16th, 18:30-20:00.
Place: Times, room 204.
Videos from previous seminars are available at http://bit.ly/MLJBSeminars
12 November 2019: Meta reinforcement learning
5 November 2019: Using external memory in reinforcement learning
29 October 2019: Human-level performance in first-person multiplayer games with population-based deep reinforcement learning
22 October 2019: Modifying the reward function with potential functions
8 October 2019: Reinforcement learning for autonomous drones