Regardless of the task at hand, there are common safety constraints that we expect our agents to respect in any environment, so that they do not take actions that cause harm or damage. Specifying these constraints by hand is time-consuming and error-prone. In this paper, we instead propose learning them from demonstrations by experts who completed the tasks safely, extending inverse reinforcement learning (IRL) techniques to the space of constraints. Intuitively, we learn constraints that forbid highly rewarding behavior that the expert could have taken but chose not to. The constraint learning problem is ill-posed, however, and typically yields overly conservative constraints that forbid every behavior the expert did not take. To counteract this, we leverage the diverse demonstrations that naturally occur in multi-task settings, which lets us learn a tighter set of constraints. We validate the approach with simulation experiments on high-dimensional continuous control tasks.
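To make the intuition concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm, which targets high-dimensional continuous control. It assumes a tiny grid world, hand-written trajectories, and a hypothetical helper `infer_forbidden_states` that blames any state seen only on paths that beat the expert's reward. All names, rewards, and grid coordinates are illustrative assumptions.

```python
from typing import Dict, List, Set, Tuple

State = Tuple[int, int]
Trajectory = List[State]


def infer_forbidden_states(tasks: List[Dict]) -> Set[State]:
    """Toy heuristic: blame states seen only on better-than-expert paths."""
    expert_visited: Set[State] = set()
    suspicious: Set[State] = set()
    for task in tasks:
        expert_visited.update(task["expert"])
        for trajectory, reward in task["alternatives"]:
            # A path that beats the expert on reward yet was not taken is
            # evidence that some state along it is unsafe.
            if reward > task["expert_reward"]:
                suspicious.update(trajectory)
    # States the expert visited in any task are presumed safe.
    return suspicious - expert_visited


# Task A (assumed): reach (0, 3) from (0, 0). The expert detours through
# row 1 even though the straight path along row 0 earns more reward.
task_a = {
    "expert": [(0, 0), (1, 0), (1, 1), (1, 2), (1, 3), (0, 3)],
    "expert_reward": -5.0,
    "alternatives": [([(0, 0), (0, 1), (0, 2), (0, 3)], -3.0)],
}
# Task B (assumed): reach (0, 1) from (0, 0). The expert steps straight there.
task_b = {
    "expert": [(0, 0), (0, 1)],
    "expert_reward": -1.0,
    "alternatives": [],
}

single_task = infer_forbidden_states([task_a])
multi_task = infer_forbidden_states([task_a, task_b])
print(sorted(single_task))  # [(0, 1), (0, 2)]
print(sorted(multi_task))   # [(0, 2)]
```

With task A alone, the sketch forbids both (0, 1) and (0, 2), since the expert visited neither. Adding task B shows the expert safely visiting (0, 1), so only (0, 2) remains forbidden. This mirrors, in miniature, how multi-task demonstrations can tighten an overly conservative constraint.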
Acquiring Safety Constraints from Multi-task Demonstrations for Enhanced Learning (arXiv:2309.00711v1 [cs.LG])
by instadatahelp | Sep 6, 2023 | AI Blogs