Chair objectives
Deep reinforcement learning (RL) – learning optimal behaviors from interaction data, using deep neural networks – is often seen as one of the next frontiers in artificial intelligence.
While current RL algorithms do not escape the relentless pursuit of larger models, bigger data, and greater computational demands, we posit that the real-world impact of RL will also stem from algorithms that remain relevant in the small-data regime, on reasonable computing architectures.
RL is at a crossroads where one wishes to retain the versatility and representational abilities of deep neural networks, while coping with limited data and resources.
Under these conditions, it is essential to understand how to preserve algorithmic convergence properties, robustness to uncertainties, worst-case guarantees, transferable features, and elements of behavioral explanation.
This is why we attempt to put RL “on a diet”, in order to better understand frugal RL: its theoretical foundations, the many ways of compensating for limited data, the sound algorithms one can design, and the practical impact it can have on the many real-world applications where data is intrinsically costly and resources are limited, from autonomous robotics to personalized medicine.
Inductive biases as regularizers in Markov decision processes (a code sketch follows this list), including:
- Explicit regularizers from expert or learned invariances
- Linking auxiliary losses, generalization properties and data frugality
- Strong implicit regularizers for image-based control
- Function graph adaptation as a means to achieve sample efficiency
- Inductive biases from human demonstrations
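As a concrete illustration of the regularizer items above, here is a minimal PyTorch sketch; `QNet` and `loss_fn` are hypothetical names of ours, not code from the chair. It augments a standard TD loss with an auxiliary invariance term that pushes the latent representation of an observation towards that of a transformed version, encoding an expert-specified or learned invariance as an explicit regularizer.

```python
# Minimal sketch (PyTorch); QNet and loss_fn are illustrative, hypothetical names.
import torch
import torch.nn as nn

class QNet(nn.Module):
    def __init__(self, obs_dim=8, n_actions=4, hidden=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs):
        z = self.encoder(obs)          # latent representation
        return self.head(z), z         # Q-values and representation

def loss_fn(qnet, batch, gamma=0.99, reg_weight=0.1):
    obs, act, rew, next_obs, transformed_obs = batch
    q, z = qnet(obs)
    q_sa = q.gather(1, act.unsqueeze(1)).squeeze(1)
    with torch.no_grad():              # bootstrapped TD target
        q_next, _ = qnet(next_obs)
        target = rew + gamma * q_next.max(dim=1).values
    td_loss = nn.functional.mse_loss(q_sa, target)
    # Explicit invariance regularizer: the representation of an observation
    # and of its (expert-specified or learned) transformation should agree.
    _, z_t = qnet(transformed_obs)
    inv_loss = nn.functional.mse_loss(z, z_t)
    return td_loss + reg_weight * inv_loss

qnet = QNet()
batch = (torch.randn(32, 8), torch.randint(0, 4, (32,)), torch.randn(32),
         torch.randn(32, 8), torch.randn(32, 8))
print(loss_fn(qnet, batch))            # single scalar training loss
```

The same pattern accommodates the auxiliary-loss item: any self-supervised term added to the TD loss plays the role of `inv_loss`, trading a small bias for better generalization from few samples.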
Transferring knowledge from cost-free playgrounds to complex real-life problems (a sketch of this reuse pattern follows this list):
- Resolution-invariant learning and transfer across environments of various complexities
- Task-agnostic representation learning
- Foundation models for RL as a basis for frugality
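To make the playground-to-real pipeline concrete, the following hedged sketch (all names, dimensions, and details are our own assumptions) pretrains a task-agnostic encoder on cheap simulated data, freezes it, and reuses it for a downstream policy trained on scarce real-world data.

```python
# Minimal sketch (PyTorch); all names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

# Task-agnostic encoder, pretrained on cheap simulated rollouts (the
# self-supervised pretraining itself is omitted here), then frozen.
encoder = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 32))
for p in encoder.parameters():
    p.requires_grad = False  # scarce real-world data only trains the head

# Downstream policy head, trained on the real-life target task.
policy_head = nn.Linear(32, 4)

def act(obs):
    with torch.no_grad():
        z = encoder(obs)                      # reused representation
    logits = policy_head(z)
    return torch.distributions.Categorical(logits=logits).sample()

print(act(torch.randn(1, 8)))                 # sampled discrete action
```

Foundation models for RL push the same idea further: the frozen reusable component is trained once, at scale, outside the data budget of any single target task.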
Optimization methods tailored for frugal RL (a worked formulation follows this list):
- Model-based RL with rich function graph representations and efficient policy optimization
- RL as semi-infinite programming, and its link to sample-efficient learning
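To unpack the semi-infinite programming item, recall the standard linear-programming view of an MDP; once the value function is restricted to a finite-dimensional class $V_\theta$ over a continuous state-action space, it becomes a semi-infinite program (the notation below is ours, chosen for illustration):

```latex
\min_{\theta} \;\int_{\mathcal{S}} V_\theta(s)\,\mu(ds)
\quad \text{s.t.} \quad
V_\theta(s) \;\ge\; r(s,a) + \gamma \int_{\mathcal{S}} V_\theta(s')\, P(ds' \mid s,a)
\quad \forall (s,a)\in\mathcal{S}\times\mathcal{A}
```

Finitely many decision variables $\theta$ constrained at infinitely many state-action pairs is precisely a semi-infinite program; approximating the constraint set by a finite sample of transitions is where the connection to sample-efficient learning arises.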
- Official support from AIRBUS, Vitesco Technologies, IRT Saint Exupéry, and ONERA.
- Autonomous mobile robotics. Ongoing PhD project of H. Bonnavaud on mission planning (started 2021, with AID, the French defense research agency); ongoing PhD project of A. Zouitine on robust control (started 2021, with IRT Saint Exupéry).
- Controlling complex simulated or real-life fluid dynamics processes. Ongoing PhD project of B. Martin on gust control (started 2020); ongoing pre-doctoral project of B. Corban on real-life experimental control (started 2023, with EPFL); post-doctoral project of S. Berger on scalability and off-the-shelf usability of RL methods (2021-2023, with AID).
- Dynamic treatment regimes for cancerous host/tumor processes. PhD project of H. Li (starting 2024, with INSERM).
- Formal foundations for frugal RL
- New formulations and algorithms for deep RL generalization, transfer and robustness in the face of data sparsity, and related challenges
- Usable algorithms for real-life RL