# Xi-Learning
Source code of the Xi-Learning framework and its experimental evaluation for the paper: [Xi-learning: Successor Feature Transfer Learning for General Reward Functions](https://arxiv.org/abs/2110.15701).
Authors: [Chris Reinke](https://www.scirei.net/), [Xavier Alameda-Pineda](http://xavirema.eu/)
Copyright: [INRIA](https://www.inria.fr/fr), 2021
License: [GNU General Public License v3.0 or later](https://gitlab.inria.fr/robotlearn/xi_learning/-/blob/master/license.txt)
Blog post with more details about the project: [Xi-Learning](https://team.inria.fr/robotlearn/xi_learning/)

## Abstract
Transfer in Reinforcement Learning aims to improve learning performance on target tasks using knowledge from experienced source tasks. Successor features (SF) are a prominent transfer mechanism in domains where the reward function changes between tasks. They reevaluate the expected return of previously learned policies in a new target task to transfer their knowledge. A limiting factor of the SF framework is its assumption that rewards linearly decompose into successor features and a reward weight vector. We propose a novel SF mechanism, ξ-learning, based on learning the cumulative discounted probability of successor features. Crucially, ξ-learning makes it possible to reevaluate the expected return of policies for general reward functions. We introduce two ξ-learning variations, prove their convergence, and provide a guarantee on their transfer performance. Experimental evaluations based on ξ-learning with function approximation demonstrate its prominent advantage over available mechanisms not only for general reward functions but also in the case of linearly decomposable reward functions.
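The core idea of the abstract — reevaluating a policy's expected return under an arbitrary reward function from the cumulative discounted probabilities of discrete successor-feature values — can be sketched in a few lines. The following is a toy illustration under our own assumptions, not the paper's implementation: the tabular `xi` values, state/action names, and the reward function are made up for the example.

```python
# Toy sketch of ξ-based policy reevaluation (illustrative only).
# xi[(s, a)][phi] stands in for the cumulative discounted probability of
# observing the discrete feature value phi after taking action a in state s
# and then following a fixed policy. The numbers below are hypothetical.
xi = {
    ("s0", "left"):  {0: 1.0, 1: 4.0, 2: 0.5},
    ("s0", "right"): {0: 2.5, 1: 1.0, 2: 2.0},
}

def evaluate(xi_sa, reward_fn):
    """Expected return Q(s, a) = sum over phi of xi(s, a, phi) * R(phi).

    Because R is applied per feature value, this works for any reward
    function over features, not only linearly decomposable ones.
    """
    return sum(p * reward_fn(phi) for phi, p in xi_sa.items())

# A non-linear reward over feature values: only phi == 2 pays off.
reward = lambda phi: float(phi == 2)

for (s, a), xi_sa in xi.items():
    print(s, a, evaluate(xi_sa, reward))  # left -> 0.5, right -> 2.0
```

With a linear reward `R(phi) = w * phi` this reduces to the classical SF evaluation; the point of ξ-learning is that the same table supports the non-linear case above without retraining the policy.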
## Setup