Description:
Two implementations of Bootstrapped Dual Policy Iteration (BDPI), both written in Python against the OpenAI Gym interface. The vub-ai-lab implementation is entirely custom code and uses high-performance multiprocessing that can leverage 32 cores or more. The sb3-contrib implementation is part of a well-known repository of RL algorithms that share a common API (stable-baselines3), and is designed to be very easy for novice users to try out.

More Information:
<ul>
<li>Grand Challenge/Workpackage(s): <a href="https://www.flandersairesearch.be/en/research/research-challenges/collaborative-ai" target="_blank" rel="noopener">GC3</a> - WP2</li>
<li>Responsible Research Lead: Ann Now&eacute;</li>
<li>Authors: Denis Steckelmacher</li>
<li>Programming Language: Python</li>
<li>Links to the repositories: <a href="https://github.com/vub-ai-lab/bdpi" target="_blank" rel="noopener">https://github.com/vub-ai-lab/bdpi</a> and <a href="https://github.com/steckdenis/stable-baselines3-contrib" target="_blank" rel="noopener">https://github.com/steckdenis/stable-baselines3-contrib</a></li>
</ul>

Bootstrapped Dual Policy Iteration: sample-efficient model-free RL with discrete actions