Bootstrapped Dual Policy Iteration: sample-efficient model-free RL with discrete actions


Two implementations of BDPI are available, both written in Python and built on OpenAI Gym. The vub-ai-lab implementation is entirely custom code and offers high-performance multiprocessing (it can make use of 32 cores or more). The sb3-contrib implementation is part of a well-known repository of RL algorithms with a common API (stable-baselines3), and is designed to be easy for novice users to try out.

More Information: