Rendering OpenAI Gym Envs on Binder and Google Colab

I am currently using my COVID-19 imposed quarantine to expand my deep learning skills by completing the Deep Reinforcement Learning Nanodegree from Udacity. Almost immediately I ran into the tedious problem of getting my simulations to render properly when training on remote servers.
In particular, getting OpenAI Gym environments to render properly on the remote servers behind popular free compute services such as Google Colab and Binder turned out to be more challenging than I expected. In this post, I lay out my solution in the hope that it saves others the time and effort of working it out independently.

Google Colab Preamble

If you wish to use Google Colab, then this section is for you! Otherwise, you can skip to the next section for the Binder Preamble.
First, you will need to install the necessary X11 dependencies, in particular Xvfb, an X server that can run on machines with no display hardware and no physical input devices. You can install system dependencies inside your Colab notebook by prefixing the install command with an exclamation mark (!), which runs the command inside its own Bash shell.
Now that you have installed Xvfb, you need to install pyvirtualdisplay, a Python wrapper for interacting with Xvfb virtual displays from within Python. You also need to install the Python bindings for OpenGL: PyOpenGL and PyOpenGL-accelerate. The former provides the actual Python bindings; the latter is an optional set of C (Cython) extensions that accelerate common operations in PyOpenGL 3.x.
Next, you need to install the OpenAI Gym package. Note that, depending on which Gym environment you want to work with, you may need additional dependencies. Since I am going to simulate the LunarLander-v2 environment in my demo below, I need to install the box2d extra, which enables the Gym environments that depend on the Box2D physics simulator.
For simplicity, I have gathered all the software installation steps into a single code block that you can cut and paste into your notebook.
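As a rough sketch of those installation steps, the snippet below assembles the two install commands and can optionally run them. The helper names (build_install_commands, install_all) are made up for illustration, and the package lists contain only the packages named above, without version pins, so treat them as assumptions to adapt. In a notebook cell the same commands can be run directly with the ! prefix, as shown in the comments.

```python
import subprocess
import sys

# Packages named in the text above; versions are deliberately left unpinned
# here, which is an assumption rather than a recommendation.
APT_PACKAGES = ["xvfb"]
PIP_PACKAGES = ["gym[box2d]", "pyvirtualdisplay", "PyOpenGL", "PyOpenGL-accelerate"]


def build_install_commands(apt_packages=APT_PACKAGES, pip_packages=PIP_PACKAGES):
    """Return the install commands as argument lists (hypothetical helper)."""
    return [
        # Notebook equivalent: !apt-get install -y xvfb
        ["apt-get", "install", "-y", *apt_packages],
        # Notebook equivalent: !pip install gym[box2d] pyvirtualdisplay ...
        [sys.executable, "-m", "pip", "install", *pip_packages],
    ]


def install_all(dry_run=True):
    """Print each command and, unless dry_run is True, execute it."""
    for command in build_install_commands():
        print("$", " ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

Calling install_all(dry_run=False) in a Colab cell would perform the installation; the default dry run only prints the commands.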
Now that all the required software is installed you are ready to create a virtual display (i.e., a display that runs in the background) which the OpenAI Gym Envs can connect to for rendering purposes. You can actually check that there is no display at present by confirming that the value of the DISPLAY environment variable has not yet been set.
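One way to perform this check from Python is sketched below. The helper name current_display is hypothetical; it simply looks up DISPLAY in the environment and returns None when the variable is unset. Running !echo $DISPLAY in a notebook cell gives the same information.

```python
import os


def current_display(environ=os.environ):
    """Return the value of DISPLAY, or None if no X display is configured."""
    return environ.get("DISPLAY")


# On a fresh remote machine this is expected to print None.
print(current_display())
```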
The code in the cell below creates a virtual display in the background that your Gym Envs can connect to for rendering. You can adjust the size of the virtual buffer as you like but you must set visible=False when working with Xvfb.
This code only needs to be run once in your notebook to start the display.
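A minimal sketch of such a cell, assuming pyvirtualdisplay and Xvfb are already installed, might look like the following. The function name start_virtual_display and the 1400 x 900 screen size are illustrative choices, not requirements.

```python
def start_virtual_display(size=(1400, 900)):
    """Start an Xvfb-backed virtual display and return its handle."""
    # Imported lazily so this function can be defined before the
    # package has been installed.
    from pyvirtualdisplay import Display

    # visible=False selects the headless Xvfb backend.
    display = Display(visible=False, size=size)
    display.start()  # also sets the DISPLAY environment variable
    return display


# In a notebook cell, run once per session:
#     display = start_virtual_display()
```

Keeping a reference to the returned object lets you call display.stop() later if you want to shut the virtual display down.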

After running the code above in your notebook you can echo out the value of the DISPLAY environment variable again to confirm that you now have a display running.
For convenience, I have gathered the above steps into two cells that you can copy and paste into the top of your Google Colab notebooks.

Binder Preamble

If you wish to use Binder, then this section is for you!
Unlike Google Colab, with Binder you can bake all the required dependencies (including the X11 system dependencies!) into the Docker image on which the Binder instance is based using Binder config files. These config files can either live in the root directory of your Git repo or in a binder sub-directory (my preferred choice).
The first config file that needs to be defined is the apt.txt file which is used to install system dependencies. You can just create an appropriately named file and then list the dependencies you want to install (one per line). After a bit of trial and error, I hit on the following winning combination.
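As a sketch, an apt.txt for this setup might contain the lines below. Only xvfb is named in the text above; freeglut3-dev (OpenGL support) and x11-utils (X11 utilities) are assumptions about what else is needed, so adjust the list if your environment renders without them.

```text
freeglut3-dev
xvfb
x11-utils
```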
The second config file is the standard environment.yml file used to define a Conda environment. If you are unfamiliar with Conda, then I suggest that you check out my recent articles on Getting started with Conda and Managing project-specific environments with Conda.
The final required config file is the requirements.txt file, which Conda uses to install, via pip, any additional Python dependencies that are not available from Conda channels.
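To illustrate how the two files can fit together, here is a hypothetical pair. The channel, the unpinned versions, and the decision to install every Python package through pip are all assumptions; the package names themselves are the ones mentioned earlier in this post.

```yaml
# binder/environment.yml (illustrative contents)
channels:
  - conda-forge
dependencies:
  - python
  - pip
  - pip:
    # Delegate the pip-only packages to requirements.txt
    - -r requirements.txt
```

```text
gym[box2d]
pyvirtualdisplay
PyOpenGL
PyOpenGL-accelerate
```

In practice you will probably want to pin the Python and package versions so that the Binder image stays reproducible.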
If you are interested in learning more about Binder, then check out the documentation for BinderHub which is the underlying technology behind the Binder project.
Next, you need to create a virtual display in the background which the Gym Envs can connect to for rendering purposes. You can check that there is no display at present by confirming that the value of the DISPLAY environment variable has not yet been set.
The code in the cell below creates a virtual display in the background that your Gym Envs can connect to for rendering. You can adjust the size of the virtual buffer as you like but you must set visible=False when working with Xvfb.
This code only needs to be run once per session to start the display.

After running the cell above you can echo out the value of the DISPLAY environment variable again to confirm that you now have a display running.

Demo

Just to prove that the above setup works as advertised, I will run a short simulation. First, I define an Agent that chooses an action randomly from the set of possible actions, along with a function that can be used to create such agents. Then I wrap up the code to simulate a single episode of an OpenAI Gym environment. Note that the implementation assumes that the provided environment supports rgb_array rendering (which not all Gym environments do!).
Currently there appears to be a non-trivial amount of flickering during the simulation. I am not entirely sure what is causing this undesirable behavior. If you have any idea how to improve this, please leave a comment below. I will be sure to update this post accordingly if I find a good fix.
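The sketch below shows one way the random agent and the single-episode simulation could be written. The names make_random_agent and simulate are illustrative, and the code assumes the older Gym API in which reset() returns only the observation, step() returns a four-element tuple, and render() accepts mode="rgb_array"; newer Gym and Gymnasium releases changed these signatures. Unlike a version that draws each frame as it goes, this one simply collects the frames.

```python
import random


def make_random_agent(n_actions, seed=None):
    """Create an agent that ignores the observation and acts at random."""
    rng = random.Random(seed)

    def agent(observation):
        return rng.randrange(n_actions)

    return agent


def simulate(agent, env, max_steps=1000):
    """Run one episode; return the rendered frames and the total reward."""
    frames = []
    total_reward = 0.0
    observation = env.reset()  # assumes reset() returns only the observation
    for _ in range(max_steps):
        # Requires an environment that supports rgb_array rendering.
        frames.append(env.render(mode="rgb_array"))
        action = agent(observation)
        # Assumes the classic (observation, reward, done, info) return value.
        observation, reward, done, _info = env.step(action)
        total_reward += reward
        if done:
            break
    return frames, total_reward


# Example usage, assuming gym with the box2d extra is installed:
#     import gym
#     env = gym.make("LunarLander-v2")
#     agent = make_random_agent(env.action_space.n)
#     frames, total_reward = simulate(agent, env)
#     env.close()
```

Each collected frame is an image array, so it can be displayed in the notebook with, for example, matplotlib's imshow.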
