DreamerV3 Report
Papers:
RSSM
DreamerV1
DreamerV2
DreamerV3
Projects
There are currently three codebases for DreamerV3. The most readable one is sheeprl's.
- The author's implementation, written in Jax: https://github.com/danijar/dreamerv3
- A PyTorch implementation by NM512: https://github.com/NM512/dreamerv3-torch
- A PyTorch implementation by sheeprl: https://github.com/Eclectic-Sheep/sheeprl
Besides, there are some implementations of DreamerV1 and DreamerV2:
- adityabingi's Dreamerv1 & v2 codebase
- EasyDreamer: A Simplified Version of the DreamerV1 Algorithm with Pytorch
There're also some explanations of the code:
- some code notes from a Reddit user
- implementations of the tricks in CleanRL, applied to PPO. This doesn't include the world model architecture or loss functions, only the new tricks introduced by DreamerV3.
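One of the tricks those CleanRL notes cover is the symlog/symexp transform DreamerV3 uses to squash prediction targets of widely varying magnitude. A minimal sketch from the paper's definition (not taken from any of the listed codebases):

```python
# symlog(x) = sign(x) * ln(|x| + 1); symexp is its inverse.
# DreamerV3 applies symlog to targets so one set of hyperparameters
# works across environments with very different reward/observation scales.
import math

def symlog(x: float) -> float:
    return math.copysign(math.log(abs(x) + 1.0), x)

def symexp(x: float) -> float:
    return math.copysign(math.exp(abs(x)) - 1.0, x)

# Round-trip sanity check: symexp undoes symlog.
print(round(symexp(symlog(123.0)), 6))  # 123.0
```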
sheeprl
Installation:
Make sure you have g++ installed.
conda create -n sheeprl python=3.9
Install osmesa:
sudo apt-get install libgl1-mesa-glx libosmesa6
Set:
export MUJOCO_GL=osmesa
Hafner's version
Hafner's codebase
See my fork: https://github.com/LYK-love/dreamerv3
NM512's version
Github: NM512's PyTorch implementation
conda create -n DreamerTorch python=3.9
In addition, you need to install Atari ROMs to run Atari envs; follow here to download and install the ROMs:
wget http://www.atarimania.com/roms/Atari-2600-VCS-ROM-Collection.zip
This should print out the names of ROMs as it imports them. The ROMs will be copied to your atari_py installation directory.
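The full flow is roughly the following; this is a sketch assuming `atari_py` is already installed, and the `roms` extraction directory name is my choice, not from the original instructions:

```shell
# Download the ROM collection, unzip it, and import the ROMs into atari_py.
wget http://www.atarimania.com/roms/Atari-2600-VCS-ROM-Collection.zip
unzip Atari-2600-VCS-ROM-Collection.zip -d roms
# Prints each ROM name as it is copied into the atari_py install directory.
python -m atari_py.import_roms roms
```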
Commands
Here are my commands for running the codebases.
For Hafners'
python dreamerv3/train.py --logdir ./logdir --configs atari --batch_size 16 --run.train_ratio 32
Tricks
Multi-GPU
Use multi-GPU training supported by Lightning Fabric:
fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2
Mixed-precision
Use mixed precision supported by Lightning Fabric:
fabric.precision=16-mixed
Log
See logs:
tensorboard --logdir logs
Map generated videos to my Photoview:
ln -s /home/lyk/Projects/sheeprl/logs $IMAGE_HOME/sheeprl_log
Commands for Hafner's
Log: --run.log_every 3
export CKPT="logdir/BouncingBall/checkpoint.ckpt"
Making ckpts:
WANDB_MODE=online python dreamerv3/train.py --logdir ./logdir/$(date "+%Y%m%d-%H%M%S") --configs bouncing_ball small --batch_size 16 --run.train_ratio 32 --run.steps 5000000 --run.only_train False
WANDB_MODE=online python dreamerv3/train.py --logdir ./logdir/$(date "+%Y%m%d-%H%M%S") --configs grid_world small --batch_size 16 --run.train_ratio 32
python dreamerv3/train.py --logdir ./logdir/$(date "+%Y%m%d-%H%M%S") --configs grid_world debug --batch_size 16 --run.train_ratio 32
Video pinball (if a logdir exists, load it; otherwise train from scratch):
export LOGDIR="logdir/VideoPinball"
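The load-or-train logic can be sketched as a shell check on the logdir; the branch bodies here only echo, and the actual `train.py` invocation would be one of the commands above (how Hafner's codebase resumes from an existing logdir is my assumption):

```shell
# If $LOGDIR already exists, resume from it; otherwise train from scratch.
export LOGDIR="logdir/VideoPinball"
if [ -d "$LOGDIR" ]; then
  echo "loading existing logdir: $LOGDIR"
  # python dreamerv3/train.py --logdir "$LOGDIR" ...
else
  echo "training from scratch into: $LOGDIR"
  # python dreamerv3/train.py --logdir "$LOGDIR" ...
fi
```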
Scripts
https://github.com/Eclectic-Sheep/sheeprl/blob/main/howto/configs.md
https://github.com/Eclectic-Sheep/sheeprl/tree/main/howto
DMC
Box2D
CarRacing
python sheeprl.py exp=dreamer_v3 env=gym env.id=CarRacing-v2 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=1024
8 GPUs:
python sheeprl.py exp=dreamer_v3 env=gym env.id=CarRacing-v2 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=8 fabric.precision=16-mixed algo.learning_starts=1024
For dev:
python sheeprl.py exp=dreamer_v3 env=gym env.id=CarRacing-v2 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=200 env.num_envs=1
For testing videos:
python sheeprl.py exp=dreamer_v3 env=gym env.id=CarRacing-v2 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed env.num_envs=2 algo.learning_starts=1024 algo.total_steps=2000000
total_steps should be small. I didn't set env.num_envs; by default it is 4.
Eval from checkpoint:
export CKPT="/home/lyk/Projects/sheeprl/logs/runs/dreamer_v3/VideoPinballNoFrameskip-v4/2024-02-18_02-56-29_dreamer_v3_VideoPinballNoFrameskip-v4_42/version_0/checkpoint/ckpt_4800000_0.ckpt"
Or
seeds=(5 1024 42 1337 8 2)
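The seeds array presumably drives repeated eval runs; a minimal loop sketch (the `echo` stands in for whichever sheeprl eval invocation you use with a per-run seed):

```shell
# Iterate over the seeds and launch one eval run per seed.
seeds=(5 1024 42 1337 8 2)
for seed in "${seeds[@]}"; do
  echo "running eval with seed=$seed"
  # python sheeprl.py ... seed=$seed   # hypothetical eval command
done
```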
Atari
See gym's Atari game list for all Atari envs.
Alien
Single gpu:
python sheeprl.py exp=dreamer_v3 env=atari env.id=AlienNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo.mlp_keys.encoder=\[\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.devices=1 fabric.precision=16-mixed algo.learning_starts=1024
2 GPUs:
python sheeprl.py exp=dreamer_v3 env=atari env.id=AlienNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo.mlp_keys.encoder=\[\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=1024
8 GPUs:
python sheeprl.py exp=dreamer_v3 env=atari env.id=AlienNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo.mlp_keys.encoder=\[\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=8
For testing videos:
python sheeprl.py exp=dreamer_v3 env=atari env.id=AlienNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed env.num_envs=1 algo.learning_starts=1024 algo.total_steps=200000
Video pinball
python sheeprl.py exp=dreamer_v3 env=atari env.id=VideoPinballNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=1024
For testing videos:
python sheeprl.py exp=dreamer_v3 env=atari env.id=VideoPinballNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed env.num_envs=1 algo.learning_starts=1024 algo.total_steps=2000000
total_steps should be small.
Alien:
Venture:
python sheeprl.py exp=dreamer_v3 env=atari env.id=VentureNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_M fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=1024
Star Gunner
python sheeprl.py exp=dreamer_v3 env=atari env.id=StarGunnerNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo.mlp_keys.encoder=\[\] algo=dreamer_v3_M fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=1024
8 GPUs:
python sheeprl.py exp=dreamer_v3 env=atari env.id=StarGunnerNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo.mlp_keys.encoder=\[\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=8
Private Eye:
python sheeprl.py exp=dreamer_v3 env=atari env.id=PrivateEyeNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_M fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=1024
Riverraid:
python sheeprl.py exp=dreamer_v3 env=atari env.id=RiverraidNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_M fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=1024
Boxing
python sheeprl.py exp=dreamer_v3_100k_boxing fabric.strategy=ddp fabric.devices=8 fabric.accelerator=cuda
Crafter
python sheeprl.py exp=dreamer_v3_XL_crafter fabric.strategy=ddp fabric.devices=8 fabric.accelerator=cuda
Pacman
python sheeprl.py exp=dreamer_v3_100k_ms_pacman fabric.strategy=ddp fabric.devices=8 fabric.accelerator=cuda
Custom envs
pip install matplotlib
Errors
Render backend:
https://github.com/google-deepmind/dm_control/issues/123
Try to use osmesa.
MuJoCo/DMC supports three different OpenGL rendering backends: EGL (headless), GLFW (windowed), and OSMesa (headless). For each of them, you need to install some packages:
- GLFW:
sudo apt-get install libglfw3 libglew2.2
- EGL:
sudo apt-get install libglew2.2
- OSMesa:
sudo apt-get install libgl1-mesa-glx libosmesa6
In order to use one of these rendering backends, you need to set the MUJOCO_GL environment variable to "glfw", "egl", or "osmesa", respectively.
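A tiny stdlib-only sketch of selecting a backend from Python before any MuJoCo import (the `set_render_backend` helper and `BACKENDS` set are hypothetical names, not part of any of these libraries):

```python
# Set MUJOCO_GL programmatically; MuJoCo reads it at import time,
# so this must run before importing mujoco/dm_control.
import os

BACKENDS = {"glfw", "egl", "osmesa"}

def set_render_backend(name: str) -> None:
    if name not in BACKENDS:
        raise ValueError(f"unknown rendering backend: {name}")
    os.environ["MUJOCO_GL"] = name

set_render_backend("osmesa")
print(os.environ["MUJOCO_GL"])  # osmesa
```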
Note
The libglew2.2 package could have a different name depending on your OS (e.g., libglew2.2 is for Ubuntu 22.04.2 LTS).
It may also be necessary to install the PyOpenGL-accelerate package with pip install PyOpenGL-accelerate and the mesalib package with conda install conda-forge::mesalib.
For more information: https://github.com/deepmind/dm_control and https://mujoco.readthedocs.io/en/stable/programming/index.html#using-opengl.
When using osmesa, you may get:
libGL error: MESA-LOADER: failed to open swrast: /usr/lib/dri/swrast_dri.so: cannot open shared object file: No such file or directory (search paths /usr/lib/x86_64-linux-gnu/dri:$${ORIGIN}/dri:/usr/lib/dri, suffix _dri)
The solution is copied from here. We have a swrast_dri.so in /usr/lib/x86_64-linux-gnu/dri/, so we can just create a symbolic link to it:
sudo mkdir /usr/lib/dri
sudo ln -s /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so /usr/lib/dri/swrast_dri.so
If you interrupt the running command, at the next execution you might get:
Inconsistency detected by ld.so: ../sysdeps/x86_64/dl-machine.h: 534: elf_machine_rela_relative: Assertion `ELFW(R_TYPE) (reloc->r_info) == R_X86_64_RELATIVE' failed!
Solution: reboot the system.
Error:
Error in call to target 'gymnasium.envs.registration.make':
Solution:
pip install gymnasium[box2d]
When installing box2d-py, you may see:
gcc -pthread -B /home/lyk/miniconda3/envs/sheeprl/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -O2 -Wall -fPIC -O2 -isystem /home/lyk/miniconda3/envs/sheeprl/include -I/home/lyk/miniconda3/envs/sheeprl/include -fPIC -O2 -isystem /home/lyk/miniconda3/envs/sheeprl/include -fPIC -I/home/lyk/miniconda3/envs/sheeprl/include/python3.9 -c Box2D/Box2D_wrap.cpp -o build/temp.linux-x86_64-cpython-39/Box2D/Box2D_wrap.o -I. -Wno-unused
Solution:
This problem can happen if different versions of g++ and gcc are installed. Check with:
g++ --version
Reason:
Ubuntu 22.04's default GCC version does not match the version that built the latest default kernel. On Ubuntu 22.04, the default GNU C compiler is gcc-11, but the latest default kernel (6.5.0-14-generic as of writing) was built with gcc-12.
ValueError: bad marshal data (unknown type code)
If you get this error, the compiled version of the Python module (the .pyc file) is probably corrupt. Gentoo Linux provides python-updater, but on Debian the easier fix is to just delete the .pyc file. If you don't know which .pyc it is, delete all of them (as root):
find <the error dir> -name '*.pyc' -delete
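The same cleanup can be done from Python with the stdlib; the `delete_pyc` helper below is my own sketch, equivalent to the find command:

```python
# Recursively delete all .pyc files under a directory tree and
# return how many were removed.
from pathlib import Path

def delete_pyc(root: str) -> int:
    count = 0
    for p in Path(root).rglob("*.pyc"):
        p.unlink()
        count += 1
    return count
```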
Training process
I use the sheeprl codebase to train the dreamer_v3_XS model (max_steps = 5,000,000); it takes about 50 hours.
python sheeprl.py exp=dreamer_v3 env=atari env.id=VideoPinballNoFrameskip-v4 algo.cnn_keys.encoder=\[rgb\] algo=dreamer_v3_XS fabric.accelerator=gpu fabric.strategy=ddp fabric.devices=2 fabric.precision=16-mixed algo.learning_starts=1024
So on average, every 1,000,000 steps costs about 10 hours. Meanwhile, the output of the decoder is blurry and has a lot of colorful noise at step 1,200,000. I'll wait until step 5,000,000 to see whether the noise persists.
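The back-of-envelope arithmetic behind that estimate:

```python
# 5,000,000 steps in ~50 hours works out to 10 hours per 1,000,000 steps.
total_steps = 5_000_000
total_hours = 50
hours_per_million = total_hours / (total_steps / 1_000_000)
print(hours_per_million)  # 10.0
```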