Diffusion-Policy#
Installation#
cd roboverse_learn/algorithms/diffusion_policy
pip install -e .
cd ../../../
pip install pandas wandb
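As a quick sanity check that the editable install and the extra dependencies are visible to your environment, you can try importing them (this assumes the editable install exposes a package named diffusion_policy):

```python
# Sanity check for the installation above.
# Assumption: `pip install -e .` exposes the package as `diffusion_policy`.
import diffusion_policy
import pandas
import wandb

print("diffusion_policy:", diffusion_policy.__file__)
print("pandas:", pandas.__version__, "| wandb:", wandb.__version__)
```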
Usage of Training#
First, run 'data2zarr_dp.py' to convert the collected demonstrations into Zarr format:
# --------- instruction format ---------#
python roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py <task_name> <expert_data_num> <metadata_dir> <split_offset>
# The format of 'task_name' should be {task}_{robot}
# e.g. StackCube_franka / CloseBox_franka
# 'expert_data_num' is the number of expert demonstrations to use for training
# you can choose it freely, but it must not exceed the number of demonstrations available in 'metadata_dir'
# 'metadata_dir' is the directory containing the demonstration metadata
# --------- instruction example ---------#
python roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py CloseBox_franka 10 ~/Project/RoboVerse/RoboVerse/data_isaaclab/demo/CloseBox/robot-franka 1
The processed Zarr data is saved under 'data_policy'. If you move the Zarr data to a headless server, make sure it is placed at the same 'data_policy' location on that server.
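Before training, you may want to open the generated Zarr store and confirm its contents. The sketch below assumes the usual diffusion-policy replay-buffer layout (data/state, data/action, meta/episode_ends) and an illustrative file name; check 'data2zarr_dp.py' for the exact keys and output name it writes.

```python
# Sketch: inspect the converted Zarr data before training.
# The store name and the group/key names are assumptions based on the common
# diffusion-policy replay-buffer layout; adjust to what data2zarr_dp.py writes.
import zarr

root = zarr.open("data_policy/CloseBox_franka_10.zarr", mode="r")  # illustrative path
print(root.tree())                        # show the full group/array hierarchy
state = root["data/state"]                # per-step robot state (joint_qpos by default)
action = root["data/action"]              # per-step action (joint_qpos_target by default)
episode_ends = root["meta/episode_ends"]  # cumulative step index at which each episode ends
print("state:", state.shape, "action:", action.shape, "episodes:", len(episode_ends))
```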
ATTENTION:
In this script, 'joint_qpos' is used as the state and 'joint_qpos_target' as the action. To change this, modify lines 84 and 85 of 'roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py'.
If you replace 'joint_qpos' and 'joint_qpos_target' with other quantities, also check agent_pos:shape and action:shape in 'roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/task/default_task.yaml' so that they match the dimensions of the new quantities.
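For illustration only (this is not the actual code at those lines), the state/action choice amounts to a mapping like the following hypothetical sketch:

```python
# Hypothetical sketch of the state/action selection performed around lines 84-85
# of data2zarr_dp.py -- not the actual code. By default, joint positions are the
# observation and joint position targets are the action.
def extract_state_action(frame: dict):
    state = frame["joint_qpos"]           # becomes the policy observation ("agent_pos")
    action = frame["joint_qpos_target"]   # becomes the supervision target for the action head
    # If you switch to other quantities (e.g. end-effector pose), update
    # agent_pos:shape and action:shape in
    # diffusion_policy/config/task/default_task.yaml to the new dimensions.
    return state, action
```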
Then run 'train.sh' to train the diffusion policy:
# --------- instruction format ---------#
bash roboverse_learn/algorithms/diffusion_policy/train.sh <task_name> <expert_data_num> <seed> <gpu_id> <DEBUG>
# 'task_name' must be {task}_{robot}, matching the name used with 'data2zarr_dp.py'.
# e.g. StackCube_franka / CloseBox_franka
# 'expert_data_num' is the number of training demonstrations, e.g. 100
# 'seed' is the random seed; choose any integer, e.g. 42
# 'gpu_id' is the id of the single GPU to use, e.g. 0
# 'DEBUG' sets whether to run in debug mode, e.g. False
# --------- instruction example ---------#
bash roboverse_learn/algorithms/diffusion_policy/train.sh CloseBox_franka 10 0 0 False
You can modify training parameters (such as the number of epochs and the batch size) in 'roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/robot_dp.yaml'.
Important parameters you may want to change are listed below; a sketch after the list illustrates how the first three interact:
horizon (line 12)
n_obs_steps (line 13)
n_action_steps (line 14)
dataloader: batch_size (line 76)
val_dataloader: batch_size (line 83)
num_epochs (line 104)
checkpoint_every (line 113)
val_every (line 114)
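horizon, n_obs_steps, and n_action_steps follow the standard diffusion-policy receding-horizon scheme: the policy conditions on the last n_obs_steps observations, predicts a window of horizon future actions, and executes only the first n_action_steps before re-planning. The schematic below illustrates this loop with hypothetical env/policy interfaces; the actual evaluation logic lives in roboverse_learn/eval.py.

```python
# Schematic of the receding-horizon rollout implied by horizon / n_obs_steps /
# n_action_steps in robot_dp.yaml. Illustration only: `env` and `policy` here
# are hypothetical interfaces, not objects from this repository.
from collections import deque

HORIZON, N_OBS_STEPS, N_ACTION_STEPS = 16, 2, 8  # example values; see robot_dp.yaml

def rollout(env, policy, max_steps=200):
    obs = env.reset()
    obs_history = deque([obs] * N_OBS_STEPS, maxlen=N_OBS_STEPS)  # pad with the initial obs
    steps = 0
    while steps < max_steps:
        # Predict a window of HORIZON future actions from the recent observations...
        action_window = policy.predict(list(obs_history))  # shape: (HORIZON, action_dim)
        # ...but execute only the first N_ACTION_STEPS before re-planning.
        for action in action_window[:N_ACTION_STEPS]:
            obs = env.step(action)
            obs_history.append(obs)
            steps += 1
            if steps >= max_steps:
                break
```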
Usage of Validation#
Use the following command to run validation (customize the input parameters as needed):
python roboverse_learn/eval.py --task CloseBox --sim isaaclab --checkpoint_path <absolute_path_to_checkpoint>