Diffusion-Policy#
Installation#
cd roboverse_learn/algorithms/diffusion_policy
pip install -e .
cd ../../../
pip install pandas wandb
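As a quick sanity check that the editable install and the extra dependencies are visible to your environment, you can try importing them (this assumes the editable install exposes a package named diffusion_policy):

```python
# Sanity check for the installation above.
# Assumption: `pip install -e .` exposes the package as `diffusion_policy`.
import diffusion_policy
import pandas
import wandb

print("diffusion_policy:", diffusion_policy.__file__)
print("pandas:", pandas.__version__, "| wandb:", wandb.__version__)
```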
Usage of Training#
First, run 'data2zarr_dp.py' to convert the collected demonstrations into Zarr format:
# --------- instruction format ---------#
python roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py <task_name> <expert_data_num> <metadata_dir> <split_offset>
# The format of 'task_name' should be {task}_{robot}
# e.g. StackCube_franka / CloseBox_franka
# 'expert_data_num' is the number of expert demonstrations to use for training
# you can choose it freely, but it must not exceed the number of demonstrations available in 'metadata_dir'
# 'metadata_dir' is the directory containing the demonstration metadata
# --------- instruction example ---------#
python roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py CloseBox_franka 10 ~/Project/RoboVerse/RoboVerse/data_isaaclab/demo/CloseBox/robot-franka 1
The processed Zarr data is saved under 'data_policy'. If you move the Zarr data to a headless server, make sure it is placed at the same 'data_policy' location on that server.
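Before training, you may want to open the generated Zarr store and confirm its contents. The sketch below assumes the usual diffusion-policy replay-buffer layout (data/state, data/action, meta/episode_ends) and an illustrative file name; check 'data2zarr_dp.py' for the exact keys and output name it writes.

```python
# Sketch: inspect the converted Zarr data before training.
# The store name and the group/key names are assumptions based on the common
# diffusion-policy replay-buffer layout; adjust to what data2zarr_dp.py writes.
import zarr

root = zarr.open("data_policy/CloseBox_franka_10.zarr", mode="r")  # illustrative path
print(root.tree())                        # show the full group/array hierarchy
state = root["data/state"]                # per-step robot state (joint_qpos by default)
action = root["data/action"]              # per-step action (joint_qpos_target by default)
episode_ends = root["meta/episode_ends"]  # cumulative step index at which each episode ends
print("state:", state.shape, "action:", action.shape, "episodes:", len(episode_ends))
```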
ATTENTION:
In this script, 'joint_qpos' is used as the state and 'joint_qpos_target' as the action. To change this, modify lines 84 and 85 of 'roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py'.
If you replace 'joint_qpos' and 'joint_qpos_target' with other quantities, also check agent_pos:shape and action:shape in 'roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/task/default_task.yaml' so that they match the dimensions of the new quantities.
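For illustration only (this is not the actual code at those lines), the state/action choice amounts to a mapping like the following hypothetical sketch:

```python
# Hypothetical sketch of the state/action selection performed around lines 84-85
# of data2zarr_dp.py -- not the actual code. By default, joint positions are the
# observation and joint position targets are the action.
def extract_state_action(frame: dict):
    state = frame["joint_qpos"]           # becomes the policy observation ("agent_pos")
    action = frame["joint_qpos_target"]   # becomes the supervision target for the action head
    # If you switch to other quantities (e.g. end-effector pose), update
    # agent_pos:shape and action:shape in
    # diffusion_policy/config/task/default_task.yaml to the new dimensions.
    return state, action
```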
Then run 'train.sh' to train the diffusion policy:
# --------- instruction format ---------#
bash roboverse_learn/algorithms/diffusion_policy/train.sh <task_name> <expert_data_num> <seed> <gpu_id> <DEBUG>
# 'task_name' must be {task}_{robot}, matching the name used with 'data2zarr_dp.py'.
# e.g. StackCube_franka / CloseBox_franka
# 'expert_data_num' is the number of training demonstrations, e.g. 100
# 'seed' is the random seed; choose any integer, e.g. 42
# 'gpu_id' is the id of the single GPU to use, e.g. 0
# 'DEBUG' sets whether to run in debug mode, e.g. False
# --------- instruction example ---------#
bash roboverse_learn/algorithms/diffusion_policy/train.sh CloseBox_franka 10 0 0 False
You can modify training parameters (such as the number of epochs and the batch size) in 'roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/robot_dp.yaml'.
Important parameters you may want to change are listed below; a sketch after the list illustrates how the first three interact:
horizon (line 12)
n_obs_steps (line 13)
n_action_steps (line 14)
dataloader: batch_size (line 76)
val_dataloader: batch_size (line 83)
num_epochs (line 104)
checkpoint_every (line 113)
val_every (line 114)
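horizon, n_obs_steps, and n_action_steps follow the standard diffusion-policy receding-horizon scheme: the policy conditions on the last n_obs_steps observations, predicts a window of horizon future actions, and executes only the first n_action_steps before re-planning. The schematic below illustrates this loop with hypothetical env/policy interfaces; the actual evaluation logic lives in roboverse_learn/eval.py.

```python
# Schematic of the receding-horizon rollout implied by horizon / n_obs_steps /
# n_action_steps in robot_dp.yaml. Illustration only: `env` and `policy` here
# are hypothetical interfaces, not objects from this repository.
from collections import deque

HORIZON, N_OBS_STEPS, N_ACTION_STEPS = 16, 2, 8  # example values; see robot_dp.yaml

def rollout(env, policy, max_steps=200):
    obs = env.reset()
    obs_history = deque([obs] * N_OBS_STEPS, maxlen=N_OBS_STEPS)  # pad with the initial obs
    steps = 0
    while steps < max_steps:
        # Predict a window of HORIZON future actions from the recent observations...
        action_window = policy.predict(list(obs_history))  # shape: (HORIZON, action_dim)
        # ...but execute only the first N_ACTION_STEPS before re-planning.
        for action in action_window[:N_ACTION_STEPS]:
            obs = env.step(action)
            obs_history.append(obs)
            steps += 1
            if steps >= max_steps:
                break
```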
Usage of Validation#
Use the following command to run validation (customize the input parameters as needed):
python roboverse_learn/eval.py --task CloseBox --sim isaaclab --checkpoint_path <absolute_path_to_checkpoint>