Description
First and foremost, I would like to express my sincere gratitude for your excellent work!
I have encountered the following issue and suspect it may be a bug 🐛.
For training, I set "pos_enc" of "CropRandomizer" to true, as shown in the following config:
"obs_randomizer_class": "CropRandomizer",
"obs_randomizer_kwargs": {
    "crop_height": 116,
    "crop_width": 116,
    "num_crops": 1,
    "pos_enc": true
}
With this setting, the input image has 5 channels (three RGB channels plus two positional-encoding channels).
However, during evaluation, the following function (in /robomimic/models/obs_core.py) only performs a center crop and does not add the positional encoding:
def _forward_in_eval(self, inputs):
    """
    Do center crops during eval
    """
    assert len(inputs.shape) >= 3 # must have at least (C, H, W) dimensions
    inputs = inputs.permute(*range(inputs.dim()-3), inputs.dim()-2, inputs.dim()-1, inputs.dim()-3)
    out = ObsUtils.center_crop(inputs, self.crop_height, self.crop_width)
    out = out.permute(*range(out.dim()-3), out.dim()-1, out.dim()-3, out.dim()-2)
    return out
In contrast, the training-time code (also in /robomimic/models/obs_core.py) does handle positional encoding:
def _forward_in(self, inputs):
    """
    Samples N random crops for each input in the batch, and then reshapes
    inputs to [B * N, ...].
    """
    assert len(inputs.shape) >= 3 # must have at least (C, H, W) dimensions
    out, _ = ObsUtils.sample_random_image_crops(
        images=inputs,
        crop_height=self.crop_height,
        crop_width=self.crop_width,
        num_crops=self.num_crops,
        pos_enc=self.pos_enc,
    )
    # [B, N, ...] -> [B * N, ...]
    return TensorUtils.join_dimensions(out, 0, 1)
So if positional encoding is enabled during training, evaluation fails with an error like the following, because the encoder expects 5-channel inputs but receives 3-channel center crops:
rollout: env=PnPCounterToCab, horizon=500, use_goals=False, num_episodes=50
0%| | 0/50 [00:00<?, ?it/s]Rollout exception at episode number 0!
Traceback (most recent call last):
File "/data3/xuyuxuan/robomimic/robomimic/utils/train_utils.py", line 563, in rollout_with_stats
rollout_info = run_rollout(
File "/data3/xuyuxuan/robomimic/robomimic/utils/train_utils.py", line 335, in run_rollout
ac = policy(ob=policy_ob, goal=goal_dict) #, return_ob=True)
File "/data3/xuyuxuan/robomimic/robomimic/algo/algo.py", line 689, in __call__
ac = self.policy.get_action(obs_dict=ob, goal_dict=goal)
File "/data3/xuyuxuan/robomimic/robomimic/algo/bc.py", line 785, in get_action
output = self.nets["policy"](obs_dict, actions=None, goal_dict=goal_dict)
File "/data3/xuyuxuan/miniforge3/envs/robocasa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/data3/xuyuxuan/robomimic/robomimic/models/policy_nets.py", line 1334, in forward
out = self.forward_train(obs_dict=obs_dict, actions=actions, goal_dict=goal_dict)
File "/data3/xuyuxuan/robomimic/robomimic/models/policy_nets.py", line 1286, in forward_train
outputs = MIMO_Transformer.forward(self, **forward_kwargs)
File "/data3/xuyuxuan/robomimic/robomimic/models/obs_nets.py", line 1096, in forward
transformer_inputs = TensorUtils.time_distributed(
File "/data3/xuyuxuan/robomimic/robomimic/utils/tensor_utils.py", line 951, in time_distributed
outputs = op(**inputs, **kwargs)
File "/data3/xuyuxuan/miniforge3/envs/robocasa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/data3/xuyuxuan/robomimic/robomimic/models/obs_nets.py", line 475, in forward
self.nets[obs_group].forward(inputs[obs_group])
File "/data3/xuyuxuan/robomimic/robomimic/models/obs_nets.py", line 256, in forward
x = self.obs_nets[k](x, lang_emb=obs_dict[LANG_EMB_KEY])
File "/data3/xuyuxuan/miniforge3/envs/robocasa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/data3/xuyuxuan/robomimic/robomimic/models/obs_core.py", line 296, in forward
assert tuple(inputs.shape)[-ndim:] == tuple(self.input_shape)
AssertionError
0%| | 0/50 [00:09<?, ?it/s]
Epoch 0 Rollouts took -1s (avg) with results:
Env: PnPCounterToCab
{
"Return": -1,
"Success_Rate": -1,
"Time_Episode": -1,
"time": -1
}
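The failing assertion compares the trailing dimensions of the incoming tensor with the shape the encoder was built for. The sketch below reproduces only that comparison in plain Python; the helper name `matches_expected_shape` and the concrete shapes are assumptions for illustration (a batch of one, and an encoder built for 5-channel 116x116 inputs), not values taken from the logs.

```python
def matches_expected_shape(actual_shape, expected_shape):
    """Mirror the check in the traceback: trailing dims must equal expected_shape."""
    ndim = len(expected_shape)
    return tuple(actual_shape)[-ndim:] == tuple(expected_shape)


# Assumed shapes for illustration only.
expected = (5, 116, 116)       # RGB + 2 positional-encoding channels
train_crop = (1, 5, 116, 116)  # random crop with pos_enc applied (training path)
eval_crop = (1, 3, 116, 116)   # center crop without pos_enc (eval path)

train_ok = matches_expected_shape(train_crop, expected)
eval_ok = matches_expected_shape(eval_crop, expected)
print(train_ok, eval_ok)
```

Under these assumptions the training-path shape passes the check and the eval-path shape does not, which is consistent with the AssertionError above.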
BTW, the following section of the configuration file does not seem to take effect. Shouldn't it override the "pos_enc": true that I set?
"meta": {
    "hp_base_config_file": "/data3/xuyuxuan/robomimic/robomimic/exps/templates/bc_transformer.json",
    "hp_keys": [
        "seed",
        "ds",
        "ckpt",
        "obsrandargs"
    ],
    "hp_values": [
        123,
        "human-combined",
        "single_and_multi_task_human_300pth",
        {
            "crop_height": 116,
            "crop_width": 116,
            "num_crops": 1,
            "pos_enc": false
        }
    ]
}
In the end, I worked around this error by manually adding "pos_enc" handling to the "_forward_in_eval" function.
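A minimal standalone sketch of this kind of change is shown below. It is not the exact patch and does not call robomimic: the helper names `append_position_encoding`, `center_crop_chw`, and `eval_center_crop` are hypothetical, and it assumes the positional encoding consists of two normalized [y, x] coordinate channels computed on the full image before cropping, so that the crop keeps its absolute location.

```python
import torch


def append_position_encoding(images):
    """Append normalized [y, x] coordinate channels to images of shape [..., C, H, W]."""
    h, w = images.shape[-2:]
    pos_y, pos_x = torch.meshgrid(
        torch.arange(h, device=images.device),
        torch.arange(w, device=images.device),
        indexing="ij",
    )
    # Two extra channels with values in [0, 1): row index / H and column index / W.
    pos = torch.stack((pos_y.float() / h, pos_x.float() / w))  # [2, H, W]
    # Broadcast over any leading (batch, time, ...) dimensions.
    pos = pos.expand(*images.shape[:-3], -1, -1, -1)
    return torch.cat((images, pos.to(images.dtype)), dim=-3)


def center_crop_chw(images, crop_height, crop_width):
    """Center crop the last two (H, W) dimensions."""
    h, w = images.shape[-2:]
    top = (h - crop_height) // 2
    left = (w - crop_width) // 2
    return images[..., top:top + crop_height, left:left + crop_width]


def eval_center_crop(images, crop_height, crop_width, pos_enc=False):
    """Eval-time crop that keeps the channel count consistent with training."""
    if pos_enc:
        images = append_position_encoding(images)
    return center_crop_chw(images, crop_height, crop_width)


# Assumed input size for illustration: a batch of two 3x128x128 images.
batch = torch.zeros(2, 3, 128, 128)
out = eval_center_crop(batch, 116, 116, pos_enc=True)
print(tuple(out.shape))
```

With `pos_enc=True` the output has 5 channels, matching what the encoder was built for during training; with `pos_enc=False` it reduces to a plain center crop.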
Thank you very much for your help!