
"pos_enc" in "CropRandomizer" is applied during training but skipped during evaluation #262

@WhyxyxIsNotAnAvailableName

Description

First and foremost, I would like to express my sincere gratitude for your excellent work!
I have run into the following issue and suspect that it is a bug 🐛.

For training, I set the "pos_enc" option of "CropRandomizer" to true, as shown in the following config excerpt:

                "obs_randomizer_class": "CropRandomizer",
                "obs_randomizer_kwargs": {
                    "crop_height": 116,
                    "crop_width": 116,
                    "num_crops": 1,
                    "pos_enc": true
                }

With this setting, the input image has 5 channels (three RGB channels plus two positional-encoding channels).
However, during evaluation, the following function (/robomimic/models/obs_core.py) does not apply the positional encoding; it only performs a center crop.

    def _forward_in_eval(self, inputs):
        """
        Do center crops during eval
        """
        assert len(inputs.shape) >= 3 # must have at least (C, H, W) dimensions
        inputs = inputs.permute(*range(inputs.dim()-3), inputs.dim()-2, inputs.dim()-1, inputs.dim()-3)
        out = ObsUtils.center_crop(inputs, self.crop_height, self.crop_width)
        out = out.permute(*range(out.dim()-3), out.dim()-1, out.dim()-3, out.dim()-2)
        return out

In contrast, the training code path (/robomimic/models/obs_core.py) does handle positional encoding:

    def _forward_in(self, inputs):
        """
        Samples N random crops for each input in the batch, and then reshapes
        inputs to [B * N, ...].
        """
        assert len(inputs.shape) >= 3 # must have at least (C, H, W) dimensions
        out, _ = ObsUtils.sample_random_image_crops(
            images=inputs,
            crop_height=self.crop_height,
            crop_width=self.crop_width,
            num_crops=self.num_crops,
            pos_enc=self.pos_enc,
        )
        # [B, N, ...] -> [B * N, ...]
        return TensorUtils.join_dimensions(out, 0, 1)

So, if positional encoding is enabled during training, evaluation receives 3-channel images where the network expects 5 channels, and errors like the following occur:

rollout: env=PnPCounterToCab, horizon=500, use_goals=False, num_episodes=50

0%| | 0/50 [00:00<?, ?it/s]Rollout exception at episode number 0!
Traceback (most recent call last):
  File "/data3/xuyuxuan/robomimic/robomimic/utils/train_utils.py", line 563, in rollout_with_stats
    rollout_info = run_rollout(
  File "/data3/xuyuxuan/robomimic/robomimic/utils/train_utils.py", line 335, in run_rollout
    ac = policy(ob=policy_ob, goal=goal_dict) #, return_ob=True)
  File "/data3/xuyuxuan/robomimic/robomimic/algo/algo.py", line 689, in __call__
    ac = self.policy.get_action(obs_dict=ob, goal_dict=goal)
  File "/data3/xuyuxuan/robomimic/robomimic/algo/bc.py", line 785, in get_action
    output = self.nets["policy"](obs_dict, actions=None, goal_dict=goal_dict)
  File "/data3/xuyuxuan/miniforge3/envs/robocasa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data3/xuyuxuan/robomimic/robomimic/models/policy_nets.py", line 1334, in forward
    out = self.forward_train(obs_dict=obs_dict, actions=actions, goal_dict=goal_dict)
  File "/data3/xuyuxuan/robomimic/robomimic/models/policy_nets.py", line 1286, in forward_train
    outputs = MIMO_Transformer.forward(self, **forward_kwargs)
  File "/data3/xuyuxuan/robomimic/robomimic/models/obs_nets.py", line 1096, in forward
    transformer_inputs = TensorUtils.time_distributed(
  File "/data3/xuyuxuan/robomimic/robomimic/utils/tensor_utils.py", line 951, in time_distributed
    outputs = op(**inputs, **kwargs)
  File "/data3/xuyuxuan/miniforge3/envs/robocasa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data3/xuyuxuan/robomimic/robomimic/models/obs_nets.py", line 475, in forward
    self.nets[obs_group].forward(inputs[obs_group])
  File "/data3/xuyuxuan/robomimic/robomimic/models/obs_nets.py", line 256, in forward
    x = self.obs_nets[k](x, lang_emb=obs_dict[LANG_EMB_KEY])
  File "/data3/xuyuxuan/miniforge3/envs/robocasa/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data3/xuyuxuan/robomimic/robomimic/models/obs_core.py", line 296, in forward
    assert tuple(inputs.shape)[-ndim:] == tuple(self.input_shape)
AssertionError

0%| | 0/50 [00:09<?, ?it/s]

Epoch 0 Rollouts took -1s (avg) with results:
Env: PnPCounterToCab
{
"Return": -1,
"Success_Rate": -1,
"Time_Episode": -1,
"time": -1
}

BTW, the following section of the configuration file does not seem to take effect. Shouldn't it override the "pos_enc": true that I set?

    "meta": {
        "hp_base_config_file": "/data3/xuyuxuan/robomimic/robomimic/exps/templates/bc_transformer.json",
        "hp_keys": [
            "seed",
            "ds",
            "ckpt",
            "obsrandargs"
        ],
        "hp_values": [
            123,
            "human-combined",
            "single_and_multi_task_human_300pth",
            {
                "crop_height": 116,
                "crop_width": 116,
                "num_crops": 1,
                "pos_enc": false
            }
        ]
    }

In the end, I worked around this error by manually adding "pos_enc" handling to the "_forward_in_eval" function.
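For anyone hitting the same error, below is a minimal standalone sketch of this kind of workaround, not the exact patch. It assumes the positional encoding is two extra channels holding normalized (y, x) pixel coordinates, appended to the full image before cropping; please check this against what `ObsUtils.sample_random_image_crops` actually does. The helper names (`append_pos_enc`, `center_crop_chw`, `forward_in_eval_sketch`) and the 128x128 input size are hypothetical and are not part of robomimic.

```python
import torch


def append_pos_enc(images):
    """Append two channels of normalized (y, x) coordinates to [..., C, H, W] images.
    Assumption: this matches the encoding used on the training path."""
    h, w = images.shape[-2:]
    pos_y, pos_x = torch.meshgrid(
        torch.arange(h, device=images.device),
        torch.arange(w, device=images.device),
        indexing="ij",
    )
    pos = torch.stack((pos_y.float() / h, pos_x.float() / w))  # [2, H, W]
    # Broadcast over any leading (batch / time) dimensions.
    pos = pos.expand(*images.shape[:-3], -1, -1, -1)
    return torch.cat((images, pos.to(images.dtype)), dim=-3)


def center_crop_chw(images, crop_h, crop_w):
    """Center crop the last two (H, W) dimensions."""
    h, w = images.shape[-2:]
    top = (h - crop_h) // 2
    left = (w - crop_w) // 2
    return images[..., top:top + crop_h, left:left + crop_w]


def forward_in_eval_sketch(images, crop_h, crop_w, pos_enc):
    """Eval-time counterpart of the training path: optionally add the
    position channels, then center crop instead of random cropping."""
    if pos_enc:
        images = append_pos_enc(images)
    return center_crop_chw(images, crop_h, crop_w)


# Hypothetical 128x128 RGB batch; with pos_enc the output has 5 channels.
batch = torch.zeros(4, 3, 128, 128)
out = forward_in_eval_sketch(batch, 116, 116, pos_enc=True)
```

Appending the coordinates before cropping means the cropped position channels still refer to locations in the original image, which is what a network trained on random crops with positional encoding would presumably expect.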

Thank you very much for your help!
