Swin Transformer Object Detection: Pitfalls Encountered While Reproducing the Code


  1. Download the official Swin Transformer Object Detection code:

    Swin-Transformer-Object-Detection

  2. Install the required environment and packages by following the official README.

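  3. After installing mmcv and mmdet according to the official mmdetection instructions, testing the code hits the first pitfall: an error reports that the installed mmcv version (mmcv==1.4.2) is too high. The official documentation shows that this version is actually fine, so the version bounds in mmdet/__init__.py need to be edited: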

mmcv_minimum_version = '1.3.17'
mmcv_maximum_version = '1.5.0'
mmcv_version = digit_version(mmcv.__version__)

Keeping the minimum and maximum mmcv versions consistent with those listed by the official mmdetection project is enough.
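
To see why widening the bounds removes the error, here is a minimal sketch of a version range check. The helper names `parse_version` and `mmcv_in_range` are hypothetical stand-ins, not mmdet or mmcv functions, and the inclusive comparison on both ends is an assumption about how the check behaves.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn '1.4.2' into (1, 4, 2); hypothetical stand-in for mmcv's digit_version."""
    parts = []
    for field in version.split("."):
        # Keep only the leading digits of each field, e.g. '0rc1' -> 0.
        match = re.match(r"\d+", field)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def mmcv_in_range(installed: str, minimum: str, maximum: str) -> bool:
    """Return True when minimum <= installed <= maximum (inclusive bounds assumed)."""
    return parse_version(minimum) <= parse_version(installed) <= parse_version(maximum)


# Bounds from the edited mmdet/__init__.py accept mmcv 1.4.2.
print(mmcv_in_range("1.4.2", "1.3.17", "1.5.0"))
# A hypothetical, lower upper bound of 1.4.0 would reject it.
print(mmcv_in_range("1.4.2", "1.3.17", "1.4.0"))
```

Comparing tuples of integers rather than raw strings matters here: as strings, '1.3.17' would sort before '1.3.2', which is the wrong order for versions.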

  4. Once the packages are installed, the following error appears:
TypeError: MaskRCNN: SwinTransformer: __init__() got an unexpected keyword argument 'embed_dim'

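This error most likely means that Python is importing a different mmdet installation rather than the copy in this repository, whose SwinTransformer accepts embed_dim. As the official mmdetection instructions direct, run the following from the repository root: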

pip install -v -e .
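
One way to confirm that the editable install took effect is to check which file Python would load mmdet from. The sketch below uses only the standard library; `module_origin` and `is_inside` are hypothetical helper names, and it assumes the script is run from the repository root.

```python
import importlib.util
import os
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return os.path.abspath(spec.origin)


def is_inside(path: str, root: str) -> bool:
    """Return True when path lies under the directory root."""
    root = os.path.abspath(root)
    return os.path.commonpath([os.path.abspath(path), root]) == root


# Run from the repository root after `pip install -v -e .`.
origin = module_origin("mmdet")
if origin is None:
    print("mmdet is not importable in this environment")
else:
    print("mmdet resolves to", origin)
    print("inside current directory:", is_inside(origin, os.getcwd()))
```

If the reported path points into site-packages instead of the cloned repository, another mmdet installation is still being picked up first.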
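  5. You also need to install apex, which can be quite painful. Try the official installation procedure first; if it fails, download a copy of the apex source provided by another blogger and repeat the official procedure. Note that apex installation failures are almost always caused by a mismatch between the CUDA version and the torch version; I have been burned by this too often to go into it here. Assuming you have downloaded the apex files I used, proceed as follows: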
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
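
Because the CUDA and torch mismatch is the usual culprit, it can help to compare the two versions before building. This is a rough sketch, assuming `nvcc` is on the PATH and torch is installed; `parse_nvcc_release`, `versions_match`, and `report` are hypothetical helper names, and the comparison only looks at the major and minor numbers.

```python
import re
import subprocess
from typing import Optional


def parse_nvcc_release(nvcc_output: str) -> Optional[str]:
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def versions_match(torch_cuda: Optional[str], nvcc_release: Optional[str]) -> bool:
    """Compare major.minor of both versions; False when either is unknown."""
    if not torch_cuda or not nvcc_release:
        return False
    return torch_cuda.split(".")[:2] == nvcc_release.split(".")[:2]


def report() -> None:
    """Print both CUDA versions; needs torch and nvcc on this machine."""
    import torch  # imported here so the helpers above work without torch

    # Raises FileNotFoundError if nvcc is not on the PATH.
    output = subprocess.run(
        ["nvcc", "--version"], capture_output=True, text=True
    ).stdout
    nvcc_release = parse_nvcc_release(output)
    print("torch built with CUDA:", torch.version.cuda)
    print("nvcc release:", nvcc_release)
    print("match:", versions_match(torch.version.cuda, nvcc_release))
```

Calling `report()` on the training machine prints both versions; if they differ, installing a torch build that matches the local CUDA toolkit (or the reverse) is likely to be needed before the apex build succeeds.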
  6. Run the following command:
python ./tools/train.py configs/swin/cascade_mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_giou_4conv1f_adamw_1x_coco.py --cfg-options model.pretrained=./checkpoints/swin_tiny_patch4_window7_224.pth

It fails with the following error:

  File "/root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/pytorch/lib/python3.7/site-packages/mmcv/cnn/bricks/conv_module.py", line 203, in forward
    x = self.norm(x)
  File "/root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 493, in forward
    world_size = torch.distributed.get_world_size(process_group)
  File "/root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 620, in get_world_size
    return _get_group_size(group)
  File "/root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 219, in _get_group_size
    _check_default_pg()
  File "/root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/distributed/distributed_c10d.py", line 210, in _check_default_pg
    "Default process group is not initialized"
AssertionError: Default process group is not initialized
  7. The file configs/swin/cascade_mask_rcnn_swin_tiny_patch4_window7_mstrain_480-800_giou_4conv1f_adamw_1x_coco.py contains norm_cfg = dict(type='SyncBN', requires_grad=True). 'SyncBN' is meant for distributed training, so using it in single-GPU, non-distributed training triggers the error above.

    Changing every type='SyncBN' to type='BN' resolves it.
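
The same replacement can be expressed programmatically instead of editing each occurrence by hand. The sketch below works on plain nested dicts and lists; `replace_syncbn` is a hypothetical helper, and the `model` fragment is an illustrative shape rather than a copy of the real config.

```python
import copy
from typing import Any


def replace_syncbn(cfg: Any) -> Any:
    """Return a copy of a nested dict/list config with type='SyncBN' changed to 'BN'."""
    cfg = copy.deepcopy(cfg)

    def _walk(node: Any) -> None:
        if isinstance(node, dict):
            if node.get("type") == "SyncBN":
                node["type"] = "BN"
            for value in node.values():
                _walk(value)
        elif isinstance(node, (list, tuple)):
            for item in node:
                _walk(item)

    _walk(cfg)
    return cfg


# Hypothetical fragment shaped like the norm_cfg entries in the config file.
model = dict(
    roi_head=dict(
        bbox_head=[
            dict(type="ConvFCBBoxHead", norm_cfg=dict(type="SyncBN", requires_grad=True)),
            dict(type="ConvFCBBoxHead", norm_cfg=dict(type="SyncBN", requires_grad=True)),
        ]
    )
)
patched = replace_syncbn(model)
print(patched["roi_head"]["bbox_head"][0]["norm_cfg"])
```

Working on a deep copy leaves the original dictionary untouched, so the distributed configuration stays available for later multi-GPU runs.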



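Copyright notice: This is an original article by weixin_44777827, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.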