A Roundup of Bugs Encountered When Migrating PyTorch Code to the GPU

Post author: xfxia
Post published: July 19, 2023
Post category: Other

Everything ran fine on the CPU, but after migrating to the GPU the bugs came one after another.

Expected object of device type cuda but got device type cpu for argument

Expected object of backend CUDA but got backend CPU for sequence element 0 in sequence argument at position #1 'tensors'

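Both errors above come from input data or variables that were left on the CPU. Check the code carefully, including whether tensors created during padding have been moved to the GPU. If reading the code does not reveal the culprit, step through it with a debugger; that works, it is just slow.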
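The usual fix can be sketched as follows. This is a minimal illustration rather than the original code: it assumes a toy `nn.Linear` model and a hypothetical `pad` tensor, and it falls back to the CPU when no GPU is available. The point is that every tensor taking part in an operation, including one built at padding time, has to be on the same device as the model.

```python
import torch
import torch.nn as nn

# Fall back to CPU so the sketch also runs on machines without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy model standing in for the real one (assumption for illustration).
model = nn.Linear(4, 2).to(device)

# Tensors are created on the CPU unless a device is given.
batch = torch.randn(3, 4)

# Create the padding directly on the target device.
pad = torch.zeros(1, 4, device=device)

# torch.cat needs every tensor in the sequence on the same device,
# so the CPU batch is moved before concatenating.
padded = torch.cat([batch.to(device), pad], dim=0)
output = model(padded)
```

Creating tensors with `device=device` at construction time tends to be less error-prone than creating them on the CPU and remembering to call `.to(device)` afterwards.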

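More cases are covered at https://www.jb51.net/article/213830.htm

There are also the tensors created inside the model (once the model is moved, its internal tensors move with it, provided they are registered as parameters or buffers).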
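Which internal tensors follow the model can be checked with a small experiment. The sketch below uses a hypothetical `Scaler` module (not from the original post): a parameter and a registered buffer move with `.to(device)`, while a tensor stored as a plain attribute stays where it was created.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

class Scaler(nn.Module):  # hypothetical module, for illustration only
    def __init__(self):
        super().__init__()
        # Parameters are registered and follow model.to(device).
        self.weight = nn.Parameter(torch.ones(3))
        # Buffers are registered too, but are not trained.
        self.register_buffer("offset", torch.zeros(3))
        # A plain tensor attribute is NOT registered and stays on the CPU.
        self.plain = torch.zeros(3)

    def forward(self, x):
        # Tensors created in forward should take their device from the input.
        bias = torch.zeros(3, device=x.device)
        return x * self.weight + self.offset + bias

model = Scaler().to(device)
x = torch.randn(2, 3, device=device)
y = model(x)
```

After `.to(device)`, `model.offset` is on the target device while `model.plain` is still on the CPU, and only `weight` and `offset` appear in `model.state_dict()`.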

However, I had modified the model myself and ran into a type mismatch:

Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

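After some digging, the cause turned out to be a Python dict used when building the model. Submodules stored in a plain dict are not registered with the model, so they are not moved to the GPU automatically.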

The old version:

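# Layers stored in a plain dict are not registered as submodules
self.d_tempconv = {}
for window_size in self.window_sizes:
    self.d_tempconv[window_size] = TemporalConvoluation(self.cov_dim, self.mem_dim, window_size)
# When a layer is called later, its weights are still on the CPU
tempconv = self.d_tempconv[window_size](input)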

The fixed version:

self.d_tempconv = {}
for window_size in self.window_sizes:
    self.d_tempconv[window_size] = TemporalConvoluation(self.cov_dim, self.mem_dim, window_size)
# Wrap the layers in nn.ModuleDict so they are registered; keys must be strings
self.d_tempconv_new = nn.ModuleDict({str(key): value for key, value in self.d_tempconv.items()})
# Look the layer up with the string form of the key
tempconv = self.d_tempconv_new[str(window_size)](input)
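The difference between a plain dict and `nn.ModuleDict` can be checked by counting registered parameters. This is a minimal sketch: `nn.Conv1d` stands in for `TemporalConvoluation`, whose definition is not shown here, and the class names `WithDict` and `WithModuleDict` and the channel sizes are made up for illustration.

```python
import torch
import torch.nn as nn

class WithDict(nn.Module):  # hypothetical, mirrors the old version
    def __init__(self, window_sizes):
        super().__init__()
        # Plain dict: the layers are not registered as submodules.
        self.convs = {w: nn.Conv1d(8, 16, kernel_size=w) for w in window_sizes}

class WithModuleDict(nn.Module):  # hypothetical, mirrors the fixed version
    def __init__(self, window_sizes):
        super().__init__()
        # nn.ModuleDict registers each layer; its keys must be strings.
        self.convs = nn.ModuleDict(
            {str(w): nn.Conv1d(8, 16, kernel_size=w) for w in window_sizes}
        )

    def forward(self, x, window_size):
        return self.convs[str(window_size)](x)

window_sizes = [2, 3]
n_dict = len(list(WithDict(window_sizes).parameters()))
n_module_dict = len(list(WithModuleDict(window_sizes).parameters()))
print(n_dict, n_module_dict)  # 0 4 (a weight and a bias per Conv1d layer)

model = WithModuleDict(window_sizes)
out = model(torch.randn(1, 8, 10), window_size=3)
```

Layers that are not registered are skipped by `.to(device)`. They are also missing from `model.parameters()` and `model.state_dict()`, so an optimizer built from `model.parameters()` would not train them and a checkpoint would not save them.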

For details, see https://blog.csdn.net/luo3300612/article/details/97815207

Original link: https://blog.csdn.net/weixin_42164041/article/details/121720435
