Error message:
RuntimeError: Input and hidden tensors are not at the same device, found input tensor at cuda:0 and hidden tensor at cpu
The error says that the input tensor is on cuda while the hidden-state tensor is on the CPU. Yet when we check the model, it has clearly already been moved to cuda:
# Instantiate the model
model = LSTM(input_dim=input_dim, hidden_dim=hidden_dim, output_dim=output_dim, num_layers=num_layers)
# Move the model and the data to the GPU if one is available
if torch.cuda.is_available():
    device = 'cuda:0'
    model = model.to(device)
    trainX = trainX.to(device)
    trainY = trainY.to(device)
    testX = testX.to(device)
    testY = testY.to(device)
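To confirm that the weights really did end up on the GPU, you can check the device of the model's parameters directly. A minimal sanity check, assuming the variables defined above are in scope and a CUDA device is available:

print(next(model.parameters()).device)  # cuda:0 -> the LSTM and Linear weights were moved
print(trainX.device)                    # cuda:0 -> the input data was moved as well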
So where exactly is the problem?
Let's take a look at the model class:
input_dim = 5    # number of input features
hidden_dim = 32  # number of hidden units
num_layers = 2   # number of stacked LSTM layers
output_dim = 1   # dimension of the predicted value
class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        # Hidden dimensions
        self.hidden_dim = hidden_dim
        # Number of hidden layers
        self.num_layers = num_layers
        # Building your LSTM
        # batch_first=True causes input/output tensors to be of shape (batch_dim, seq_dim, feature_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        # Readout layer: a fully connected layer after the LSTM.
        # Since this is a regression problem, no activation follows the linear layer.
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Initialize hidden state with zeros; x.size(0) is the batch size
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim).requires_grad_()
        # Initialize cell state
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim).requires_grad_()
        # We need to detach as we are doing truncated backpropagation through time (BPTT)
        # If we don't, we'll backprop all the way to the start even after going through another batch
        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))
        out = self.fc(out)
        return out
In the forward function, the two tensors h0 and c0 are created as the initial states for the LSTM's first time step. They are allocated on the CPU, and since they are not parameters of the model (they are built from scratch every time forward runs), even the later call
model = model.to(device)
cannot move them to the GPU.
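This is easy to verify: .to(device) only moves what the module has registered, i.e. its parameters and buffers (exactly what shows up in model.state_dict()), while a bare torch.zeros(...) inside forward is created on the default device, the CPU. Below is a small illustration, a sketch independent of the LSTM above, assuming a CUDA device is available:

import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)      # registered parameter -> moved by .to(device)

    def forward(self, x):
        h = torch.zeros(x.size(0), 4)  # plain tensor -> created on the CPU every call
        return self.fc(x) + h

m = Demo().to('cuda:0')
print(next(m.parameters()).device)     # cuda:0
print(list(m.state_dict().keys()))     # only 'fc.weight' and 'fc.bias'; h is not tracked
m(torch.randn(2, 4, device='cuda:0'))  # RuntimeError: self.fc(x) is on cuda:0 but h is on the CPU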
Changing the program to the following fixes it:
if torch.cuda.is_available():
    device = 'cuda:0'
    trainX = trainX.to(device)
    trainY = trainY.to(device)
    testX = testX.to(device)
    testY = testY.to(device)
input_dim = 5    # number of input features
hidden_dim = 32  # number of hidden units
num_layers = 2   # number of stacked LSTM layers
output_dim = 1   # dimension of the predicted value
class LSTM(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_layers, output_dim):
        super(LSTM, self).__init__()
        # Hidden dimensions
        self.hidden_dim = hidden_dim
        # Number of hidden layers
        self.num_layers = num_layers
        # Building your LSTM
        # batch_first=True causes input/output tensors to be of shape (batch_dim, seq_dim, feature_dim)
        self.lstm = nn.LSTM(input_dim, hidden_dim, num_layers, batch_first=True)
        # Readout layer: a fully connected layer after the LSTM.
        # Since this is a regression problem, no activation follows the linear layer.
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Initialize hidden state with zeros on the same device as the input; x.size(0) is the batch size
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim, device=x.device).requires_grad_()
        # Initialize cell state on the same device as the input
        c0 = torch.zeros(self.num_layers, x.size(0), self.hidden_dim, device=x.device).requires_grad_()
        # We need to detach as we are doing truncated backpropagation through time (BPTT)
        # If we don't, we'll backprop all the way to the start even after going through another batch
        out, (hn, cn) = self.lstm(x, (h0.detach(), c0.detach()))
        out = self.fc(out)
        return out
With h0 and c0 now created on x.device, the program runs correctly.
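An even simpler alternative (not the route taken above, but supported by nn.LSTM itself): if no initial state is passed, the LSTM defaults h0 and c0 to zero tensors with the same dtype and device as the input, so forward can be reduced to a sketch like this:

def forward(self, x):
    # When no (h0, c0) is passed, nn.LSTM uses zero initial states on x's device
    out, (hn, cn) = self.lstm(x)
    out = self.fc(out)
    return out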