What is the convolution kernel size of ChebyshevGCN?
The NIPS 2016 paper by Defferrard and his team, "Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering", is essential introductory reading for learning GCNs. The core idea of the paper is to approximate the spectral GCN convolution kernel with a truncated Chebyshev polynomial expansion, evaluated iteratively, thereby bypassing the heavy computation required by the eigendecomposition of the Laplacian. The computational complexity drops substantially as a result, and the convolution becomes much faster.
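Concretely, the paper approximates the spectral filter by a truncated Chebyshev expansion over a rescaled Laplacian:

$$g_\theta(L) \approx \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L}), \qquad \tilde{L} = \frac{2L}{\lambda_{\max}} - I_M$$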
The code accompanying the paper uses several inter-inheriting classes and is fairly complex. I refactored the original code into a single simple class to make it more concise; for details see my GitHub: https://github.com/leesendy/neu_gcn — if you find it useful, please give it a star.
Convolution kernel in ChebyNet
The code is as follows:
# Method excerpt from the model class; assumes TF 1.x (import tensorflow as tf),
# NumPy (import numpy as np), scipy.sparse, and the `graph` helper module
# from the original cnn_graph repository.
def chebyshev5(self, x, L, Fout, K):
    N, M, Fin = x.get_shape()
    N, M, Fin = int(N), int(M), int(Fin)
    # Rescale Laplacian and store as a TF sparse tensor. Copy to not modify the shared L.
    L = scipy.sparse.csr_matrix(L)
    L = graph.rescale_L(L, lmax=2)  # L_tilde = 2L/lmax - I, eigenvalues in [-1, 1]
    L = L.tocoo()
    indices = np.column_stack((L.row, L.col))
    L = tf.SparseTensor(indices, L.data, L.shape)
    L = tf.sparse_reorder(L)
    # Transform to Chebyshev basis
    x0 = tf.transpose(x, perm=[1, 2, 0])  # M x Fin x N
    x0 = tf.reshape(x0, [M, Fin * N])  # M x Fin*N
    x = tf.expand_dims(x0, 0)  # 1 x M x Fin*N
    def concat(x, x_):
        x_ = tf.expand_dims(x_, 0)  # 1 x M x Fin*N
        return tf.concat([x, x_], axis=0)  # K x M x Fin*N
    if K > 1:
        x1 = tf.sparse_tensor_dense_matmul(L, x0)
        x = concat(x, x1)
    for k in range(2, K):
        # Chebyshev recurrence: T_k = 2*L*T_{k-1} - T_{k-2}
        x2 = 2 * tf.sparse_tensor_dense_matmul(L, x1) - x0  # M x Fin*N
        x = concat(x, x2)
        x0, x1 = x1, x2
    x = tf.reshape(x, [K, M, Fin, N])  # K x M x Fin x N
    x = tf.transpose(x, perm=[3, 1, 2, 0])  # N x M x Fin x K
    x = tf.reshape(x, [N * M, Fin * K])  # N*M x Fin*K
    # Filter: Fin*Fout filters of order K, i.e. one filterbank per feature pair.
    W = self._weight_variable([Fin * K, Fout], regularization=False)
    x = tf.matmul(x, W)  # N*M x Fout
    return tf.reshape(x, [N, M, Fout])  # N x M x Fout
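The loop in chebyshev5 implements the Chebyshev recurrence on the rescaled Laplacian (x0 and x1 in the code hold the two previous basis terms); this recurrence is what replaces the eigendecomposition:

$$\bar{x}_0 = x, \qquad \bar{x}_1 = \tilde{L}x, \qquad \bar{x}_k = 2\tilde{L}\bar{x}_{k-1} - \bar{x}_{k-2}$$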
Explanation:
x: input features
size: (N, M, Fin), i.e. batch_size, num_of_node, length_of_feature_per_node
L: Laplacian matrix
size: (M, M)
W: convolution kernel
size: (Fin*K, Fout), where Fout is the output dimension, analogous to the output dimension of a dense layer
Workflow:
First, transpose and reshape x so that it can be matrix-multiplied with L.
Concatenate the result with x (axis=0).
If K > 1, multiply by L again, subtract the result from two steps back (the Chebyshev recurrence), and concatenate; after reshaping, the stacked tensor has size (N, M, Fin, K).
Multiply with the convolution kernel W to obtain the convolution result, of size (N, M, Fout); see the sketch after this list.
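To make the shape bookkeeping concrete, here is a minimal NumPy sketch of the same recurrence for a single input (an illustration under simplified assumptions, not the author's code; chebyshev_basis and the placeholder L_tilde are made up for this example):

# Stack [T_0(L)X, ..., T_{K-1}(L)X], then flatten to (M, Fin*K) as in chebyshev5.
import numpy as np

def chebyshev_basis(L_tilde, X, K):
    """L_tilde: (M, M) rescaled Laplacian; X: (M, Fin) node features."""
    xs = [X]                                      # T_0(L)X = X
    if K > 1:
        xs.append(L_tilde @ X)                    # T_1(L)X = LX
    for _ in range(2, K):
        xs.append(2 * L_tilde @ xs[-1] - xs[-2])  # T_k = 2L T_{k-1} - T_{k-2}
    return np.stack(xs)                           # (K, M, Fin)

M, Fin, Fout, K = 5, 3, 4, 3
L_tilde = np.eye(M) * 0.5                         # placeholder rescaled Laplacian
X = np.random.randn(M, Fin)
Xc = chebyshev_basis(L_tilde, X, K).transpose(1, 2, 0).reshape(M, Fin * K)
W = np.random.randn(Fin * K, Fout)                # the (Fin*K, Fout) kernel
out = Xc @ W                                      # (M, Fout)
print(out.shape)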
What is the convolution kernel size of 1st-ChebyshevGCN/GCN?
The GCN proposed by Thomas Kipf is widely regarded as the starting point of practical GCNs; some call it 1st-ChebyshevGCN. The paper truncates the Chebyshev polynomial approximation at K = 1, reducing the original K-term sum to a two-term sum and greatly speeding up computation.
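For reference, the layer-wise propagation rule from Kipf & Welling's paper, with the renormalization trick applied, is:

$$H^{(l+1)} = \sigma\!\left(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2} H^{(l)} W^{(l)}\right), \qquad \tilde{A} = A + I_M, \quad \tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$$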
Convolution kernel in GCN:
The source code is as follows:
class GraphConvolution(Layer):
    """Graph convolution layer."""
    def __init__(self, input_dim, output_dim, placeholders, dropout=0.,
                 sparse_inputs=False, act=tf.nn.relu, bias=False,
                 featureless=False, **kwargs):
        super(GraphConvolution, self).__init__(**kwargs)
        if dropout:
            self.dropout = placeholders['dropout']
        else:
            self.dropout = 0.
        self.act = act
        self.support = placeholders['support']
        self.sparse_inputs = sparse_inputs
        self.featureless = featureless
        self.bias = bias
        # helper variable for sparse dropout
        self.num_features_nonzero = placeholders['num_features_nonzero']
        with tf.variable_scope(self.name + '_vars'):
            # one weight matrix per support, i.e. per hop of the adjacency
            for i in range(len(self.support)):
                self.vars['weights_' + str(i)] = glorot([input_dim, output_dim],
                                                        name='weights_' + str(i))
            if self.bias:
                self.vars['bias'] = zeros([output_dim], name='bias')
        if self.logging:
            self._log_vars()

    def _call(self, inputs):
        x = inputs
        # dropout
        if self.sparse_inputs:
            x = sparse_dropout(x, 1 - self.dropout, self.num_features_nonzero)
        else:
            x = tf.nn.dropout(x, 1 - self.dropout)
        # convolve: support_i = L_i (X W_i), then sum over supports
        supports = list()
        for i in range(len(self.support)):
            if not self.featureless:
                pre_sup = dot(x, self.vars['weights_' + str(i)],
                              sparse=self.sparse_inputs)
            else:
                pre_sup = self.vars['weights_' + str(i)]
            support = dot(self.support[i], pre_sup, sparse=True)
            supports.append(support)
        output = tf.add_n(supports)
        # batch normalization, added on top of Kipf's original layer
        mean, var = tf.nn.moments(output, [0, 1], name='moments')
        output = tf.nn.batch_normalization(output, mean, var, offset=None,
                                           scale=None, variance_epsilon=1e-5)
        # bias
        if self.bias:
            output += self.vars['bias']
        return self.act(output)
Explanation:
x: input feature matrix (M x input_dim), where M is the number of graph nodes and input_dim is the length of each node's input feature
W: convolution kernel (input_dim, output_dim); note that there are K of them, where K corresponds to K-hop neighborhoods
L: adjacency (support) matrix (M x M); there are likewise K of these, with K meaning the same as above
Workflow:
1. Multiply the feature matrix X by each convolution kernel W1, W2, W3, ..., Wk.
2. This yields the intermediate results (XW1, XW2, XW3, ..., XWk).
3. Left-multiply each result by the corresponding K-hop support matrix L1, L2, L3, ..., Lk.
4. This yields (L1XW1, L2XW2, L3XW3, ..., LkXWk).
5. Sum them: L1XW1 + L2XW2 + L3XW3 + ... + LkXWk; the result has size (M x Fout). A NumPy sketch of this forward pass follows below.
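As a sanity check on the shapes, here is a minimal NumPy sketch of this forward pass (illustrative only, not the repo's code; the identity supports stand in for the real normalized adjacency matrices):

# output = sum_i L_i @ X @ W_i, the multi-support graph convolution above
import numpy as np

M, input_dim, output_dim, K = 6, 4, 3, 2
X = np.random.randn(M, input_dim)                 # node feature matrix
supports = [np.eye(M) for _ in range(K)]          # placeholder L_1..L_K
weights = [np.random.randn(input_dim, output_dim) for _ in range(K)]

output = sum(L @ X @ W for L, W in zip(supports, weights))  # (M, output_dim)
print(output.shape)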
Reflections:
A. The facts show that Chebyshev-GCN really is a step forward in bringing CNNs to the graph domain. At its core it is still convolution; the difference is that before the convolution, the features/data are multiplied repeatedly by the Laplacian matrix. Thanks to K, features at multiple neighborhood scales are captured within a single convolution, which strengthens the GCN's ability to extract data features.
B. 1st-ChebyshevGCN splits the original large convolution kernel into multiple 'channels' (the analogy may not be exact), which significantly reduces the kernel size and speeds up convergence.
C. Node-level and graph-level tasks are not fundamentally different; identifying key nodes and extracting meaningful node features and edge connections is what really matters. For graph-level tasks, differences between graph embeddings are more likely to stem from differences in the node features themselves; for node-level tasks, the edge connections matter more.