Fixing an embedding error in TensorFlow



Today, while using tf.contrib.layers.embed_sequence in TensorFlow to embed some input, I got the following error:

InvalidArgumentError (see above for traceback): indices[1,2] = 6 is not in [0, 6)
	 [[Node: EmbedSequence_8/embedding_lookup = Gather[Tindices=DT_INT32, Tparams=DT_FLOAT, _class=["loc:@EmbedSequence_8/embeddings"], validate_indices=true, _dev
ice="/job:localhost/replica:0/task:0/cpu:0"](EmbedSequence_8/embeddings/read, EmbedSequence_8/embedding_lookup/indices)]]

After looking through some references, I found the cause: vocab_size has to be increased by 1. The reason is that vocab_to_int encodes words starting from 1, with 0 reserved as the padding value for reviews that are shorter than the fixed length, so the largest index equals n_words and requires a table of n_words + 1 rows. The corrected code is below:

import tensorflow as tf

# Word ids run from 1 to 6; 0 is reserved for padding.
features = [[1, 2, 3], [4, 5, 6]]
n_words = 6

# vocab_size must be n_words + 1 so that the largest index (6)
# falls inside the valid range [0, vocab_size).
outputs = tf.contrib.layers.embed_sequence(features, vocab_size=n_words + 1, embed_dim=4)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    a = sess.run(outputs)
    print(a)

The resulting word vectors:

[[[ 0.42639822 -0.45257723  0.44895023  0.17683214]
  [ 0.68834776  0.25755352  0.18518716 -0.36953419]
  [-0.20138246 -0.35034212  0.44844049  0.3326121 ]]

 [[-0.55106479 -0.64119202 -0.06463015 -0.68032914]
  [ 0.58467633  0.58155423  0.63106912  0.17282218]
  [ 0.46636218 -0.73744893  0.38337153  0.64258808]]]
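The root cause is easy to see outside TensorFlow as well: an embedding lookup is just row indexing into a matrix of shape (vocab_size, embed_dim), so every index must satisfy 0 <= index < vocab_size. A minimal NumPy sketch of that indexing rule (my own illustration, not from the original post):

```python
import numpy as np

vocab_size = 6 + 1   # n_words + 1: ids 1..6 plus 0 for padding
embed_dim = 4

# The embedding table: one row of weights per vocabulary id.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((vocab_size, embed_dim))

features = np.array([[1, 2, 3], [4, 5, 6]])

# A lookup is a row gather; it is valid only because every id in
# features is strictly less than vocab_size.
assert features.max() < vocab_size
vectors = embeddings[features]
print(vectors.shape)  # (2, 3, 4): one embed_dim vector per input id

# With a table of only 6 rows, id 6 would fall outside [0, 6) --
# the same out-of-range condition TensorFlow reports above.
```

This is why the error message says `indices[1,2] = 6 is not in [0, 6)`: element [1,2] of features is 6, but a table built with vocab_size=6 only accepts indices 0 through 5.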



Copyright notice: This is an original article by momaojia, released under the CC 4.0 BY-SA license. Please include a link to the original source and this notice when reposting.