TensorFlow's checkpoint mechanism lets it support both Online Learning and Continuous Learning. First, use tf.train.Saver() to save a trained model, or one that is still training, as a checkpoint:
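saver = tf.train.Saver()  # create once, after the graph has been built
_, loss_value, step = sess.run([train_op, loss, global_step])
# global_step=step appends "-<step>" to the checkpoint file name
saver.save(sess, "./checkpoint/checkpoint.ckpt", global_step=step)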
Then restore() loads the model from the local checkpoint files. Training can also resume from that point, which is what Continuous Learning refers to here:
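# look up the most recent checkpoint recorded in the directory
ckpt = tf.train.get_checkpoint_state("./checkpoint/")
if ckpt and ckpt.model_checkpoint_path:
    print("Continue training from the model {}".format(ckpt.model_checkpoint_path))
    saver.restore(sess, ckpt.model_checkpoint_path)
# run the next training step; after a restore it continues from the saved state
_, loss_value, step = sess.run([train_op, loss, global_step])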
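To make the bookkeeping behind this resume step more concrete, here is a minimal sketch in plain Python that does not use TensorFlow at all. It imitates two behaviours of the real mechanism: the step number is appended to the checkpoint file name, and a small index file in the directory records which checkpoint is the latest. The JSON format, the index file name `latest.json`, and the helpers `save_checkpoint`, `load_latest`, and `train` are all hypothetical and exist only for illustration. TensorFlow's own index file is named `checkpoint` and its on-disk format is different.

```python
import json
import os
import tempfile

# Hypothetical index file; it plays the role of TensorFlow's "checkpoint" file.
STATE_FILE = "latest.json"


def save_checkpoint(ckpt_dir, prefix, step, params):
    """Write params to '<prefix>-<step>.json' and record it as the latest checkpoint."""
    os.makedirs(ckpt_dir, exist_ok=True)
    name = "{}-{}.json".format(prefix, step)
    with open(os.path.join(ckpt_dir, name), "w") as f:
        json.dump({"step": step, "params": params}, f)
    with open(os.path.join(ckpt_dir, STATE_FILE), "w") as f:
        json.dump({"model_checkpoint_path": name}, f)


def load_latest(ckpt_dir):
    """Return the latest saved state, or None when no checkpoint exists yet."""
    index = os.path.join(ckpt_dir, STATE_FILE)
    if not os.path.exists(index):
        return None
    with open(index) as f:
        name = json.load(f)["model_checkpoint_path"]
    with open(os.path.join(ckpt_dir, name)) as f:
        return json.load(f)


def train(ckpt_dir, num_steps):
    """Run num_steps of a toy update, resuming from the latest checkpoint if any."""
    state = load_latest(ckpt_dir) or {"step": 0, "params": {"w": 0.0}}
    step, w = state["step"], state["params"]["w"]
    for _ in range(num_steps):
        w -= 0.1 * 2.0 * (w - 3.0)  # gradient step on the toy loss (w - 3)^2
        step += 1
    save_checkpoint(ckpt_dir, "checkpoint.ckpt", step, {"w": w})
    return step, w


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    print(train(demo_dir, 5))  # first run starts from step 0
    print(train(demo_dir, 5))  # second run resumes from the saved step
```

Calling `train` twice on the same directory shows the point of the pattern: the second call picks up the saved step counter and parameter instead of starting over.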
Finally, tf.trainable_variables() returns the variables that the model trains:
for var in tf.trainable_variables():
    print(var.name)
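The snippets above can be combined into one script along the following lines. This is a sketch under several assumptions, not the author's original program: it assumes TensorFlow 2.x, where the graph-mode API used in this article is available under `tf.compat.v1`, and it swaps in a toy one-variable model (`w`, with loss `(w - 3)^2`) because the article does not show its own. The `train` helper and its arguments are hypothetical.

```python
import os
import tempfile

import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # the article's snippets are graph-mode code


def train(checkpoint_dir, num_steps):
    """Train a toy model for num_steps, resuming from checkpoint_dir if possible."""
    tf1.reset_default_graph()
    # Toy model (an assumption): one scalar variable pulled towards 3.0.
    w = tf1.get_variable("w", shape=[], initializer=tf1.zeros_initializer())
    loss = tf.square(w - 3.0)
    global_step = tf1.train.get_or_create_global_step()
    optimizer = tf1.train.GradientDescentOptimizer(0.1)
    train_op = optimizer.minimize(loss, global_step=global_step)
    saver = tf1.train.Saver()

    os.makedirs(checkpoint_dir, exist_ok=True)
    with tf1.Session() as sess:
        sess.run(tf1.global_variables_initializer())
        # Restore the latest checkpoint if the directory already holds one.
        ckpt = tf1.train.get_checkpoint_state(checkpoint_dir)
        if ckpt and ckpt.model_checkpoint_path:
            print("Continue training from the model {}".format(ckpt.model_checkpoint_path))
            saver.restore(sess, ckpt.model_checkpoint_path)
        for _ in range(num_steps):
            sess.run([train_op, loss])
        # Read the step counter after the loop so the value is unambiguous.
        step = int(sess.run(global_step))
        saver.save(sess, os.path.join(checkpoint_dir, "checkpoint.ckpt"), global_step=step)
        names = [var.name for var in tf1.trainable_variables()]
    return step, names


if __name__ == "__main__":
    demo_dir = tempfile.mkdtemp()
    print(train(demo_dir, 5))
    print(train(demo_dir, 5))  # resumes from the checkpoint written by the first call
```

The global step is created as a non-trainable variable, so it is saved and restored with the checkpoint but does not appear in the list returned by `trainable_variables()`.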