Getting Started with TensorFlow: MNIST Handwritten Digit Recognition



I recently started looking into machine learning. Writing an entire model from scratch isn't really practical, so a framework is the way to go. After some searching I settled on Google's TensorFlow, which is reasonably easy to get started with and has plenty of reference material online.

The official tutorial site I followed:

tensorflow


For the environment I used Python 3.6. I strongly recommend installing Anaconda, a Python package manager that is very convenient to use; switching Python versions is simple, and package management is all done through a GUI.

Download the Anaconda installer from its official site; once it's installed, open the Navigator, which looks like this:

(Screenshot: installing TensorFlow from Anaconda Navigator)

Type tensorflow into the search box and install it directly.
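Once that finishes, a quick sanity check (not part of the tutorial) is to open a Python prompt in the same environment and confirm the package imports:

import tensorflow as tf
print(tf.__version__)   # prints the installed version if the conda install worked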

For the IDE I used VSCode; just install the Python extension and it's ready to go.

Then on to the introductory tutorial:

First, the dataset has to be downloaded from the MNIST website. Here is the link:


MNIST dataset download page


train-images-idx3-ubyte.gz: training set images (handwritten digits)

train-labels-idx1-ubyte.gz: training set labels (the answers corresponding to the images)

t10k-images-idx3-ubyte.gz: test set images

t10k-labels-idx1-ubyte.gz: test set labels

After downloading and extracting, you get binary files in the IDX format, which we parse with Python's struct module and convert into arrays that TensorFlow can consume. As the tutorial shows, the image set becomes a [60000, 784] matrix, where each row of 784 values is one image, i.e. a 28*28 picture flattened into a one-dimensional array. These rows are the training samples x, and the basic model is y = Wx + b, where W is the weight matrix that training keeps pushing toward an optimal value and b is the bias. This is all introductory material, so I won't go into more detail here.
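To make those shapes concrete, here is a small NumPy-only sketch (not part of the tutorial code). Note that in the TensorFlow code further down the product is written as xW rather than Wx, because each sample is laid out as a row vector:

import numpy as np

image = np.random.rand(28, 28)   # stand-in for one grayscale digit image
x = image.reshape(1, 28 * 28)    # flatten to a row vector of shape (1, 784)

W = np.zeros((784, 10))          # one weight column per digit class
b = np.zeros(10)                 # one bias per digit class

y = x @ W + b                    # shape (1, 10): a score for each digit 0-9
print(y.shape)                   # (1, 10)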

Parsing the dataset is a bit tedious, so here is the code directly:


import struct
import numpy as np

class DP:
    # parse the IDX image file: a 16-byte header (magic, count, rows, cols), then one byte per pixel
    def read_train_image(self, filename):
        index = 0
        binfile = open(filename,'rb')
        buf = binfile.read()
        magic, self.train_img_num, self.numRows,self.numColums = struct.unpack_from('>IIII',buf,index)
        self.train_img_list = np.zeros((self.train_img_num, 28 * 28))
        index += struct.calcsize('>IIII')
        # print (magic, ' ', self.train_img_num, ' ', self.numRows, ' ', self.numColums)

        for i in range(self.train_img_num):
            im = struct.unpack_from('>784B',buf,index)
            index += struct.calcsize('>784B')
            im = np.array(im)
            # print(im)
            im = im/255
            im = im.reshape(1,28*28)
            # im = im.reshape(28,28)
            self.train_img_list[i,:] = im
            # plt.imshow(im,cmap='binary')
            # plt.show()
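    # parse the IDX label file: an 8-byte header (magic, count), then one byte per label,
    # with each digit converted into a one-hot vector of length 10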
    def read_train_lable(self,filename):
        index = 0
        binfile = open(filename,'rb')
        buf = binfile.read()
        magic, self.train_label_num = struct.unpack_from('>II',buf,index)
        self.train_label_list = np.zeros((self.train_label_num, 10))
        index += struct.calcsize('>II')
        # print(magic, self.train_label_num)
        for i in range(self.train_label_num):
            lblTemp = np.zeros(10)
            lbl = struct.unpack_from('>1B',buf,index)
            index += struct.calcsize('>1B')
            lbl = np.array(lbl)
            lblTemp[lbl[0]] = 1
            self.train_label_list[i,:] = lblTemp
            # print(lblTemp)

    def next_batch_image(self, batchCount):
        # pick a random start index so the slice of batchCount consecutive samples stays in range
        rnd = np.random.randint(0, self.train_img_num - batchCount)
        return self.train_img_list[rnd:rnd+batchCount], self.train_label_list[rnd:rnd+batchCount]

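    # the two test-set readers below mirror read_train_image / read_train_lable,
    # just filling test_img_list and test_label_list for the 10000 test samples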
    def read_test_image(self,filename):
        index = 0
        binfile = open(filename,'rb')
        buf = binfile.read()
        magic, self.test_img_num, self.numRows,self.numColums = struct.unpack_from('>IIII',buf,index)
        self.test_img_list = np.zeros((self.test_img_num, 28 * 28))
        index += struct.calcsize('>IIII')
        # print (magic, ' ', self.test_img_num, ' ', self.numRows, ' ', self.numColums)

        for i in range(self.test_img_num):
            im = struct.unpack_from('>784B',buf,index)
            index += struct.calcsize('>784B')
            im = np.array(im)
            im = im/255
            im = im.reshape(1,28*28)
            # im = im.reshape(28,28)
            self.test_img_list[i,:] = im

    def read_test_lable(self,filename):
        index = 0
        binfile = open(filename,'rb')
        buf = binfile.read()
        magic, self.test_label_num = struct.unpack_from('>II',buf,index)
        self.test_label_list = np.zeros((self.test_label_num, 10))
        index += struct.calcsize('>II')
        # print(magic, self.test_label_num)  #train
        for i in range(self.test_label_num):
            lblTemp = np.zeros(10)
            lbl = struct.unpack_from('>1B',buf,index)
            index += struct.calcsize('>1B')
            lbl = np.array(lbl)
            lblTemp[lbl[0]] = 1
            self.test_label_list[i,:] = lblTemp
            # print(lblTemp)

The four read methods each parse one of the four files into flattened arrays stored on the object. Because the sample size is limited and the model is simple, the tutorial trains with stochastic (mini-batch) gradient descent, so there is a next_batch method that returns 100 consecutive samples starting at a random index. This could be improved to draw non-consecutive samples, as sketched below.
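A minimal sketch of that improvement, as a hypothetical extra method on the DP class: use np.random.choice to pick the batch indices without replacement, so the samples are no longer one contiguous slice.

    def next_batch_random(self, batchCount):
        # sample batchCount distinct indices instead of a contiguous slice
        idx = np.random.choice(self.train_img_num, batchCount, replace=False)
        return self.train_img_list[idx], self.train_label_list[idx]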

Next comes the TensorFlow side, the training model code:



import tensorflow as tf

def tfOperate():

    filename_t_image  = "D:\\PY_Image\\handnum\\train-images.idx3-ubyte"
    filename_t_label  = "D:\\PY_Image\\handnum\\train-labels.idx1-ubyte"
    filename_test_image  = "D:\\PY_Image\\handnum\\t10k-images.idx3-ubyte"
    filename_test_label  = "D:\\PY_Image\\handnum\\t10k-labels.idx1-ubyte"
    t = DP()
    t.read_train_image(filename_t_image)
    t.read_train_lable(filename_t_label)
    t.read_test_image(filename_test_image)
    t.read_test_lable(filename_test_label)

    # placeholder for the input images: shape [None, 784], one flattened 28*28 image per row
    x = tf.placeholder("float",[None,784])
    # weights
    W = tf.Variable(tf.zeros([784,10]))
    # bias
    b = tf.Variable(tf.zeros([10]))

    # the model: softmax regression, y has shape [None, 10] (a probability per digit class)
    y =  tf.nn.softmax(tf.matmul(x,W) + b)

    # cross-entropy as the cost / loss
    y_ = tf.placeholder("float", [None,10])
    cross_entropy = -tf.reduce_sum(y_*tf.log(y))
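    # note: tf.log(y) can produce NaN if a predicted probability hits exactly 0; the tutorial's
    # formula is kept as-is here, but tf.nn.softmax_cross_entropy_with_logits is the safer option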

    # gradient descent to minimize the cross-entropy
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

    # init = tf.initialize_all_variables()
    init = tf.global_variables_initializer()

    sess = tf.Session()
    sess.run(init)

    for i in range (3000):
        batch_xs,batch_ys = t.next_batch_image(100)
        # print(batch_ys.shape)
        sess.run(train_step,feed_dict = {x:batch_xs,y_:batch_ys})
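        # every 500 steps, evaluate accuracy on the full test set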
        if (i%500 == 0 and i >0):
            correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
            accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
            print (sess.run(accuracy, feed_dict={x: t.test_img_list, y_: t.test_label_list}))

The last three lines evaluate the recognition accuracy on the test set, which comes out to roughly 0.91.
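To actually run the training, the function still has to be called somewhere; assuming the DP class and tfOperate sit in the same script, a minimal entry point would be:

if __name__ == "__main__":
    tfOperate()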

One last remark: to really understand these formulas you need at least introductory linear algebra, probability theory, and calculus.

For linear algebra, MIT professor Gilbert Strang's open course on linear algebra is a good choice; NetEase Open Courses has a subtitled version, and the other two subjects are covered there as well, so study them as needed. This post is a bit rough; it's mainly meant as a set of notes.

The road ahead is long; I'll keep learning step by step.

Reference for the dataset parsing code:


http://blog.csdn.net/supercally/article/details/54236658



Copyright notice: this is an original article by cl6348, released under the CC 4.0 BY-SA license. Please include a link to the original source and this notice when reposting.