TensorFlow-Examples-LOR

Posted on 2019-02-22 Edited on 2023-10-19 In tensorflow

Learning TensorFlow from TensorFlow-Examples, part 2: logistic_regression.

BasicModels: logistic_regression.py

Logistic regression is a generalized linear model.

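Logistic regression uses the same linear predictor as linear regression. The difference is that linear regression takes the value of the linear equation directly as its result, while logistic regression passes that value through an activation function (sigmoid, or softmax in the multi-class case) to turn it into a different kind of output. The code structure of the two is therefore very similar. The most essential difference is probably that linear regression handles regression problems (predicting a continuous value), while logistic regression handles classification problems. For example: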

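Given N digit images and, for each image, the probability that the digit written on it is '0', the pixels can serve as the independent variable x and the probability of '0' as the dependent variable y, so linear regression can be used to predict y = W*x + b.
Given N digit images and the digit written on each, the dependent variable is now a digit label: a class rather than a continuous quantity the computer can derive directly from the pixels, so the digit itself should not be used as y. Instead, first construct y = W*x + b, where y now has 10 values, the i-th being the score for the image showing digit i. Then compare the scores (for example, take the largest) to get the prediction. This is the idea behind logistic regression.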
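The second case can be sketched in a few lines of plain Python. This is only an illustration under assumed toy values: the 3-"pixel" input, the 2 classes, and the helper names `linear_scores` and `softmax` are hypothetical and are not part of the original example, which uses 784 pixels and 10 classes.

```python
import math

def linear_scores(x, W, b):
    """Compute y = x·W + b for one sample; W has one column per class."""
    return [sum(x[i] * W[i][k] for i in range(len(x))) + b[k]
            for k in range(len(b))]

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy problem: 3 input values, 2 classes.
x = [1.0, 0.0, 2.0]
W = [[0.5, -0.5],
     [0.1, 0.3],
     [-0.2, 0.4]]
b = [0.0, 0.1]

scores = linear_scores(x, W, b)      # one score per class
probs = softmax(scores)              # scores turned into probabilities
predicted = probs.index(max(probs))  # the most probable class is the prediction
print(scores, probs, predicted)
```

Softmax does not change which class has the largest score; it only rescales the scores so they can be read as probabilities and compared against one-hot labels.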

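As in the linear regression example, this example feeds the training set to the optimizer in pieces (mini-batches) rather than all at once, but here the reason is somewhat easier to see:
- With a fixed training set, splitting it into subsets increases the number of parameter updates per pass over the data, and seeing a different subset at each step can help keep the model from overfitting the training set as a whole.
- For reference: when an activation function is not zero-centered, the neurons in the next layer receive inputs with a non-zero mean. One consequence is that during backpropagation the weights w are all updated in the positive direction or all in the negative direction, and this coupling slows convergence. When training by batch, however, different batches may produce different signals, so the problem is alleviated somewhat.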
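The batch splitting described above can be sketched as follows. This is a minimal illustration, not the implementation of `mnist.train.next_batch`: the function name `iterate_minibatches`, the fixed seed, and the choice to drop the incomplete final batch are assumptions made for the sketch.

```python
import random

def iterate_minibatches(samples, batch_size, seed=0):
    """Yield shuffled, non-overlapping batches covering one pass over the data.

    Samples that do not fill a complete batch are dropped, which mirrors
    total_batch = int(num_examples / batch_size) in the example.
    """
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)  # shuffle indices, not the data itself
    total_batch = len(samples) // batch_size
    for i in range(total_batch):
        idx = order[i * batch_size:(i + 1) * batch_size]
        yield [samples[j] for j in idx]

# Hypothetical data set of 10 samples, split into batches of 3.
samples = list(range(10))
batches = list(iterate_minibatches(samples, batch_size=3))
for batch in batches:
    # In real training, one optimizer step would run here per batch.
    print(batch)
```

With 10 samples and a batch size of 3, one pass produces 3 parameter updates instead of the single update a full-batch step would give.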

Annotated walkthrough of logistic_regression.py; the code is kept up to date.

'''
A logistic regression learning algorithm example using TensorFlow library.
This example is using the MNIST database of handwritten digits
(http://yann.lecun.com/exdb/mnist/)

Author: Aymeric Damien
Project: https://github.com/aymericdamien/TensorFlow-Examples/
'''
from __future__ import print_function

import tensorflow as tf

# Import MNIST data
# This example uses MNIST, a dataset of handwritten digits
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

# tf Graph Input
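# Each image in the dataset is 28*28 pixels; the label is one of the 10 digits 0-9
# None leaves the first dimension unspecified so it can be any size; here it is the
# number of samples fed in per run (batch_size during training)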
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
# Create the variables
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
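# Regression equation Y = x·W + b; Y is a tensor of shape (None, 10)
# Prediction function pred = softmax(Y): applying the normalized exponential function
# to Y turns the outputs into probabilities, which are treated as the predicted
# probabilities of the digits 0-9; the largest one gives the prediction.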
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Minimize error using cross entropy
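# Use cross entropy as the cost function
# With y of shape (None, 10) and pred of shape (None, 10), y*tf.log(pred) multiplies
# element-wise and gives a (None, 10) tensor;
# reduce_sum sums over axis=1 and drops that dimension; with the leading minus sign
# this is sum(p*log(1/q)), giving (None) cross entropies,
# i.e. one cross entropy per training sample.
# reduce_mean averages those (None) cross entropies into the mean cross entropy of
# this batch, which is the cost being minimized.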
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
# Gradient Descent
# Apply gradient descent to the cost function to optimize the variables W and b
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize the variables (i.e. assign their default value)
init = tf.global_variables_initializer()

# Start training
with tf.Session() as sess:

    # Run the initializer
    sess.run(init)

    # Training cycle
    # Train for training_epochs epochs
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        # Take batch_size samples each time, so total_batch iterations are needed
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            # Run one training step and record the cost of each batch in order to
            # compute the average cost over the whole training set
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch

        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost))

    print("Optimization Finished!")

    # Test model
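    # argmax returns the index of the largest value of pred along axis=1; pred is
    # (None, 10), so this yields (None) predicted digits (the most probable class)
    # correct_prediction is (None) booleans: whether each prediction is correct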
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    # Feed in the MNIST test data and compute the accuracy
    # Inside this session, accuracy.eval(feed_dict) is equivalent to
    # sess.run(accuracy, feed_dict)
    print("Accuracy:", accuracy.eval({x: mnist.test.images, y: mnist.test.labels}))
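
To make the cost and accuracy lines in the listing above concrete, here is a small worked example in plain Python that does the same arithmetic by hand on two samples. The label and prediction values are made up for illustration, and the helper names `cross_entropy` and `argmax` are hypothetical; this is a sketch of the calculation, not TensorFlow's implementation.

```python
import math

def cross_entropy(y_true, y_pred):
    """Per-sample cross entropy: -sum_k y_k * log(pred_k)."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

def argmax(values):
    """Index of the largest value, like tf.argmax along one row."""
    return values.index(max(values))

# Hypothetical one-hot labels and softmax outputs for 2 samples, 3 classes.
labels = [[0, 1, 0],
          [1, 0, 0]]
preds = [[0.2, 0.7, 0.1],   # most probable class is 1: correct
         [0.3, 0.6, 0.1]]   # most probable class is 1, label is 0: wrong

# Equivalent of -reduce_sum(y * log(pred), axis=1): one value per sample.
per_sample = [cross_entropy(y, p) for y, p in zip(labels, preds)]
# Equivalent of reduce_mean over the batch.
cost = sum(per_sample) / len(per_sample)

# Equivalent of equal(argmax(pred, 1), argmax(y, 1)) followed by the mean.
correct = [argmax(p) == argmax(y) for p, y in zip(preds, labels)]
accuracy = sum(correct) / len(correct)
print(per_sample, cost, accuracy)
```

Because the labels are one-hot, each per-sample cross entropy reduces to -log of the probability assigned to the true class, so a confident wrong prediction is penalized much more heavily than a correct one.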