TensorFlow Note

TensorFlow study note #1.

One | Basic

Graph

A Graph contains a set of tf.Operation objects, which represent units of computation; and tf.Tensor objects, which represent the units of data that flow between operations.

A Graph contains a series of operations (ops) and tensors (the identifiers of op outputs); it is the collection of all computation steps. TensorFlow keeps one global default Graph, so you do not need to create one manually.

# Under the default Graph
import tensorflow as tf

m1 = tf.constant([[3, 5], [3, 5]])
m2 = tf.constant([[2, 4], [2, 4]])
result = tf.add(m1, m2)

# Create a new Graph g1
g1 = tf.Graph()
with g1.as_default():
    m3 = tf.constant(30)

The default Graph in this code can be drawn as the diagram below: tensors are the edges (outputs) and ops are the nodes (operations). constant and add are ops; constant creates the tensors m1 and m2, and add sums m1 and m2 to produce a new tensor. Note that the Graph only defines the computation; it does not actually execute anything.

graph LR
m1(m1) --> add(add)
m2(m2) --> add
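
Because the Graph only describes computations, printing result outside a Session shows a symbolic handle rather than the summed values. A quick sketch, reusing the definitions above:

print(result)
# Prints something like: Tensor("Add:0", shape=(2, 2), dtype=int32)
# i.e. a handle to the op's output, not the values [[5 9] [5 9]]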

Session

A Session object encapsulates the environment in which Operation objects are executed, and Tensor objects are evaluated.

A Session is the environment in which tensors are evaluated and ops are executed; within a Session you can specify which ops of a Graph to run.

# The default Session uses the default Graph
with tf.Session() as sess:
    print(sess.run(result))  # Output: [[5 9] [5 9]]

# Explicitly use g1
with tf.Session(graph=g1) as sess:
    print(sess.run(m3))  # Output: 30
    print(sess.run(m1))  # Error: m1 is not defined in g1

# placeholder and run(feed_dict=)
a = tf.placeholder(tf.int16)
b = tf.placeholder(tf.int16)
add = tf.add(a, b)

with tf.Session() as sess:
    print("Addition with variables: %i" % sess.run(add, feed_dict={a: 2, b: 3}))

Op

An Operation is a node in a TensorFlow Graph that takes zero or more Tensor objects as input, and produces zero or more Tensor objects as output.

An Op is a node in the Graph: it takes tensors as input and produces tensors as output.

Notes on individual ops are collected below.

tf.constant

tensor = tf.constant([1, 2, 3])  # => [1 2 3]

tensor = tf.constant(-1.0, shape=[2, 3])  # => [[-1. -1. -1.]
                                          #     [-1. -1. -1.]]

m = tf.constant([[1, 2, 3], [4, 5, 6]], shape=[3, 3], verify_shape=True)  # Error: value shape (2, 3) does not match the requested shape

m = tf.constant([[1, 2, 3], [4, 5, 6]], shape=[3, 3])  # => [[1 2 3]
                                                       #     [4 5 6]
                                                       #     [6 6 6]]  (last value repeated to fill the shape)

tf.constant() is embedded directly in the graph: it becomes part of the graph and is loaded together with it. If you define a very large tensor via tf.constant(), the graph's memory footprint grows and loading slows down. tf.placeholder does not have this problem, so for high-dimensional data tf.placeholder is the better choice. (Unverified.)
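
One rough way to check this claim is to compare the size of the serialized GraphDef with a large constant versus a placeholder. A sketch (the names big, g_const and g_ph are illustrative):

import numpy as np
import tensorflow as tf

big = np.zeros((1000, 1000), dtype=np.float32)

g_const = tf.Graph()
with g_const.as_default():
    tf.constant(big)  # the ~4 MB of data is embedded in the graph itself
print(len(g_const.as_graph_def().SerializeToString()))  # on the order of 4,000,000 bytes

g_ph = tf.Graph()
with g_ph.as_default():
    tf.placeholder(tf.float32, shape=(1000, 1000))  # only shape/dtype metadata is stored
print(len(g_ph.as_graph_def().SerializeToString()))  # on the order of a hundred bytes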

tf.placeholder

A placeholder is essentially a declaration. If no shape is specified, it can be fed a value of any shape.

Its value must be fed using the feed_dict optional argument to Session.run(), Tensor.eval(), or Operation.run().

Its value must be fed through the feed_dict argument when it is used; see the Session section.
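
A small sketch of the "any shape" behavior (the names p and doubled are illustrative):

import tensorflow as tf

p = tf.placeholder(tf.float32)  # no shape given, so any shape can be fed
doubled = p * 2

with tf.Session() as sess:
    print(sess.run(doubled, feed_dict={p: 3.0}))             # 6.0
    print(sess.run(doubled, feed_dict={p: [[1.0], [2.0]]}))  # [[2.] [4.]]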

tf.reduce_sum

Reduces dimensions by summing.

To work out which dimension gets removed: if M has shape (6, 7, 8), i.e. a 6×7×8 three-dimensional array, then axis=0, 1 or 2 yields a two-dimensional matrix of shape (7, 8), (6, 8) or (6, 7) respectively. axis=[1, 2] removes the 7 and then the 8, leaving a one-dimensional vector of shape (6,); axis=[1, 2, 0] removes the 6 as well, turning the vector into a single point (one number). A quick shape check follows.
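
A sketch verifying that shape bookkeeping (M is an illustrative 6×7×8 tensor of ones):

import tensorflow as tf

M = tf.ones([6, 7, 8])
print(tf.reduce_sum(M, 0).shape)          # (7, 8)
print(tf.reduce_sum(M, 1).shape)          # (6, 8)
print(tf.reduce_sum(M, [1, 2]).shape)     # (6,)
print(tf.reduce_sum(M, [1, 2, 0]).shape)  # () -- a scalar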

# Official example
x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.reduce_sum(x) # 6
tf.reduce_sum(x, 0) # [2, 2, 2]
tf.reduce_sum(x, 1) # [3, 3]
tf.reduce_sum(x, 1, keepdims=True) # [[3], [3]]
tf.reduce_sum(x, [0, 1]) # 6

Arguments

  • axis (a.k.a. reduction_indices) – the dimensions to reduce; accepts a list such as [0, 1]. If None (the default), reduces all dimensions. Must be in the range [-rank(input_tensor), rank(input_tensor)).

tf.nn.softmax

The normalized exponential function.

It squashes every element into the range (0, 1) so that all elements sum to 1.
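
Concretely, softmax(x)_i = exp(x_i) / Σ_j exp(x_j). A small sketch (the logits values are illustrative):

import tensorflow as tf

logits = tf.constant([2.0, 1.0, 0.1])
probs = tf.nn.softmax(logits)

with tf.Session() as sess:
    print(sess.run(probs))  # approx. [0.659 0.242 0.099]; each in (0, 1), summing to 1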


Tensor

A Tensor is a symbolic handle to one of the outputs of an Operation. It does not hold the values of that operation’s output, but instead provides a means of computing those values in a TensorFlow tf.Session.

A Tensor is a symbolic handle to an op's output; it does not hold the output values themselves, but provides a way for a Session to compute them. In other words, it is essentially an identifier…

Notes on tensors are collected below.


Hint: the items below are really classes; they are placed in this chapter based on my own understanding.

tf.Variable

A tf.Variable represents a tensor whose value can be changed by running ops on it. Unlike tf.Tensor objects, a tf.Variable exists outside the context of a single session.run call.

A Variable is a tensor whose value can be changed; it exists outside the context of a single session.run() call. (Meaning something like a global variable?)

  • Creation: W = tf.Variable(rng.randn(), name="weight")

  • Before it can be used in a session, it must be initialized with sess.run(W.initializer) or sess.run(tf.global_variables_initializer()); the variable takes the value from its creation statement (here the random number rng.randn()). See the sketch below.

  • trainable defaults to True.
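
A minimal sketch of the create → initialize → update lifecycle (the values and the names W and assign_op are illustrative):

import tensorflow as tf

W = tf.Variable(0.3, name="weight")  # the initial value comes from the creation statement
assign_op = tf.assign(W, W + 1.0)    # an op that changes the Variable's value

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # must initialize before use
    print(sess.run(W))   # 0.3
    sess.run(assign_op)
    print(sess.run(W))   # 1.3 -- the new value persists across run() calls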


Optimizer

This class defines the API to add Ops to train a model.

Optimizer is the base class; it defines the API and ops for training a model.

Further reading: http://ruder.io/optimizing-gradient-descent/

GradientDescentOptimizer

The gradient descent optimizer: tf.train.GradientDescentOptimizer

Args:

  • learning_rate: A Tensor or a floating point value. The learning rate.
  • use_locking: If True, use locks for update operations.
  • name: Optional name prefix for the operations created when applying gradients. Defaults to "GradientDescent".

.minimize() simply combines calls to compute_gradients() and apply_gradients().

That is, minimize() wraps compute_gradients() followed by apply_gradients().
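
A minimal end-to-end sketch of GradientDescentOptimizer fitting y = W·x (the data and names are illustrative):

import tensorflow as tf

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
W = tf.Variable(0.0, name="weight")

loss = tf.reduce_sum(tf.square(W * x - y))                         # squared error
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_op = optimizer.minimize(loss)                                # compute_gradients() + apply_gradients()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op, feed_dict={x: [1., 2., 3.], y: [2., 4., 6.]})
    print(sess.run(W))  # approaches 2.0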