TensorFlow Note

TensorFlow study note #1.

One | Basic

Graph

A Graph contains a set of tf.Operation objects, which represent units of computation; and tf.Tensor objects, which represent the units of data that flow between operations.

A Graph contains a series of operations (ops) and tensors (the identifiers of op outputs); it is the collection of all computation steps. TensorFlow keeps one global default Graph, so you do not need to create one manually.

# Under the default Graph
import tensorflow as tf

m1 = tf.constant([[3, 5], [3, 5]])
m2 = tf.constant([[2, 4], [2, 4]])
result = tf.add(m1, m2)

# Create a new Graph g1
g1 = tf.Graph()
with g1.as_default():
    m3 = tf.constant(30)

The default Graph in this code can be drawn as the diagram below: tensors are the edges (outputs) and ops are the nodes (operations). constant and add are ops; constant creates the tensors m1 and m2, and add sums m1 and m2 to produce a new tensor. Note that the Graph only defines the computation; it does not actually execute anything.

graph LR
m1(m1) --> add(add)
m2(m2) --> add
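
Because the Graph only describes computations, printing result outside a Session shows a symbolic handle rather than the summed values. A quick sketch, reusing the definitions above:

print(result)
# Prints something like: Tensor("Add:0", shape=(2, 2), dtype=int32)
# i.e. a handle to the op's output, not the values [[5 9] [5 9]]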

Session

A Session object encapsulates the environment in which Operation objects are executed, and Tensor objects are evaluated.

A Session is the environment in which tensors are evaluated and ops are executed; within a Session you can specify which ops of a Graph to run.

# The default Session uses the default Graph
with tf.Session() as sess:
    print(sess.run(result))  # Output: [[5 9] [5 9]]

# Explicitly use g1
with tf.Session(graph=g1) as sess:
    print(sess.run(m3))  # Output: 30
    print(sess.run(m1))  # Error: m1 is not defined in g1

# placeholder and run(feed_dict=)
a = tf.placeholder(tf.int16)
b = tf.placeholder(tf.int16)
add = tf.add(a, b)

with tf.Session() as sess:
    print("Addition with variables: %i" % sess.run(add, feed_dict={a: 2, b: 3}))

Op

An Operation is a node in a TensorFlow Graph that takes zero or more Tensor objects as input, and produces zero or more Tensor objects as output.

An Op is a node in the Graph: it takes tensors as input and produces tensors as output.

Notes on individual ops are collected below.

tf.constant

tensor = tf.constant([1, 2, 3])  # => [1 2 3]

tensor = tf.constant(-1.0, shape=[2, 3])  # => [[-1. -1. -1.]
                                          #     [-1. -1. -1.]]

m = tf.constant([[1, 2, 3], [4, 5, 6]], shape=[3, 3], verify_shape=True)  # Error: value shape (2, 3) does not match the requested shape

m = tf.constant([[1, 2, 3], [4, 5, 6]], shape=[3, 3])  # => [[1 2 3]
                                                       #     [4 5 6]
                                                       #     [6 6 6]]  (last value repeated to fill the shape)

tf.constant() is embedded directly in the graph: it becomes part of the graph and is loaded together with it. If you define a very large tensor via tf.constant(), the graph's memory footprint grows and loading slows down. tf.placeholder does not have this problem, so for high-dimensional data tf.placeholder is the better choice. (Unverified.)
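
One rough way to check this claim is to compare the size of the serialized GraphDef with a large constant versus a placeholder. A sketch (the names big, g_const and g_ph are illustrative):

import numpy as np
import tensorflow as tf

big = np.zeros((1000, 1000), dtype=np.float32)

g_const = tf.Graph()
with g_const.as_default():
    tf.constant(big)  # the ~4 MB of data is embedded in the graph itself
print(len(g_const.as_graph_def().SerializeToString()))  # on the order of 4,000,000 bytes

g_ph = tf.Graph()
with g_ph.as_default():
    tf.placeholder(tf.float32, shape=(1000, 1000))  # only shape/dtype metadata is stored
print(len(g_ph.as_graph_def().SerializeToString()))  # on the order of a hundred bytes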

tf.placeholder

A placeholder is essentially a declaration. If no shape is specified, it can be fed a value of any shape.

Its value must be fed using the feed_dict optional argument to Session.run(), Tensor.eval(), or Operation.run().

Its value must be fed through the feed_dict argument when it is used; see the Session section.
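
A small sketch of the "any shape" behavior (the names p and doubled are illustrative):

import tensorflow as tf

p = tf.placeholder(tf.float32)  # no shape given, so any shape can be fed
doubled = p * 2

with tf.Session() as sess:
    print(sess.run(doubled, feed_dict={p: 3.0}))             # 6.0
    print(sess.run(doubled, feed_dict={p: [[1.0], [2.0]]}))  # [[2.] [4.]]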

tf.reduce_sum

Reduces dimensions by summing.

To work out which dimension gets removed: if M has shape (6, 7, 8), i.e. a 6×7×8 three-dimensional array, then axis=0, 1 or 2 yields a two-dimensional matrix of shape (7, 8), (6, 8) or (6, 7) respectively. axis=[1, 2] removes the 7 and then the 8, leaving a one-dimensional vector of shape (6,); axis=[1, 2, 0] removes the 6 as well, turning the vector into a single point (one number). A quick shape check follows.
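
A sketch verifying that shape bookkeeping (M is an illustrative 6×7×8 tensor of ones):

import tensorflow as tf

M = tf.ones([6, 7, 8])
print(tf.reduce_sum(M, 0).shape)          # (7, 8)
print(tf.reduce_sum(M, 1).shape)          # (6, 8)
print(tf.reduce_sum(M, [1, 2]).shape)     # (6,)
print(tf.reduce_sum(M, [1, 2, 0]).shape)  # () -- a scalar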

# Official example
x = tf.constant([[1, 1, 1], [1, 1, 1]])
tf.reduce_sum(x) # 6
tf.reduce_sum(x, 0) # [2, 2, 2]
tf.reduce_sum(x, 1) # [3, 3]
tf.reduce_sum(x, 1, keepdims=True) # [[3], [3]]
tf.reduce_sum(x, [0, 1]) # 6

Arguments

  • axis (a.k.a. reduction_indices) – the dimensions to reduce; accepts a list such as [0, 1]. If None (the default), reduces all dimensions. Must be in the range [-rank(input_tensor), rank(input_tensor)).

tf.nn.softmax

The normalized exponential function.

It squashes every element into the range (0, 1) so that all elements sum to 1.
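
Concretely, softmax(x)_i = exp(x_i) / Σ_j exp(x_j). A small sketch (the logits values are illustrative):

import tensorflow as tf

logits = tf.constant([2.0, 1.0, 0.1])
probs = tf.nn.softmax(logits)

with tf.Session() as sess:
    print(sess.run(probs))  # approx. [0.659 0.242 0.099]; each in (0, 1), summing to 1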


Tensor

A Tensor is a symbolic handle to one of the outputs of an Operation. It does not hold the values of that operation’s output, but instead provides a means of computing those values in a TensorFlow tf.Session.

A Tensor is a symbolic handle to an op's output; it does not hold the output values themselves, but provides a way for a Session to compute them. In other words, it is essentially an identifier…

Notes on tensors are collected below.


Hint: the items below are really classes; they are placed in this chapter based on my own understanding.

tf.Variable

A tf.Variable represents a tensor whose value can be changed by running ops on it. Unlike tf.Tensor objects, a tf.Variable exists outside the context of a single session.run call.

A Variable is a tensor whose value can be changed; it exists outside the context of a single session.run() call. (Meaning something like a global variable?)

  • Creation: W = tf.Variable(rng.randn(), name="weight")

  • Before it can be used in a session, it must be initialized with sess.run(W.initializer) or sess.run(tf.global_variables_initializer()); the variable takes the value from its creation statement (here the random number rng.randn()). See the sketch below.

  • trainable defaults to True.
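
A minimal sketch of the create → initialize → update lifecycle (the values and the names W and assign_op are illustrative):

import tensorflow as tf

W = tf.Variable(0.3, name="weight")  # the initial value comes from the creation statement
assign_op = tf.assign(W, W + 1.0)    # an op that changes the Variable's value

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # must initialize before use
    print(sess.run(W))   # 0.3
    sess.run(assign_op)
    print(sess.run(W))   # 1.3 -- the new value persists across run() calls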


Optimizer

This class defines the API to add Ops to train a model.

Optimizer is the base class; it defines the API and ops for training a model.

Further reading: http://ruder.io/optimizing-gradient-descent/

GradientDescentOptimizer

The gradient descent optimizer: tf.train.GradientDescentOptimizer

Args:

  • learning_rate: A Tensor or a floating point value. The learning rate.
  • use_locking: If True, use locks for update operations.
  • name: Optional name prefix for the operations created when applying gradients. Defaults to "GradientDescent".

.minimize() simply combines calls to compute_gradients() and apply_gradients().

That is, minimize() wraps compute_gradients() followed by apply_gradients().
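
A minimal end-to-end sketch of GradientDescentOptimizer fitting y = W·x (the data and names are illustrative):

import tensorflow as tf

x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)
W = tf.Variable(0.0, name="weight")

loss = tf.reduce_sum(tf.square(W * x - y))                         # squared error
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
train_op = optimizer.minimize(loss)                                # compute_gradients() + apply_gradients()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op, feed_dict={x: [1., 2., 3.], y: [2., 4., 6.]})
    print(sess.run(W))  # approaches 2.0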