Input layer
Whitening (preprocessing)
Gives the learning algorithm inputs with the following properties:
- 1. Low correlation between features
- 2. All features have the same variance.
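Both properties can be obtained with PCA whitening. A minimal numpy sketch (not from the source; `eps` is a small stabilizer I've assumed to avoid division by zero):

```python
import numpy as np

def whiten(X, eps=1e-5):
    """PCA-whiten rows of X: decorrelate features, scale each to unit variance."""
    X = X - X.mean(axis=0)                  # zero-center each feature
    cov = X.T @ X / X.shape[0]              # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigendecomposition (symmetric)
    return X @ eigvecs / np.sqrt(eigvals + eps)  # rotate, then rescale

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3)) @ rng.normal(size=(3, 3))  # correlated features
Xw = whiten(X)
cov_w = Xw.T @ Xw / Xw.shape[0]  # close to the identity matrix
```

After whitening, the empirical covariance is (up to `eps`) the identity: features are decorrelated and share unit variance.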
Convolution layer (conv layer)
Local connectivity: each unit responds to a local patch of the data
Window sliding: the window slides by a preset stride to reach the next position
Depth: the number of filters applied (the depth of the resulting output)
Stride: how far the window moves at each step
Padding: extra values (typically zeros) added around the edge of the matrix (to handle inputs whose size does not divide evenly into windows)
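These four hyperparameters determine the spatial size of the output. A small helper (my own illustration, using the standard formula) shows how padding can preserve the input size:

```python
def conv_output_size(w, f, p, s):
    """Spatial output size of a conv layer:
    input width w, filter size f, padding p, stride s."""
    return (w - f + 2 * p) // s + 1

# 32x32 input, 5x5 filter, stride 1, no padding  -> shrinks to 28
no_pad = conv_output_size(32, 5, 0, 1)
# same setup with padding 2 ("SAME"-style)       -> stays 32
with_pad = conv_output_size(32, 5, 2, 1)
```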
Activation layer (ReLU layer)
Applies a mapping function to perform a non-linear mapping
(1) Sigmoid and tanh are used in fully connected layers
(2) ReLU is used in convolution layers (it converges quickly, though results are not always the best)
(3) ELU is in widespread use
(4) Maxout: takes the maximum over several candidate values
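The three conv-side activations above are easy to state in numpy. A minimal sketch (the maxout weight shapes are my own illustrative choice):

```python
import numpy as np

def relu(x):
    """max(0, x): cheap, sparse, fast to converge."""
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    """Like ReLU for x > 0, but smooth and saturating to -alpha for x < 0."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def maxout(x, W, b):
    """Maxout: evaluate k parallel linear pieces and keep the maximum.
    x: (n,), W: (k, n), b: (k,)."""
    return np.max(W @ x + b, axis=0)

x = np.array([-2.0, -0.5, 0.0, 1.0])
r = relu(x)  # -> [0. 0. 0. 1.]
```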
Pooling layer
(1) Max pooling
(2) Average pooling
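Both variants slide a small window and reduce it to one number. A numpy sketch of non-overlapping pooling (my own illustration; assumes the map divides evenly by the window size):

```python
import numpy as np

def pool2d(x, k=2, mode="max"):
    """Non-overlapping k x k pooling on a 2-D map (h and w divisible by k)."""
    h, w = x.shape
    patches = x.reshape(h // k, k, w // k, k)  # group into k x k blocks
    if mode == "max":
        return patches.max(axis=(1, 3))        # max pooling
    return patches.mean(axis=(1, 3))           # average pooling

x = np.arange(16, dtype=float).reshape(4, 4)
mx = pool2d(x, 2, "max")    # -> [[ 5.  7.] [13. 15.]]
avg = pool2d(x, 2, "mean")  # -> [[ 2.5  4.5] [10.5 12.5]]
```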
Fully connected layer (FC)
Aggregates the extracted features for the final computation
Dropout ("listen to all sides and you will be enlightened")
1. Keeps the CNN from depending on just a few neurons, which improves generalization
2. Merging the results of many iterations increases the model's accuracy
(each pattern of dropped neurons is effectively a different model; combining many different models improves accuracy)
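The standard trick is "inverted" dropout: surviving activations are rescaled by `1/keep_prob` so the expected value is unchanged and nothing special is needed at test time. A minimal sketch (not from the source):

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Inverted dropout: zero each unit with prob 1 - keep_prob,
    rescale survivors so the expected activation is unchanged."""
    mask = rng.random(x.shape) < keep_prob  # Bernoulli keep mask
    return x * mask / keep_prob

rng = np.random.default_rng(0)
a = np.ones((4, 1000))
out = dropout(a, 0.6, rng)
# survivors are scaled to 1/0.6; the overall mean stays near 1.0
```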
LeNet5
ResNet
Residual connections:
Allow the model to contain shortcuts, which let researchers successfully train much deeper networks; they also noticeably improve Inception blocks.
Milestones in visual models
AlexNet -> ZFNet -> VGGNet -> ResNet -> Mask R-CNN
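The shortcut simply adds the block's input to its output, so if the learned transform F is near zero the block degenerates to an identity map instead of degrading the signal. A toy numpy sketch (my own illustration, two linear layers for F):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x): the shortcut lets the gradient bypass F,
    which is what makes very deep stacks trainable."""
    f = relu(x @ w1) @ w2   # the learned transform F(x)
    return relu(f + x)      # add the shortcut, then activate

x = np.ones((1, 4))
w_zero = np.zeros((4, 4))
y = residual_block(x, w_zero, w_zero)  # F(x) = 0 -> block is the identity on x
```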
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('data/', one_hot=True)
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
MNIST ready
n_input = 784  # 28 x 28 MNIST images, flattened
def conv_basic(_input, _w, _b, _keepratio):
CNN READY
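The body of `conv_basic` is not shown here; judging by its arguments it builds the usual conv -> ReLU -> pool -> FC pipeline with dropout (`_keepratio`). The conv step itself can be sketched in plain numpy (my own illustration; like `tf.nn.conv2d`, this is cross-correlation with `'VALID'` padding):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive single-channel 2-D convolution, 'VALID' padding, stride 1."""
    h, w = img.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # dot product of the kernel with the window under it
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3))
res = conv2d_valid(img, k)  # each output is the sum of a 3x3 window
```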
a = tf.Variable(tf.random_normal([3, 3, 1, 64], stddev=0.1))
<tf.Variable 'Variable_8:0' shape=(3, 3, 1, 64) dtype=float32_ref>
# print(help(tf.nn.conv2d))
print(help(tf.nn.max_pool))
Help on function max_pool in module tensorflow.python.ops.nn_ops:
max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)
Performs the max pooling on the input.
Args:
value: A 4-D `Tensor` with shape `[batch, height, width, channels]` and
type `tf.float32`.
ksize: A list of ints that has length >= 4. The size of the window for
each dimension of the input tensor.
strides: A list of ints that has length >= 4. The stride of the sliding
window for each dimension of the input tensor.
padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
See the @{tf.nn.convolution$comment here}
data_format: A string. 'NHWC' and 'NCHW' are supported.
name: Optional name for the operation.
Returns:
A `Tensor` with type `tf.float32`. The max pooled output tensor.
None
x = tf.placeholder(tf.float32, [None, n_input])
GRAPH READY
sess = tf.Session()
Epoch: 000/015 cost: 30.928401661
Training accuracy: 0.500
Epoch: 001/015 cost: 12.954609606
Training accuracy: 0.700
Epoch: 002/015 cost: 10.392489696
Training accuracy: 0.700
Epoch: 003/015 cost: 7.254891634
Training accuracy: 0.800
Epoch: 004/015 cost: 4.977767670
Training accuracy: 0.900
Epoch: 005/015 cost: 5.414173813
Training accuracy: 0.600
Epoch: 006/015 cost: 3.057567777
Training accuracy: 0.700
Epoch: 007/015 cost: 4.929724103
Training accuracy: 0.600
Epoch: 008/015 cost: 3.192437538
Training accuracy: 0.600
Epoch: 009/015 cost: 3.224479928
Training accuracy: 0.800
Epoch: 010/015 cost: 2.720530389
Training accuracy: 0.400
Epoch: 011/015 cost: 3.000342276
Training accuracy: 0.800
Epoch: 012/015 cost: 0.639763238
Training accuracy: 1.000
Epoch: 013/015 cost: 1.897303332
Training accuracy: 0.900
Epoch: 014/015 cost: 2.295500937
Training accuracy: 0.800
OPTIMIZATION FINISHED