Input layer
Whitening (preprocessing)
Gives the learning algorithm inputs with the following properties:
- 1. Low correlation between features
- 2. All features have the same variance.
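Both properties can be obtained with PCA whitening. A minimal numpy sketch (not from the source; `eps` is a small stabilizer I've assumed to avoid division by zero):

```python
import numpy as np

def whiten(X, eps=1e-5):
    """PCA-whiten rows of X: decorrelate features, scale each to unit variance."""
    X = X - X.mean(axis=0)                  # zero-center each feature
    cov = X.T @ X / X.shape[0]              # feature covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigendecomposition (symmetric)
    return X @ eigvecs / np.sqrt(eigvals + eps)  # rotate, then rescale

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3)) @ rng.normal(size=(3, 3))  # correlated features
Xw = whiten(X)
cov_w = Xw.T @ Xw / Xw.shape[0]  # close to the identity matrix
```

After whitening, the empirical covariance is (up to `eps`) the identity: features are decorrelated and share unit variance.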
Convolution layer (conv layer)
Local connectivity: each unit responds to a local patch of the data
Window sliding: the window slides by a preset stride to reach the next position
Depth: the number of filters applied (the depth of the resulting output)
Stride: how far the window moves at each step
Padding: extra values (typically zeros) added around the edge of the matrix (to handle inputs whose size does not divide evenly into windows)
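These four hyperparameters determine the spatial size of the output. A small helper (my own illustration, using the standard formula) shows how padding can preserve the input size:

```python
def conv_output_size(w, f, p, s):
    """Spatial output size of a conv layer:
    input width w, filter size f, padding p, stride s."""
    return (w - f + 2 * p) // s + 1

# 32x32 input, 5x5 filter, stride 1, no padding  -> shrinks to 28
no_pad = conv_output_size(32, 5, 0, 1)
# same setup with padding 2 ("SAME"-style)       -> stays 32
with_pad = conv_output_size(32, 5, 2, 1)
```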
Activation layer (ReLU layer)
Applies a mapping function to perform a non-linear mapping
(1) Sigmoid and tanh are used in fully connected layers
(2) ReLU is used in convolution layers (it converges quickly, though results are not always the best)
(3) ELU is in widespread use
(4) Maxout: takes the maximum over several candidate values
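The three conv-side activations above are easy to state in numpy. A minimal sketch (the maxout weight shapes are my own illustrative choice):

```python
import numpy as np

def relu(x):
    """max(0, x): cheap, sparse, fast to converge."""
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    """Like ReLU for x > 0, but smooth and saturating to -alpha for x < 0."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def maxout(x, W, b):
    """Maxout: evaluate k parallel linear pieces and keep the maximum.
    x: (n,), W: (k, n), b: (k,)."""
    return np.max(W @ x + b, axis=0)

x = np.array([-2.0, -0.5, 0.0, 1.0])
r = relu(x)  # -> [0. 0. 0. 1.]
```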
Pooling layer
(1) Max pooling
(2) Average pooling
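Both variants slide a small window and reduce it to one number. A numpy sketch of non-overlapping pooling (my own illustration; assumes the map divides evenly by the window size):

```python
import numpy as np

def pool2d(x, k=2, mode="max"):
    """Non-overlapping k x k pooling on a 2-D map (h and w divisible by k)."""
    h, w = x.shape
    patches = x.reshape(h // k, k, w // k, k)  # group into k x k blocks
    if mode == "max":
        return patches.max(axis=(1, 3))        # max pooling
    return patches.mean(axis=(1, 3))           # average pooling

x = np.arange(16, dtype=float).reshape(4, 4)
mx = pool2d(x, 2, "max")    # -> [[ 5.  7.] [13. 15.]]
avg = pool2d(x, 2, "mean")  # -> [[ 2.5  4.5] [10.5 12.5]]
```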
Fully connected layer (FC)
Aggregates the extracted features for the final computation
Dropout ("listen to all sides and you will be enlightened")
1. Keeps the CNN from depending on just a few neurons, which improves generalization
2. Merging the results of many iterations increases the model's accuracy
(each pattern of dropped neurons is effectively a different model; combining many different models improves accuracy)
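The standard trick is "inverted" dropout: surviving activations are rescaled by `1/keep_prob` so the expected value is unchanged and nothing special is needed at test time. A minimal sketch (not from the source):

```python
import numpy as np

def dropout(x, keep_prob, rng):
    """Inverted dropout: zero each unit with prob 1 - keep_prob,
    rescale survivors so the expected activation is unchanged."""
    mask = rng.random(x.shape) < keep_prob  # Bernoulli keep mask
    return x * mask / keep_prob

rng = np.random.default_rng(0)
a = np.ones((4, 1000))
out = dropout(a, 0.6, rng)
# survivors are scaled to 1/0.6; the overall mean stays near 1.0
```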
LeNet5
ResNet
Residual connections:
Allow the model to contain shortcuts, which let researchers successfully train much deeper networks; they also noticeably improve Inception blocks.
Milestones in visual models
AlexNet -> ZFNet -> VGGNet -> ResNet -> Mask R-CNN
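The shortcut simply adds the block's input to its output, so if the learned transform F is near zero the block degenerates to an identity map instead of degrading the signal. A toy numpy sketch (my own illustration, two linear layers for F):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def residual_block(x, w1, w2):
    """y = ReLU(F(x) + x): the shortcut lets the gradient bypass F,
    which is what makes very deep stacks trainable."""
    f = relu(x @ w1) @ w2   # the learned transform F(x)
    return relu(f + x)      # add the shortcut, then activate

x = np.ones((1, 4))
w_zero = np.zeros((4, 4))
y = residual_block(x, w_zero, w_zero)  # F(x) = 0 -> block is the identity on x
```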
import numpy as np
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets('data/', one_hot=True)
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
MNIST ready
n_input = 784  # 28 x 28 MNIST images, flattened
def conv_basic(_input, _w, _b, _keepratio):
CNN READY
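The body of `conv_basic` is not shown here; judging by its arguments it builds the usual conv -> ReLU -> pool -> FC pipeline with dropout (`_keepratio`). The conv step itself can be sketched in plain numpy (my own illustration; like `tf.nn.conv2d`, this is cross-correlation with `'VALID'` padding):

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive single-channel 2-D convolution, 'VALID' padding, stride 1."""
    h, w = img.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # dot product of the kernel with the window under it
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

img = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((3, 3))
res = conv2d_valid(img, k)  # each output is the sum of a 3x3 window
```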
a = tf.Variable(tf.random_normal([3, 3, 1, 64], stddev=0.1))
<tf.Variable 'Variable_8:0' shape=(3, 3, 1, 64) dtype=float32_ref>
# print(help(tf.nn.conv2d))
print(help(tf.nn.max_pool))
Help on function max_pool in module tensorflow.python.ops.nn_ops:
max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)
Performs the max pooling on the input.
Args:
value: A 4-D `Tensor` with shape `[batch, height, width, channels]` and
type `tf.float32`.
ksize: A list of ints that has length >= 4. The size of the window for
each dimension of the input tensor.
strides: A list of ints that has length >= 4. The stride of the sliding
window for each dimension of the input tensor.
padding: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
See the @{tf.nn.convolution$comment here}
data_format: A string. 'NHWC' and 'NCHW' are supported.
name: Optional name for the operation.
Returns:
A `Tensor` with type `tf.float32`. The max pooled output tensor.
None
x = tf.placeholder(tf.float32, [None, n_input])
GRAPH READY
sess = tf.Session()
Epoch: 000/015 cost: 30.928401661
Training accuracy: 0.500
Epoch: 001/015 cost: 12.954609606
Training accuracy: 0.700
Epoch: 002/015 cost: 10.392489696
Training accuracy: 0.700
Epoch: 003/015 cost: 7.254891634
Training accuracy: 0.800
Epoch: 004/015 cost: 4.977767670
Training accuracy: 0.900
Epoch: 005/015 cost: 5.414173813
Training accuracy: 0.600
Epoch: 006/015 cost: 3.057567777
Training accuracy: 0.700
Epoch: 007/015 cost: 4.929724103
Training accuracy: 0.600
Epoch: 008/015 cost: 3.192437538
Training accuracy: 0.600
Epoch: 009/015 cost: 3.224479928
Training accuracy: 0.800
Epoch: 010/015 cost: 2.720530389
Training accuracy: 0.400
Epoch: 011/015 cost: 3.000342276
Training accuracy: 0.800
Epoch: 012/015 cost: 0.639763238
Training accuracy: 1.000
Epoch: 013/015 cost: 1.897303332
Training accuracy: 0.900
Epoch: 014/015 cost: 2.295500937
Training accuracy: 0.800
OPTIMIZATION FINISHED