[GoogLeNet]Inception-v4
References: Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning; 解析Inception-v4 (An Analysis of Inception-v4)
Architecture
Overall architecture
The output of each module is as follows:
- \(Input = 299\times 299\times 3\)
- \(Stem = 35\times 35\times 384\)
- \(4\times Inception-A = 35\times 35\times 384\)
- \(Reduction-A = 17\times 17\times 1024\)
- \(7\times Inception-B = 17\times 17\times 1024\)
- \(Reduction-B = 8\times 8\times 1536\)
- \(3\times Inception-C = 8\times 8\times 1536\)
- \(Average Pooling = 1\times 1\times 1536\)
- \(Dropout(keep 0.8) = 1536\)
- \(Softmax = 1000\)
All the convolutions not marked with “V” in the figures are same-padded meaning that their output grid matches the size of their input. Convolutions marked with “V” are valid padded, meaning that input patch of each unit is fully contained in the previous layer and the grid size of the output activation map is reduced accordingly.
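In the module diagrams below, an operation without the \(V\) mark produces an output whose spatial size equals that of its input; an operation with the \(V\) mark produces an output whose spatial size is reduced.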
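
The padding rule above can be checked with the standard output-size formula for convolution and pooling layers. The following is a minimal sketch; the helper name `conv_out_size` is hypothetical and is introduced only for illustration.

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


# "V" (valid) 3x3 convolution with stride 2: no padding, so the grid shrinks.
print(conv_out_size(299, kernel=3, stride=2))             # 149

# Unmarked (same) 3x3 convolution with stride 1: padding 1 keeps the grid size.
print(conv_out_size(147, kernel=3, stride=1, padding=1))  # 147
```

With stride 1, a same-padded layer needs padding \((k-1)/2\) for an odd kernel size \(k\), which is why the \(3\times 3\) layers below use \(P=1\) and the \(7\times 1\) / \(1\times 7\) layers use a padding of 3 along the long axis.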
Stem
The Stem module implements the early operations of the network. The derivation is as follows:
- \(Input = 299\times 299\times 3\)
- \(Conv\)
    - \(3\times 3, S=2, N=32\)
    - \(Output = 149\times 149\times 32\)
- \(Conv\)
    - \(3\times 3, N=32\)
    - \(Output = 147\times 147\times 32\)
- \(Conv\)
    - \(3\times 3, N=64, P=1\)
    - \(Output = 147\times 147\times 64\)
- \(Concat\)
    - \(Max Pool\)
        - \(S=2\)
        - \(Output = 73\times 73\times 64\)
    - \(Conv\)
        - \(3\times 3, S=2, N=96\)
        - \(Output=73\times 73\times 96\)
    - \(Cat\)
        - \(Output = 73\times 73\times 160\)
- \(Concat\)
    - \(One\)
        - \(Conv\)
            - \(1\times 1, N=64\)
            - \(Output = 73\times 73\times 64\)
        - \(Conv\)
            - \(3\times 3, N=96\)
            - \(Output = 71\times 71\times 96\)
    - \(Two\)
        - \(Conv\)
            - \(1\times 1, N=64\)
            - \(Output = 73\times 73\times 64\)
        - \(Conv\)
            - \(7\times 1, N=64, P=(3, 0)\)
            - \(Output = 73\times 73\times 64\)
        - \(Conv\)
            - \(1\times 7, N=64, P=(0, 3)\)
            - \(Output = 73\times 73\times 64\)
        - \(Conv\)
            - \(3\times 3, N=96\)
            - \(Output = 71\times 71\times 96\)
    - \(Cat\)
        - \(Output = 71\times 71\times 192\)
- \(Concat\)
    - \(Conv\)
        - \(3\times 3, S=2, N=192\)
        - \(Output = 35\times 35\times 192\)
    - \(Max Pool\)
        - \(S=2\)
        - \(Output = 35\times 35\times 192\)
    - \(Cat\)
        - \(Output = 35\times 35\times 384\)
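
The Stem derivation can be reproduced with a few lines of shape arithmetic. This is only a sketch, not the code from inception_v4.py: the names `out_size` and `stem_shape` are hypothetical, and both max-pooling layers are assumed to use a \(3\times 3\) window, which the list above does not state.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def stem_shape(size=299):
    """Return (spatial size, channels) at the end of the Stem."""
    s = out_size(size, 3, stride=2)   # 3x3 conv, S=2, N=32  -> 149
    s = out_size(s, 3)                # 3x3 conv, N=32       -> 147
    s = out_size(s, 3, padding=1)     # 3x3 conv, N=64, P=1  -> 147

    # First concat: max pool (64 channels) and 3x3 conv with S=2 (96 channels).
    s = out_size(s, 3, stride=2)      # -> 73 (3x3 pooling window is an assumption)
    channels = 64 + 96                # 160

    # Second concat: both branches end in an unpadded 3x3 conv with N=96.
    # The 1x1, 7x1 and 1x7 layers before it keep the 73x73 grid.
    s = out_size(s, 3)                # -> 71
    channels = 96 + 96                # 192

    # Third concat: 3x3 conv with S=2 (192 channels) and max pool (192 channels).
    s = out_size(s, 3, stride=2)      # -> 35
    channels = 192 + channels         # 384
    return s, channels


print(stem_shape())  # (35, 384)
```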
Inception-A
This is the Inception architecture used by GoogLeNet_BN. The derivation is as follows:
- \(Input = 35\times 35\times 384\)
- \(1\times 1\)
    - \(Conv\)
        - \(1\times 1, N=96\)
        - \(Output = 35\times 35\times 96\)
- \(3\times 3\)
    - \(Conv\)
        - \(1\times 1, N=64\)
        - \(Output = 35\times 35\times 64\)
    - \(Conv\)
        - \(3\times 3, N=96, P=1\)
        - \(Output = 35\times 35\times 96\)
- \(double 3\times 3\)
    - \(Conv\)
        - \(1\times 1, N=64\)
        - \(Output = 35\times 35\times 64\)
    - \(Conv\)
        - \(3\times 3, N=96, P=1\)
        - \(Output = 35\times 35\times 96\)
    - \(Conv\)
        - \(3\times 3, N=96, P=1\)
        - \(Output = 35\times 35\times 96\)
- \(Pool\)
    - \(Avg Pool\)
        - \(3\times 3, P=1\)
        - \(Output = 35\times 35\times 384\)
    - \(Conv\)
        - \(1\times 1, N=96\)
        - \(Output = 35\times 35\times 96\)
- \(Concat\)
    - \(Output = 35\times 35\times 384\)
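
The four branches above can be written down as data and checked mechanically: every branch must keep the \(35\times 35\) grid, and the branch widths must add up to 384. The sketch below assumes stride 1 everywhere; `INCEPTION_A` and `run_block` are hypothetical names used only for this illustration.

```python
# Each branch is a list of (kernel, out_channels, padding) layers, all with stride 1.
# A pooling layer is written with out_channels=None because it keeps the channel count.
INCEPTION_A = {
    "1x1":        [(1, 96, 0)],
    "3x3":        [(1, 64, 0), (3, 96, 1)],
    "double 3x3": [(1, 64, 0), (3, 96, 1), (3, 96, 1)],
    "pool":       [(3, None, 1), (1, 96, 0)],
}


def run_block(branches, size, channels):
    """Return (spatial size, channels) after concatenating all branches."""
    out_channels = 0
    for layers in branches.values():
        s, c = size, channels
        for kernel, n, padding in layers:
            s = s + 2 * padding - kernel + 1  # stride-1 output size
            c = c if n is None else n
        assert s == size, "every branch must preserve the grid size"
        out_channels += c
    return size, out_channels


print(run_block(INCEPTION_A, 35, 384))  # (35, 384)
```

Because the input and output shapes are identical, the module can be stacked four times without any adapter layer.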
Reduction-A
The Reduction-A module reduces the spatial size from \(35\times 35\) to \(17\times 17\). With hyperparameters \(k=192, l=224, m=256, n=384\), the derivation is as follows:
- \(Input = 35\times 35\times 384\)
- \(3\times 3\)
    - \(Conv\)
        - \(3\times 3, S=2, N=384\)
        - \(Output = 17\times 17\times 384\)
- \(double 3\times 3\)
    - \(Conv\)
        - \(1\times 1, N=192\)
        - \(Output = 35\times 35\times 192\)
    - \(Conv\)
        - \(3\times 3, N=224, P=1\)
        - \(Output = 35\times 35\times 224\)
    - \(Conv\)
        - \(3\times 3, S=2, N=256\)
        - \(Output = 17\times 17\times 256\)
- \(Pool\)
    - \(MaxPool\)
        - \(S=2\)
        - \(Output = 17\times 17\times 384\)
- \(Concat\)
    - \(Output = 17\times 17\times 1024\)
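
A short sketch shows how the four hyperparameters determine the output of Reduction-A. The function name `reduction_a` is hypothetical, and the pooling branch is assumed to use an unpadded \(3\times 3\) window with stride 2, matching the two strided convolutions.

```python
def reduction_a(size, channels, k=192, l=224, m=256, n=384):
    """Return (spatial size, channels) after Reduction-A."""
    # Every branch ends in an unpadded 3x3 window with stride 2.
    reduced = (size - 3) // 2 + 1

    # Channel widths of the layers in each branch, in order.
    branches = {
        "3x3":        [n],          # 3x3 conv, S=2
        "double 3x3": [k, l, m],    # 1x1 -> 3x3 -> 3x3 with S=2
        "pool":       [channels],   # max pooling keeps the input channels
    }
    # Only the last layer of each branch contributes to the concatenation.
    out_channels = sum(widths[-1] for widths in branches.values())
    return reduced, out_channels


print(reduction_a(35, 384))  # (17, 1024)
```

Note that \(k\) and \(l\) only set the intermediate widths of the double \(3\times 3\) branch; the output depth is \(n + m\) plus the input depth.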
Inception-B
The Inception-B module uses the factorized convolution module from the Inception-v3 model. Its parameters are listed in the table below.
type | patch size/stride | input size | output size | depth | #1x1 | #1x7 | #7x1 | #1x7 | #7x1 |
---|---|---|---|---|---|---|---|---|---|
conv | | 17x17x1024 | 17x17x384 | 1 | 384 | | | | |
conv | | 17x17x1024 | 17x17x256 | 3 | 192 | 224 | 256 | | |
conv | | 17x17x1024 | 17x17x256 | 5 | 192 | 192 | 224 | 224 | 256 |
avg pooling | 3x3/1 | 17x17x1024 | 17x17x128 | 2 | 128 | | | | |
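
To illustrate why the \(7\times 7\) receptive field is factorized into a \(1\times 7\) and a \(7\times 1\) layer, the sketch below compares weight counts. The helper `conv_weights` is hypothetical, biases are ignored, and the channel width of 224 is chosen only to make the comparison concrete.

```python
def conv_weights(kh, kw, c_in, c_out):
    """Number of weights in a kh x kw convolution (bias ignored)."""
    return kh * kw * c_in * c_out


c = 224  # illustrative channel width, not a claim about any specific layer
full = conv_weights(7, 7, c, c)
factorized = conv_weights(1, 7, c, c) + conv_weights(7, 1, c, c)
print(full / factorized)  # 3.5

# Output depths of the four branches in the table add up to the module output.
branch_depths = [384, 256, 256, 128]
print(sum(branch_depths))  # 1024
```

At equal channel width, the factorized pair uses 14 weights per input-output channel pair instead of 49.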
Reduction-B
type | patch size/stride | input size | output size | depth | #1x1 | #1x7 | #7x1 | #3x3 |
---|---|---|---|---|---|---|---|---|
conv | stride=2 | 17x17x1024 | 8x8x192 | 2 | 192 | | | 192 |
conv | stride=2 | 17x17x1024 | 8x8x320 | 4 | 256 | 256 | 320 | 320 |
max pooling | 3x3/2 | 17x17x1024 | 8x8x1024 | 1 | | | | |
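
The Reduction-B table can be checked the same way as the earlier modules. In this sketch the names `out_size` and `reduction_b` are hypothetical, and two details are assumptions not stated in the table: the \(1\times 7\) and \(7\times 1\) layers use a padding of 3 along their long axis, and the final \(3\times 3\) layers and the pooling are unpadded.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def reduction_b(size=17, channels=1024):
    """Return (height, width, channels) after Reduction-B."""
    h = w = size
    # Longest branch: 1x1 -> 1x7 -> 7x1 keep the grid (padding 3 is an assumption).
    w = out_size(w, 7, padding=3)  # 1x7 acts along the width
    h = out_size(h, 7, padding=3)  # 7x1 acts along the height
    # Every branch ends with an unpadded 3x3 window and stride 2.
    h, w = out_size(h, 3, stride=2), out_size(w, 3, stride=2)
    # Concatenate: 1x1-3x3 branch, 1x1-1x7-7x1-3x3 branch, max pooling branch.
    return h, w, 192 + 320 + channels


print(reduction_b())  # (8, 8, 1536)
```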
Inception-C
- \(Input = 8\times 8\times 1536\)
- \(Output = 8\times 8\times 1536\)
Implementation
Define the six modules above separately, then define the Inception_v4 model. For the complete implementation, see inception_v4.py.
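
Before wiring the six modules together, it can help to confirm that the stage shapes chain correctly. The sketch below is not taken from inception_v4.py; `STAGES` and `check_pipeline` are hypothetical names, and the shapes are the ones listed in the overall architecture section.

```python
# (name, repeats, input shape, output shape), with shapes as (spatial size, channels)
STAGES = [
    ("Stem",        1, (299, 3),   (35, 384)),
    ("Inception-A", 4, (35, 384),  (35, 384)),
    ("Reduction-A", 1, (35, 384),  (17, 1024)),
    ("Inception-B", 7, (17, 1024), (17, 1024)),
    ("Reduction-B", 1, (17, 1024), (8, 1536)),
    ("Inception-C", 3, (8, 1536),  (8, 1536)),
]


def check_pipeline(stages):
    """Walk through the stages and verify that each input matches the previous output."""
    shape = stages[0][2]
    for name, repeats, in_shape, out_shape in stages:
        for _ in range(repeats):
            assert shape == in_shape, f"{name} expects {in_shape}, got {shape}"
            shape = out_shape
    return shape


print(check_pipeline(STAGES))  # (8, 1536)
```

The final \(8\times 8\times 1536\) feature map is then reduced by average pooling to a 1536-dimensional vector before dropout and the 1000-way softmax.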