Going Deeper with Convolutions

Christian Szegedy, Wei Liu, Yangqing Jia

2015 · CVPR

Problem

Framing

ImageNet CNNs improved by getting deeper and wider, but naive scaling wastes parameters and compute. The paper addresses this with Inception: parallel multi-scale branches plus $1 \times 1$ reductions that keep inference near 1.5 billion multiply-adds. GoogLeNet reaches 6.67% top-5 error on ILSVRC14.

Currently Used Methods

Foundational

Proposed Method

Architecture

GoogLeNet is a 22-layer CNN built by stacking Inception modules. Each module runs parallel $1 \times 1$, $3 \times 3$, $5 \times 5$, and pooling branches, then concatenates channels; $1 \times 1$ projections reduce cost before the expensive branches. The classifier head replaces large fully connected blocks with global average pooling, and training adds two auxiliary classifiers.

Verified architecture diagram: the naive Inception module and the dimension-reduced version, showing parallel $1 \times 1$, $3 \times 3$, $5 \times 5$, and pooling branches merged by filter concatenation.
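
To make the cost argument for the $1 \times 1$ projections concrete, the following minimal sketch counts multiply-adds for a single $5 \times 5$ branch with and without a reduction layer. The feature-map size and channel counts are illustrative assumptions chosen for this example, and the helper name `conv_multiply_adds` is hypothetical.

```python
def conv_multiply_adds(height, width, kernel, in_channels, out_channels):
    """Multiply-adds for a stride-1, same-padded convolution (bias ignored)."""
    return height * width * kernel * kernel * in_channels * out_channels


# Illustrative sizes (assumed for this sketch): a 28x28 feature map with
# 192 input channels, a 5x5 branch producing 32 channels, and a 1x1
# reduction down to 16 channels placed in front of the 5x5 convolution.
H, W, C_IN, C_REDUCED, C_OUT = 28, 28, 192, 16, 32

# Naive branch: the 5x5 convolution sees all 192 input channels.
naive_cost = conv_multiply_adds(H, W, 5, C_IN, C_OUT)

# Reduced branch: a cheap 1x1 projection first, then the 5x5 convolution
# on the narrower 16-channel tensor.
reduced_cost = (conv_multiply_adds(H, W, 1, C_IN, C_REDUCED)
                + conv_multiply_adds(H, W, 5, C_REDUCED, C_OUT))

print(f"naive 5x5 branch:   {naive_cost:,} multiply-adds")
print(f"reduced 5x5 branch: {reduced_cost:,} multiply-adds")
print(f"saving factor:      {naive_cost / reduced_cost:.1f}x")
```

Under these assumed sizes the reduced branch needs roughly a tenth of the multiply-adds of the naive one, which is the kind of saving that lets many such modules be stacked within a fixed compute budget.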

Loss / Objective

Training sums the main softmax loss and two auxiliary softmax losses.

$$\mathcal{L} = \mathcal{L}_{\mathrm{main}} + 0.3\,\mathcal{L}_{\mathrm{aux1}} + 0.3\,\mathcal{L}_{\mathrm{aux2}}$$
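
The weighted sum above can be sketched for a single training example as follows. This is a plain-Python illustration, not the authors' implementation: the function names and the per-example, list-of-logits interface are assumptions made for the example.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy of one example from raw logits, via a stable log-sum-exp."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_norm - logits[target]


def googlenet_training_loss(main_logits, aux1_logits, aux2_logits, target,
                            aux_weight=0.3):
    """Main softmax loss plus the two auxiliary softmax losses, each scaled by 0.3."""
    main = softmax_cross_entropy(main_logits, target)
    aux1 = softmax_cross_entropy(aux1_logits, target)
    aux2 = softmax_cross_entropy(aux2_logits, target)
    return main + aux_weight * aux1 + aux_weight * aux2


# Hypothetical logits for a 4-class toy problem; class 2 is the true label.
loss = googlenet_training_loss(
    main_logits=[0.1, 0.2, 2.5, -1.0],
    aux1_logits=[0.0, 0.3, 1.0, -0.5],
    aux2_logits=[0.2, 0.1, 1.5, -0.2],
    target=2,
)
print(f"combined training loss: {loss:.4f}")
```

The auxiliary terms only shape the gradient during training; a prediction would use the main classifier's logits alone.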

Algorithm

An Inception block applies four branch transforms to x\mathbf{x} and concatenates their outputs.

$$\mathbf{y} = \operatorname{concat}\Big( \operatorname{conv}_{1 \times 1}(\mathbf{x}),\ \operatorname{conv}_{3 \times 3}(\operatorname{conv}_{1 \times 1}(\mathbf{x})),\ \operatorname{conv}_{5 \times 5}(\operatorname{conv}_{1 \times 1}(\mathbf{x})),\ \operatorname{conv}_{1 \times 1}(\operatorname{pool}_{3 \times 3}(\mathbf{x})) \Big)$$
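
Channel concatenation only works if all four branches produce the same spatial size. The sketch below does the shape bookkeeping for one block, assuming stride 1 and "same" padding in every layer (an assumption of this example, as are the names `InceptionWidths` and `inception_output_shape` and the branch widths used in the demo).

```python
from dataclasses import dataclass


def conv_output_size(size, kernel, stride=1):
    """Spatial size after a convolution or pooling with 'same'-style padding."""
    padding = (kernel - 1) // 2
    return (size + 2 * padding - kernel) // stride + 1


@dataclass
class InceptionWidths:
    """Output channels of each layer in one block (hypothetical container)."""
    b1x1: int       # 1x1 branch
    reduce3x3: int  # 1x1 reduction before the 3x3 convolution
    b3x3: int       # 3x3 branch
    reduce5x5: int  # 1x1 reduction before the 5x5 convolution
    b5x5: int       # 5x5 branch
    pool_proj: int  # 1x1 projection after 3x3 pooling


def inception_output_shape(height, width, widths):
    """Return (height, width, channels) of the concatenated block output."""
    # Kernel sizes applied in sequence along each of the four branches.
    branch_kernels = ([1], [1, 3], [1, 5], [3, 1])
    spatial_sizes = set()
    for kernels in branch_kernels:
        h, w = height, width
        for k in kernels:
            h, w = conv_output_size(h, k), conv_output_size(w, k)
        spatial_sizes.add((h, w))
    if len(spatial_sizes) != 1:
        raise ValueError("branches disagree on spatial size; cannot concatenate")
    out_h, out_w = spatial_sizes.pop()
    # Only the last layer of each branch contributes channels to the output;
    # the reduction widths affect cost, not the concatenated channel count.
    channels = widths.b1x1 + widths.b3x3 + widths.b5x5 + widths.pool_proj
    return out_h, out_w, channels


# Illustrative widths (assumed for this sketch).
widths = InceptionWidths(b1x1=64, reduce3x3=96, b3x3=128,
                         reduce5x5=16, b5x5=32, pool_proj=32)
print(inception_output_shape(28, 28, widths))
```

Note that the reduction widths never appear in the output channel count: they change the cost of a branch but not the shape of $\mathbf{y}$.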

Training Procedure

Evaluation

Datasets

Metrics

Headline results

Table 1: ILSVRC classification challenge leaderboard by top-5 error

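| Team        | Year | Place | Error (top-5) | Uses external data |
|-------------|------|-------|---------------|--------------------|
| SuperVision | 2012 | 1st   | 16.4%         | no                 |
| SuperVision | 2012 | 1st   | 15.3%         | ImageNet 22k       |
| Clarifai    | 2013 | 1st   | 11.7%         | no                 |
| Clarifai    | 2013 | 1st   | 11.2%         | ImageNet 22k       |
| MSRA        | 2014 | 3rd   | 7.35%         | no                 |
| VGG         | 2014 | 2nd   | 7.32%         | no                 |
| GoogLeNet   | 2014 | 1st   | 6.67%         | no                 |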

Ablations

Method Strengths and Weaknesses

Strengths

Weaknesses

Suggestions from the authors

Links

Prior Papers

Further Papers