EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks

Mingxing Tan, Quoc V. Le

2019 · ICML

Problem

Framing

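Before EfficientNet, CNNs were typically scaled along one dimension at a time (depth, width, or input resolution), which uses compute inefficiently and makes accuracy saturate early. EfficientNet addresses this with compound scaling: depth, width, and input resolution are scaled jointly, starting from a single baseline found by architecture search. On ImageNet, EfficientNet-B7 reaches 84.3% top-1 accuracy with 66M parameters, matching GPipe with 8.4x fewer parameters.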

Currently Used Methods

Foundational

Proposed Method

Architecture

EfficientNet starts from EfficientNet-B0, a baseline found by neural architecture search (NAS) and built from MBConv blocks with squeeze-and-excitation and Swish activations. Scaling keeps the stage structure fixed and scales every stage with one global coefficient $\phi$.
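
To make the two named components concrete, here is a minimal pure-Python sketch of the Swish activation and a squeeze-and-excitation gate. It is an illustration under assumptions, not the authors' implementation: the function names, the list-of-lists tensor layout, the omission of biases, and the use of Swish inside the gate's bottleneck are all choices made for this sketch.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def swish(x):
    """Swish activation: x * sigmoid(x)."""
    return x * sigmoid(x)

def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def squeeze_excite(channels, w_reduce, w_expand):
    """Reweight channels with a squeeze-and-excitation gate.

    channels: list of C channels, each a flat list of spatial activations.
    w_reduce: (C/r x C) bottleneck weights; w_expand: (C x C/r) weights.
    Both weight matrices are hypothetical inputs; biases are omitted.
    """
    # Squeeze: global average pool, one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in channels]
    # Excite: bottleneck layer, then a sigmoid gate per channel.
    hidden = [swish(h) for h in matvec(w_reduce, squeezed)]
    gates = [sigmoid(g) for g in matvec(w_expand, hidden)]
    # Scale: multiply every activation in a channel by that channel's gate.
    return [[g * a for a in ch] for g, ch in zip(gates, channels)]
```

With all-zero weights every gate equals 0.5, so each channel is simply halved, which gives a quick sanity check of the data flow.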

Figure: diagram of the scaling choices, comparing width-only, depth-only, resolution-only, and the proposed compound scaling that increases all three together.

Loss / Objective

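The paper trains with the standard cross-entropy classification loss over $K$ classes, where $y_k$ is the one-hot label and $p_k$ the predicted probability for class $k$. Only the scaling rule changes between models.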

$$\mathcal{L} = - \sum_{k=1}^{K} y_k \log p_k$$
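
As a worked instance of the loss above, the following sketch computes it for a single example in plain Python. The function names and the `eps` clamp that guards against `log(0)` are assumptions of this sketch, not details taken from the paper.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities p_k (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y, p, eps=1e-12):
    """L = -sum_k y_k * log(p_k) for one example.

    y: one-hot (or soft) label vector; p: predicted probabilities.
    eps is an assumed clamp so that log(0) is never evaluated.
    """
    return -sum(y_k * math.log(max(p_k, eps)) for y_k, p_k in zip(y, p))
```

For a one-hot label the sum reduces to the negative log-probability of the true class, so predicting 0.5 for the correct class gives a loss of log 2.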

Scaling Rule / Algorithm

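Compound scaling uses a single coefficient $\phi$ to grow depth ($d$), width ($w$), and resolution ($r$) together, in fixed proportions set by the constants $\alpha$, $\beta$, $\gamma$ rather than in equal amounts. Because FLOPs grow roughly linearly with depth and quadratically with width and resolution, the constraint below makes total FLOPs grow by about $2^{\phi}$.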

$$d = \alpha^{\phi}, \qquad w = \beta^{\phi}, \qquad r = \gamma^{\phi}$$

$$\alpha \cdot \beta^2 \cdot \gamma^2 \approx 2, \qquad \alpha \ge 1, \ \beta \ge 1, \ \gamma \ge 1$$
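
The rule can be sketched in a few lines of Python. The constants 1.2, 1.1, and 1.15 are the values the paper reports from its grid search on EfficientNet-B0; everything else here is an assumption for illustration, in particular the helper `scale_stage`, its ceiling on layer counts, and its rounding of channels to a multiple of 8. The released B1 to B7 models may use hand-adjusted settings that differ from these raw formula outputs.

```python
import math

# Constants the paper reports from a small grid search on EfficientNet-B0.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15

def compound_scale(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Return the (depth, width, resolution) multipliers for coefficient phi."""
    return alpha ** phi, beta ** phi, gamma ** phi

def flops_multiplier(phi, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Approximate FLOPs growth: linear in depth, quadratic in width and resolution."""
    return (alpha * beta ** 2 * gamma ** 2) ** phi

def scale_stage(base_layers, base_channels, phi, divisor=8):
    """Hypothetical helper: scale one stage's layer and channel counts."""
    d, w, _ = compound_scale(phi)
    layers = math.ceil(d * base_layers)
    # Assumed rounding: nearest multiple of `divisor`, never below `divisor`.
    channels = max(divisor, int(w * base_channels + divisor / 2) // divisor * divisor)
    return layers, channels
```

With these constants the product $\alpha \cdot \beta^2 \cdot \gamma^2$ is about 1.92, so each unit increase in $\phi$ slightly less than doubles the estimated FLOPs.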

Training Procedure

Evaluation

Datasets

Metrics

Headline results

Table 1: ImageNet comparisons at matched accuracy levels show EfficientNet using far fewer parameters and FLOPs.

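| Model | Top-1 Acc. | Top-5 Acc. | #Params | Params ratio to EfficientNet | #FLOPs | FLOPs ratio to EfficientNet |
|---|---|---|---|---|---|---|
| EfficientNet-B0 | 77.1% | 93.3% | 5.3M | 1x | 0.39B | 1x |
| ResNet-50 | 76.0% | 93.0% | 26M | 4.9x | 4.1B | 11x |
| DenseNet-169 | 76.2% | 93.2% | 14M | 2.6x | 3.5B | 8.9x |
| EfficientNet-B1 | 79.1% | 94.4% | 7.8M | 1x | 0.70B | 1x |
| ResNet-152 | 77.8% | 93.8% | 60M | 7.6x | 11B | 16x |
| DenseNet-264 | 77.9% | 93.9% | 34M | 4.3x | 6.0B | 8.6x |
| Inception-v3 | 78.8% | 94.4% | 24M | 3.0x | 5.7B | 8.1x |
| Xception | 79.0% | 94.5% | 23M | 3.0x | 8.4B | 12x |
| EfficientNet-B2 | 80.1% | 94.9% | 9.2M | 1x | 1.0B | 1x |
| Inception-v4 | 80.0% | 95.0% | 48M | 5.2x | 13B | 13x |
| Inception-resnet-v2 | 80.1% | 95.1% | 56M | 6.1x | 13B | 13x |
| EfficientNet-B3 | 81.6% | 95.7% | 12M | 1x | 1.8B | 1x |
| ResNeXt-101 | 80.9% | 95.6% | 84M | 7.0x | 32B | 18x |
| PolyNet | 81.3% | 95.8% | 92M | 7.7x | 35B | 19x |
| EfficientNet-B4 | 82.9% | 96.4% | 19M | 1x | 4.2B | 1x |
| SENet | 82.7% | 96.2% | 146M | 7.7x | 42B | 10x |
| NASNet-A | 82.7% | 96.2% | 89M | 4.7x | 24B | 5.7x |
| AmoebaNet-A | 82.8% | 96.1% | 87M | 4.6x | 23B | 5.5x |
| PNASNet | 82.9% | 96.2% | 86M | 4.5x | 23B | 6.0x |
| EfficientNet-B5 | 83.6% | 96.7% | 30M | 1x | 9.9B | 1x |
| AmoebaNet-C | 83.5% | 96.5% | 155M | 5.2x | 41B | 4.1x |
| EfficientNet-B6 | 84.0% | 96.8% | 43M | 1x | 19B | 1x |
| EfficientNet-B7 | 84.3% | 97.0% | 66M | 1x | 37B | 1x |
| GPipe | 84.3% | 97.0% | 557M | 8.4x | - | - |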
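
The ratio columns follow from dividing each comparison model's count by that of the EfficientNet model heading its group. A small sketch, using figures copied from the table and a hypothetical helper name `ratio`, reproduces three of the reported values:

```python
def ratio(other, efficientnet):
    """How many times larger `other` is than the matched EfficientNet model."""
    return other / efficientnet

# ResNet-50 against EfficientNet-B0 (values from the table above).
resnet50_params = ratio(26e6, 5.3e6)    # about 4.9, reported as 4.9x
resnet50_flops = ratio(4.1e9, 0.39e9)   # about 10.5, reported as 11x

# GPipe against EfficientNet-B7; GPipe's FLOPs are not listed in the table.
gpipe_params = ratio(557e6, 66e6)       # about 8.4, reported as 8.4x
```

The reported ratios are rounded, which is why the ResNet-50 FLOPs ratio appears as 11x rather than 10.5x.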

Ablations

Method Strengths and Weaknesses

Strengths

Weaknesses

Suggestions from the authors

Links

Prior Papers

Further Papers

No vault papers identified as further work yet.