Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun

2015 · NeurIPS


Problem

Framing

Region-based detectors were accurate, but proposal generation still cost seconds per image. Faster R-CNN replaces hand-engineered proposals with an RPN that shares convolutional features with Fast R-CNN, reducing proposal time to about 10 ms and reaching 73.2% mAP on VOC 2007 with VGG-16.

Currently Used Methods

Foundational

Proposed Method

Architecture

The model places an RPN on the shared convolutional map and reuses those features for Fast R-CNN detection. A $3 \times 3$ sliding window feeds sibling $1 \times 1$ heads for objectness and box regression over $k = 9$ anchors from 3 scales and 3 aspect ratios.
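The anchor layout at one sliding-window position can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `make_anchors` is hypothetical, and the specific scale values (128, 256, 512 pixels) and aspect ratios (0.5, 1, 2) are assumptions, since this summary states only that there are 3 of each.

```python
import math

def make_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return len(scales) * len(ratios) boxes (x1, y1, x2, y2) centered at (cx, cy).

    Scale and ratio values are assumed for illustration. Each anchor has
    area scale**2 and height/width equal to the ratio.
    """
    anchors = []
    for s in scales:
        area = float(s * s)
        for r in ratios:
            w = math.sqrt(area / r)  # width chosen so that w * h == area
            h = w * r                # height follows from the aspect ratio
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

# k = 3 scales x 3 ratios = 9 anchors at a single position
anchors = make_anchors(0.0, 0.0)
```

Repeating this at every position of the shared feature map gives the full anchor set that the two sibling heads score and regress.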

Verified figure: three multi-scale schemes, contrasting image pyramids, filter pyramids, and Faster R-CNN's anchor-box reference pyramid.

Loss / Objective

The RPN optimizes joint anchor classification and box regression.

$$
L(\{p_i\}, \{t_i\}) = \frac{1}{N_{\mathrm{cls}}} \sum_i L_{\mathrm{cls}}(p_i, p_i^*) + \lambda \frac{1}{N_{\mathrm{reg}}} \sum_i p_i^* L_{\mathrm{reg}}(t_i, t_i^*)
$$

$$
L_{\mathrm{cls}}(p_i, p_i^*) = - p_i^* \log p_i - (1 - p_i^*) \log (1 - p_i)
$$

$$
L_{\mathrm{reg}}(t_i, t_i^*) = R(t_i - t_i^*)
$$
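A small numeric sketch may make the objective concrete. It assumes that $R$ is the smooth L1 function, that $N_{\mathrm{cls}}$ and $N_{\mathrm{reg}}$ default to the number of anchors passed in, and that $\lambda = 10$; none of these values is stated in this summary, and the names `smooth_l1` and `rpn_loss` are hypothetical.

```python
import math

def smooth_l1(x):
    """Robust loss R, assumed here to be smooth L1."""
    ax = abs(x)
    return 0.5 * x * x if ax < 1.0 else ax - 0.5

def rpn_loss(p, p_star, t, t_star, lam=10.0, n_cls=None, n_reg=None):
    """Joint loss over anchors.

    p: predicted objectness probabilities; p_star: labels in {0, 1};
    t, t_star: predicted and target 4-vectors of box offsets.
    lam and the normalizers are assumptions, not values from the summary.
    """
    n_cls = n_cls or len(p)
    n_reg = n_reg or len(p)
    eps = 1e-12
    cls = 0.0
    reg = 0.0
    for pi, ps, ti, ts in zip(p, p_star, t, t_star):
        pi = min(max(pi, eps), 1.0 - eps)  # keep log() finite
        cls += -ps * math.log(pi) - (1 - ps) * math.log(1 - pi)
        # the p_star factor switches regression off for negative anchors
        reg += ps * sum(smooth_l1(a - b) for a, b in zip(ti, ts))
    return cls / n_cls + lam * reg / n_reg

# One positive and one negative anchor; the negative anchor's box error is ignored.
loss = rpn_loss(
    p=[0.5, 0.5],
    p_star=[1, 0],
    t=[(0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0)],
    t_star=[(0.5, 0.0, 0.0, 0.0), (9.0, 9.0, 9.0, 9.0)],
    lam=1.0,
)
```

The second anchor has a large offset error but contributes nothing to the regression term, because $p_i^* = 0$ gates it out.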

Sampling Rule / Algorithm

Anchor labels and proposal pruning are set by IoU thresholds and NMS.

$$
p_i^* = 1 \quad \text{if anchor } i \text{ has highest IoU with a ground-truth box or IoU } > 0.7
$$

$$
p_i^* = 0 \quad \text{if anchor } i \text{ has IoU } < 0.3 \text{ for all ground-truth boxes}
$$
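The labeling rule and the pruning step can be sketched in plain Python. This is an illustrative version under stated assumptions: anchors that are neither positive nor negative are marked `-1` and treated as ignored, NMS is the standard greedy variant, and its threshold of 0.7 is assumed rather than taken from this summary. The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def label_anchors(anchors, gt_boxes, hi=0.7, lo=0.3):
    """Return 1 (positive), 0 (negative), or -1 (ignored, an assumption) per anchor."""
    ious = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = []
    for row in ious:
        best = max(row, default=0.0)
        labels.append(1 if best > hi else 0 if best < lo else -1)
    # The highest-IoU anchor(s) for each ground-truth box are positive
    # even when their IoU does not exceed hi.
    for j in range(len(gt_boxes)):
        col = [row[j] for row in ious]
        top = max(col, default=0.0)
        if top > 0:
            for i, v in enumerate(col):
                if v == top:
                    labels[i] = 1
    return labels

def nms(boxes, scores, thresh=0.7):
    """Greedy NMS: keep boxes in score order, drop those overlapping a kept box."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= thresh for k in keep):
            keep.append(i)
    return keep

gt = [(0, 0, 10, 10)]
labels = label_anchors([(0, 0, 10, 10), (0, 0, 10, 5), (20, 20, 30, 30), (0, 0, 10, 8)], gt)
kept = nms([(0, 0, 10, 10), (0, 0, 10, 9), (20, 20, 30, 30)], [0.9, 0.8, 0.7])
```

The highest-IoU clause ensures every ground-truth box gets at least one positive anchor, even when no anchor clears the 0.7 threshold.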

Training Procedure

Evaluation

Datasets

Metrics

Headline results

Table 1: Proposal methods under Fast R-CNN on VOC 2007 test set

| Train-time region proposals: method | # boxes | Test-time region proposals: method | # proposals | mAP (%) |
|---|---|---|---|---|
| SS | 2000 | SS | 2000 | 58.7 |
| EB | 2000 | EB | 2000 | 58.6 |
| RPN+ZF, shared | 2000 | RPN+ZF, shared | 300 | 59.9 |

Ablations

Method Strengths and Weaknesses

Strengths

Weaknesses

Suggestions from the authors

Links

Prior Papers

Further Papers