YOLO9000: Better, Faster, Stronger

Original paper: YOLO9000: Better, Faster, Stronger

Abstract

We introduce YOLO9000, a state-of-the-art, real-time object detection system that can detect over 9000 object categories. First we propose various improvements to the YOLO detection method, both novel and drawn from prior work. The improved model, YOLOv2, is state-of-the-art on standard detection tasks like PASCAL VOC and COCO. Using a novel, multi-scale training method the same YOLOv2 model can run at varying sizes, offering an easy tradeoff between speed and accuracy. At 67 FPS, YOLOv2 gets 76.8 mAP on VOC 2007. At 40 FPS, YOLOv2 gets 78.6 mAP, outperforming state-of-the-art methods like Faster R-CNN with ResNet and SSD while still running significantly faster. Finally we propose a method to jointly train on object detection and classification. Using this method we train YOLO9000 simultaneously on the COCO detection dataset and the ImageNet classification dataset. Our joint training allows YOLO9000 to predict detections for object classes that don’t have labelled detection data. We validate our approach on the ImageNet detection task. YOLO9000 gets 19.7 mAP on the ImageNet detection validation set despite only having detection data for 44 of the 200 classes. On the 156 classes not in COCO, YOLO9000 gets 16.0 mAP. But YOLO can detect more than just 200 classes; it predicts detections for more than 9000 different object categories. And it still runs in real-time.
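
The two ImageNet detection figures in the abstract (19.7 mAP over all 200 classes, 16.0 mAP over the 156 classes not in COCO) also imply a figure for the remaining 44 classes that do have detection data. The sketch below recovers it, assuming mAP is the unweighted mean of per-class AP and treating the rounded numbers from the abstract as exact. The function name `implied_subset_map` is a hypothetical helper for illustration, not something from the paper.

```python
def implied_subset_map(overall_map, n_total, known_map, n_known):
    """Mean AP over the remaining classes.

    Assumes mAP is the unweighted mean of per-class AP, so the sum of
    per-class APs equals mAP * number_of_classes.
    """
    n_rest = n_total - n_known
    total_ap_sum = overall_map * n_total   # sum of AP over all classes
    known_ap_sum = known_map * n_known     # sum of AP over the known subset
    return (total_ap_sum - known_ap_sum) / n_rest


# Figures taken from the abstract: 19.7 mAP on 200 classes,
# 16.0 mAP on the 156 classes that are not in COCO.
coco_overlap_map = implied_subset_map(19.7, 200, 16.0, 156)
print(round(coco_overlap_map, 1))  # roughly 32.8 for the 44 classes with detection data
```

Under these assumptions, the classes with detection labels score about twice as high as the classes learned only from classification data, which is the gap the joint training method is trying to close.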

YOLOv2/YOLO9000

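  • YOLOv2: an object detection algorithm that improves on YOLOv1
  • YOLO9000: a real-time detection framework built on YOLOv2, combined with a dataset combination method and a joint training algorithm; it can detect more than 9000 categories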

Section Layout

The paper is clearly structured around three main sections:

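  1. Better: how to improve mAP
  2. Faster: how to improve FPS
  3. Stronger: a mechanism for jointly training on classification and detection data, which is combined with YOLOv2 to train YOLO9000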

Better