PP-YOLO is an optimized model based on YOLOv3 in PaddleDetection whose accuracy (mAP on COCO) and inference speed are both better than YOLOv4. PaddlePaddle 2.0.2 (available on pip now) or the Daily Version is required to run PP-YOLO.
PP-YOLO reaches 45.9% mAP (IoU=0.5:0.95) on the COCO test-dev2017 dataset. FP32 inference on a single V100 runs at 72.9 FPS, and FP16 inference with TensorRT on a single V100 runs at 155.6 FPS.
PP-YOLO and PP-YOLOv2 improve the accuracy and speed of YOLOv3. The ablation table near the end of this document lists the PP-YOLO optimization methods and what each one contributes.
Model | GPU number | images/GPU | backbone | input shape | Box APval | Box APtest | V100 FP32(FPS) | V100 TensorRT FP16(FPS) | download | config |
---|---|---|---|---|---|---|---|---|---|---|
PP-YOLO | 8 | 24 | ResNet50vd | 608 | 44.8 | 45.2 | 72.9 | 155.6 | model | config |
PP-YOLO | 8 | 24 | ResNet50vd | 512 | 43.9 | 44.4 | 89.9 | 188.4 | model | config |
PP-YOLO | 8 | 24 | ResNet50vd | 416 | 42.1 | 42.5 | 109.1 | 215.4 | model | config |
PP-YOLO | 8 | 24 | ResNet50vd | 320 | 38.9 | 39.3 | 132.2 | 242.2 | model | config |
PP-YOLO_2x | 8 | 24 | ResNet50vd | 608 | 45.3 | 45.9 | 72.9 | 155.6 | model | config |
PP-YOLO_2x | 8 | 24 | ResNet50vd | 512 | 44.4 | 45.0 | 89.9 | 188.4 | model | config |
PP-YOLO_2x | 8 | 24 | ResNet50vd | 416 | 42.7 | 43.2 | 109.1 | 215.4 | model | config |
PP-YOLO_2x | 8 | 24 | ResNet50vd | 320 | 39.5 | 40.1 | 132.2 | 242.2 | model | config |
PP-YOLO | 4 | 32 | ResNet18vd | 512 | 29.2 | 29.5 | 357.1 | 657.9 | model | config |
PP-YOLO | 4 | 32 | ResNet18vd | 416 | 28.6 | 28.9 | 409.8 | 719.4 | model | config |
PP-YOLO | 4 | 32 | ResNet18vd | 320 | 26.2 | 26.4 | 480.7 | 763.4 | model | config |
PP-YOLOv2 | 8 | 12 | ResNet50vd | 640 | 49.1 | 49.5 | 68.9 | 106.5 | model | config |
PP-YOLOv2 | 8 | 12 | ResNet101vd | 640 | 49.7 | 50.3 | 49.5 | 87.0 | model | config |
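The FPS columns in the table above can be easier to compare as per-image latency. The following is a minimal sketch of that arithmetic for the PP-YOLO ResNet50vd rows; the helper name `latency_ms` is hypothetical and the numbers are copied from the table, not re-measured.

```python
def latency_ms(fps):
    """Average time per image in milliseconds for a given throughput in FPS."""
    return 1000.0 / fps


# (input shape, V100 FP32 FPS, V100 TensorRT FP16 FPS) for PP-YOLO ResNet50vd,
# copied from the table above
rows = [
    (608, 72.9, 155.6),
    (512, 89.9, 188.4),
    (416, 109.1, 215.4),
    (320, 132.2, 242.2),
]

for shape, fp32_fps, fp16_fps in rows:
    print(
        f"{shape}: FP32 {latency_ms(fp32_fps):.1f} ms, "
        f"TensorRT FP16 {latency_ms(fp16_fps):.1f} ms, "
        f"speedup {fp16_fps / fp32_fps:.2f}x"
    )
```

Because the two columns are measured with different parts of the pipeline excluded, the speedup figure is only a rough indication.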
Notes:

- Box APval and Box APtest are evaluation results of `mAP(IoU=0.5:0.95)`.
- FP32 inference speed is measured with the inference model exported by `tools/export_model.py` and benchmarked by running `deploy/python/infer.py` with `--run_benchmark`. None of the test results include the time cost of data reading and post-processing (NMS), which is the same testing method as YOLOv4 (AlexeyAB).
- TensorRT FP16 inference speed testing also excludes the time cost of the bounding-box decoding (`yolo_box`) part compared with the FP32 testing above, which means that data reading, bounding-box decoding and post-processing (NMS) are all excluded (the same testing method as YOLOv4 (AlexeyAB), too).
- If you set `--run_benchmark=True`, install these dependencies first: `pip install pynvml psutil GPUtil`.

Model | GPU number | images/GPU | Model Size | input shape | Box APval | Box AP50val | Kirin 990 1xCore(FPS) | download | config |
---|---|---|---|---|---|---|---|---|---|
PP-YOLO_MobileNetV3_large | 4 | 32 | 28MB | 320 | 23.2 | 42.6 | 14.1 | model | config |
PP-YOLO_MobileNetV3_small | 4 | 32 | 16MB | 320 | 17.2 | 33.8 | 21.5 | model | config |
Notes:

- Box APval is the evaluation result of `mAP(IoU=0.5:0.95)`, and Box AP50val is the evaluation result of `mAP(IoU=0.5)`.

Model | GPU number | images/GPU | Model Size | Post Quant Model Size | input shape | Box APval | Kirin 990 4xCore(FPS) | download | config | post quant model |
---|---|---|---|---|---|---|---|---|---|---|
PP-YOLO tiny | 8 | 32 | 4.2MB | 1.3MB | 320 | 20.6 | 92.3 | model | config | inference model |
PP-YOLO tiny | 8 | 32 | 4.2MB | 1.3MB | 416 | 22.7 | 65.4 | model | config | inference model |
Notes:

- Box APval is the evaluation result of `mAP(IoU=0.5:0.95)`, and Box AP50val is the evaluation result of `mAP(IoU=0.5)`.

PP-YOLO models trained on the Pascal VOC dataset are as follows:
Model | GPU number | images/GPU | backbone | input shape | Box AP50val | download | config |
---|---|---|---|---|---|---|---|
PP-YOLO | 8 | 12 | ResNet50vd | 608 | 84.9 | model | config |
PP-YOLO | 8 | 12 | ResNet50vd | 416 | 84.3 | model | config |
PP-YOLO | 8 | 12 | ResNet50vd | 320 | 82.2 | model | config |
Train PP-YOLO on 8 GPUs with the following command (by default, all commands should be run from the PaddleDetection dygraph directory):
python -m paddle.distributed.launch --log_dir=./ppyolo_dygraph/ --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml > ppyolo_dygraph.log 2>&1 &
Optional: run `tools/anchor_cluster.py` to get anchors suitable for your dataset, then modify the anchor settings in the model configuration file and the reader configuration file, such as `configs/ppyolo/_base_/ppyolo_tiny.yml` and `configs/ppyolo/_base_/ppyolo_tiny_reader.yml`.
python tools/anchor_cluster.py -c configs/ppyolo/ppyolo_tiny_650e_coco.yml -n 9 -s 320 -m v2 -i 1000
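To illustrate what clustering anchors for a dataset means, here is a minimal sketch of k-means over ground-truth box sizes using `1 - IoU` as the distance. It is not the implementation in `tools/anchor_cluster.py`, whose algorithm and options may differ; the function names `wh_iou` and `kmeans_anchors` and the sample boxes are hypothetical.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes given as (width, height) and aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) pairs into k anchors; the closest anchor has the highest IoU."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # initialise from k randomly chosen boxes
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        new_anchors = []
        for i, members in enumerate(clusters):
            if not members:
                new_anchors.append(anchors[i])  # keep the anchor of an empty cluster
                continue
            mean_w = sum(m[0] for m in members) / len(members)
            mean_h = sum(m[1] for m in members) / len(members)
            new_anchors.append((mean_w, mean_h))
        if new_anchors == anchors:  # assignments no longer change the anchors
            break
        anchors = new_anchors
    # sort by area, smallest first
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Made-up box sizes (in pixels) standing in for a dataset's ground-truth boxes
sample_boxes = [(10, 10), (11, 9), (50, 60), (52, 58), (200, 180), (198, 182)]
print(kmeans_anchors(sample_boxes, k=3))
```

The resulting `(width, height)` pairs are the kind of values that would then be written into the anchor settings of the model and reader configuration files.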
Evaluate PP-YOLO on the COCO val2017 dataset on a single GPU with the following commands:
# use weights released in PaddleDetection model zoo
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
# use a checkpoint saved during training
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=output/ppyolo_r50vd_dcn_1x_coco/model_final
For evaluation on the COCO test-dev2017 dataset, use `configs/ppyolo/ppyolo_test.yml`. Download the COCO test-dev2017 dataset from the COCO dataset download page, decompress it to the paths configured by `EvalReader.dataset` in `configs/ppyolo/ppyolo_test.yml`, and run the evaluation with the following commands:
# use weights released in PaddleDetection model zoo
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_test.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
# use a checkpoint saved during training
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_test.yml -o weights=output/ppyolo_r50vd_dcn_1x_coco/model_final
Evaluation results are saved in `bbox.json`. Compress this file into a `zip` package and upload it to the COCO dataset evaluation server to get the test-dev results.
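The compression step can be done with any zip tool. A minimal sketch using Python's standard `zipfile` module is shown below; the function name `zip_results` and the output name `bbox.zip` are assumptions, and the evaluation server may have its own file naming requirements that should be checked before uploading.

```python
import zipfile
from pathlib import Path


def zip_results(json_path="bbox.json", zip_path="bbox.zip"):
    """Compress one results file into a zip archive and return the archive path."""
    json_path = Path(json_path)
    if not json_path.is_file():
        raise FileNotFoundError(f"{json_path} not found; run the evaluation first")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # arcname stores the file at the archive root instead of under its full path
        zf.write(json_path, arcname=json_path.name)
    return Path(zip_path)
```

Calling `zip_results()` from the directory that contains `bbox.json` writes `bbox.zip` next to it.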
NOTE 1: `configs/ppyolo/ppyolo_test.yml` is only for evaluation on the COCO test-dev2017 dataset. It cannot be used for training or for evaluation on the COCO val2017 dataset.
NOTE 2: Due to the overall upgrade of the dynamic graph framework, the following weights published by PaddleDetection need to be evaluated with the `--bias` flag added, for example:
# use weights released in PaddleDetection model zoo
CUDA_VISIBLE_DEVICES=0 python tools/eval.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --bias
These models are:

1. ppyolo_r50vd_dcn_1x_coco
2. ppyolo_r50vd_dcn_voc
3. ppyolo_r18vd_coco
4. ppyolo_mbv3_large_coco
5. ppyolo_mbv3_small_coco
6. ppyolo_tiny_650e_coco
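When evaluation is scripted over several configs, the rule above can be encoded once. The following is a minimal sketch, assuming configs live under `configs/ppyolo/` as in the commands above; the names `NEEDS_BIAS` and `eval_command` are hypothetical helpers, not part of PaddleDetection.

```python
# Config names listed above whose released weights need the --bias flag
NEEDS_BIAS = {
    "ppyolo_r50vd_dcn_1x_coco",
    "ppyolo_r50vd_dcn_voc",
    "ppyolo_r18vd_coco",
    "ppyolo_mbv3_large_coco",
    "ppyolo_mbv3_small_coco",
    "ppyolo_tiny_650e_coco",
}


def eval_command(config_name, weights):
    """Build the tools/eval.py argument list, adding --bias only where it is needed."""
    cmd = [
        "python", "tools/eval.py",
        "-c", f"configs/ppyolo/{config_name}.yml",
        "-o", f"weights={weights}",
    ]
    if config_name in NEEDS_BIAS:
        cmd.append("--bias")
    return cmd


print(" ".join(eval_command(
    "ppyolo_r50vd_dcn_1x_coco",
    "https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams",
)))
```

The returned list can be passed to `subprocess.run` with `CUDA_VISIBLE_DEVICES` set in the environment.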
Run inference on images on a single GPU with the following commands. Use `--infer_img` to run inference on a single image and `--infer_dir` to run inference on all images in a directory.
# run inference on a single image
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_img=demo/000000014439_640x640.jpg
# run inference on all images in the directory
CUDA_VISIBLE_DEVICES=0 python tools/infer.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams --infer_dir=demo
For inference deployment or benchmarking, export the model with `tools/export_model.py` and run inference with the Paddle Inference library using the following commands:
# export the model; it is saved in output_inference/ppyolo_r50vd_dcn_1x_coco by default
python tools/export_model.py -c configs/ppyolo/ppyolo_r50vd_dcn_1x_coco.yml -o weights=https://paddledet.bj.bcebos.com/models/ppyolo_r50vd_dcn_1x_coco.pdparams
# inference with Paddle Inference library
CUDA_VISIBLE_DEVICES=0 python deploy/python/infer.py --model_dir=output_inference/ppyolo_r50vd_dcn_1x_coco --image_file=demo/000000014439_640x640.jpg --device=GPU
Optimization methods and ablation experiments of PP-YOLO compared with YOLOv3:
NO. | Model | Box APval | Box APtest | Params(M) | FLOPs(G) | V100 FP32 FPS |
---|---|---|---|---|---|---|
A | YOLOv3-DarkNet53 | 38.9 | - | 59.13 | 65.52 | 58.2 |
B | YOLOv3-ResNet50vd-DCN | 39.1 | - | 43.89 | 44.71 | 79.2 |
C | B + LB + EMA + DropBlock | 41.4 | - | 43.89 | 44.71 | 79.2 |
D | C + IoU Loss | 41.9 | - | 43.89 | 44.71 | 79.2 |
E | D + IoU Aware | 42.5 | - | 43.90 | 44.71 | 74.9 |
F | E + Grid Sensitive | 42.8 | - | 43.90 | 44.71 | 74.8 |
G | F + Matrix NMS | 43.5 | - | 43.90 | 44.71 | 74.8 |
H | G + CoordConv | 44.0 | - | 43.93 | 44.76 | 74.1 |
I | H + SPP | 44.3 | 45.2 | 44.93 | 45.12 | 72.9 |
J | I + Better ImageNet Pretrain | 44.8 | 45.2 | 44.93 | 45.12 | 72.9 |
K | J + 2x Scheduler | 45.3 | 45.9 | 44.93 | 45.12 | 72.9 |
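To see what each step contributes, the Box APval column can be differenced row by row. The following is a minimal sketch of that arithmetic using the values from rows B through K of the table above; the helper name `incremental_gains` is hypothetical.

```python
# (step, Box AP val) copied from rows B through K of the ablation table
steps = [
    ("B", 39.1), ("C", 41.4), ("D", 41.9), ("E", 42.5), ("F", 42.8),
    ("G", 43.5), ("H", 44.0), ("I", 44.3), ("J", 44.8), ("K", 45.3),
]


def incremental_gains(rows):
    """Return (step, gain over the previous step) pairs, rounded to 0.1 AP."""
    return [
        (name, round(ap - prev_ap, 1))
        for (_, prev_ap), (name, ap) in zip(rows, rows[1:])
    ]


gains = incremental_gains(steps)
for name, gain in gains:
    print(f"{name}: +{gain:.1f}")
print(f"total B -> K: +{steps[-1][1] - steps[0][1]:.1f}")
```

The largest single step in this column is C (LB + EMA + DropBlock), and the steps from B to K add up to 6.2 AP on val2017.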
Notes:

- Box AP is the evaluation result as `mAP(IoU=0.5:0.95)`.

@article{huang2021pp,
title={PP-YOLOv2: A Practical Object Detector},
author={Huang, Xin and Wang, Xinxin and Lv, Wenyu and Bai, Xiaying and Long, Xiang and Deng, Kaipeng and Dang, Qingqing and Han, Shumin and Liu, Qiwen and Hu, Xiaoguang and others},
journal={arXiv preprint arXiv:2104.10419},
year={2021}
}
@misc{long2020ppyolo,
title={PP-YOLO: An Effective and Efficient Implementation of Object Detector},
author={Xiang Long and Kaipeng Deng and Guanzhong Wang and Yang Zhang and Qingqing Dang and Yuan Gao and Hui Shen and Jianguo Ren and Shumin Han and Errui Ding and Shilei Wen},
year={2020},
eprint={2007.12099},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
@misc{ppdet2019,
title={PaddleDetection, Object detection and instance segmentation toolkit based on PaddlePaddle.},
author={PaddlePaddle Authors},
howpublished = {\url{https://github.com/PaddlePaddle/PaddleDetection}},
year={2019}
}