Model Zoos and Baselines

Basic Settings

Test Environment

  • Python 3.7
  • PaddlePaddle daily build
  • CUDA 10.1
  • cuDNN 7.5
  • NCCL 2.4.8

General Settings

  • All models were trained and tested on the COCO17 dataset.
  • The code for YOLOv5, YOLOv6, YOLOv7 and YOLOv8 can be found in PaddleYOLO. Note that PaddleYOLO is licensed under GPL 3.0.
  • Unless otherwise noted, all ResNet backbones use the ResNet-B structure.
  • Inference time (FPS): Inference speed is reported in FPS (images per second), measured on a single Tesla V100 GPU by running tools/eval.py over the entire validation set. The measurement uses cuDNN 7.5 and a batch size of 1, and it includes data loading, the network forward pass and post-processing.
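The FPS definition above (images processed per second, with data loading, forward pass and post-processing all inside the timed region) can be illustrated with a minimal sketch. This is not the code in tools/eval.py; `measure_fps`, `process_image` and the injectable `timer` argument are hypothetical names used only for illustration.

```python
import time


def measure_fps(process_image, images, timer=time.perf_counter):
    """Return images per second for an end-to-end pass over `images`.

    `images` may be a lazy iterator, so that loading each image is
    counted inside the timed region, as in the FPS definition above.
    `process_image` stands in for forward pass plus post-processing
    on a single image (batch size 1).
    """
    start = timer()
    count = 0
    for image in images:
        process_image(image)
        count += 1
    elapsed = timer() - start
    # Guard against a zero-length interval on very coarse timers.
    return count / elapsed if elapsed > 0 else float("inf")
```

On a GPU the timed region should also wait for outstanding device work to finish before the clock is read; how to do that depends on the framework and is left out of this sketch.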

Training strategy

  • We adopt the same training strategy as Detectron.
  • The 1x strategy uses a total batch size of 8 and an initial learning rate of 0.01. The learning rate is divided by 10 after epoch 8 and again after epoch 11, and training runs for 12 epochs in total.
  • The 2x strategy doubles the 1x schedule: training runs for twice as many epochs (24), and the learning rate is divided by 10 at twice the epoch positions (after epochs 16 and 22).
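The 1x and 2x schedules described above amount to a piecewise-constant learning rate. The following is a minimal sketch, not the scheduler used in the repository; the function names are hypothetical, and it assumes epochs are counted from 0 so that "after 8 epochs" means the drop takes effect from epoch index 8.

```python
def learning_rate(epoch, schedule=1, base_lr=0.01, milestones=(8, 11), gamma=0.1):
    """Learning rate in effect during a 0-indexed epoch.

    `schedule` is 1 for the 1x strategy and 2 for the 2x strategy;
    the milestone epochs are multiplied by it. Assumption: a decay
    "after epoch N" applies from epoch index N onward.
    """
    scaled = [m * schedule for m in milestones]
    # Count how many decay points have already been passed.
    passed = sum(1 for m in scaled if epoch >= m)
    return base_lr * gamma ** passed


def total_epochs(schedule=1, base_epochs=12):
    """Total number of training epochs for the given strategy."""
    return base_epochs * schedule


if __name__ == "__main__":
    for s in (1, 2):
        rates = [learning_rate(e, schedule=s) for e in range(total_epochs(s))]
        print(f"{s}x:", [f"{r:.4f}" for r in rates])
```

Under these assumptions the 1x strategy trains at 0.01 for epochs 0 to 7, 0.001 for epochs 8 to 10, and 0.0001 for epoch 11; the 2x strategy moves the drops to epochs 16 and 22 over 24 epochs.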

ImageNet pretraining model

Paddle provides backbone models pretrained on ImageNet. All pretrained models were trained on the standard ImageNet-1K dataset. The ResNet and MobileNet backbones are high-accuracy pretrained models obtained with a cosine learning rate schedule or with SSLD knowledge distillation. Model details are available at PaddleClas.

Baseline

Object Detection

Faster R-CNN

Please refer to Faster R-CNN

YOLOv3

Please refer to YOLOv3

PP-YOLOE/PP-YOLOE+

Please refer to PP-YOLOE

PP-YOLO/PP-YOLOv2

Please refer to PP-YOLO

PicoDet

Please refer to PicoDet

RetinaNet

Please refer to RetinaNet

Cascade R-CNN

Please refer to Cascade R-CNN

SSD/SSDLite

Please refer to SSD

FCOS

Please refer to FCOS

CenterNet

Please refer to CenterNet

TTFNet/PAFNet

Please refer to TTFNet

Group Normalization

Please refer to Group Normalization

Deformable ConvNets v2

Please refer to Deformable ConvNets v2

HRNets

Please refer to HRNets

Res2Net

Please refer to Res2Net

ConvNeXt

Please refer to ConvNeXt

GFL

Please refer to GFL

TOOD

Please refer to TOOD

PSS-DET(RCNN-Enhance)

Please refer to PSS-DET

DETR

Please refer to DETR

Deformable DETR

Please refer to Deformable DETR

Sparse R-CNN

Please refer to Sparse R-CNN

Vision Transformer

Please refer to Vision Transformer

YOLOX

Please refer to YOLOX

YOLOF

Please refer to YOLOF

Instance-Segmentation

Mask R-CNN

Please refer to Mask R-CNN

Cascade R-CNN

Please refer to Cascade R-CNN

SOLOv2

Please refer to SOLOv2

PaddleYOLO

Please refer to Model Zoo for PaddleYOLO

YOLOv5

Please refer to YOLOv5

YOLOv6(v3.0)

Please refer to YOLOv6

YOLOv7

Please refer to YOLOv7

YOLOv8

Please refer to YOLOv8

RTMDet

Please refer to RTMDet

Face Detection

Please refer to Model Zoo for Face Detection

BlazeFace

Please refer to BlazeFace

Rotated Object Detection

Please refer to Model Zoo for Rotated Object Detection

PP-YOLOE-R

Please refer to PP-YOLOE-R

FCOSR

Please refer to FCOSR

S2ANet

Please refer to S2ANet

KeyPoint Detection

Please refer to Model Zoo for KeyPoint Detection

PP-TinyPose

Please refer to PP-TinyPose

HRNet

Please refer to HRNet

Lite-HRNet

Please refer to Lite-HRNet

HigherHRNet

Please refer to HigherHRNet

Multi-Object Tracking

Please refer to Model Zoo for Multi-Object Tracking

DeepSORT

Please refer to DeepSORT

ByteTrack

Please refer to ByteTrack

OC-SORT

Please refer to OC-SORT

BoT-SORT

Please refer to BoT-SORT

CenterTrack

Please refer to CenterTrack

FairMOT/MC-FairMOT

Please refer to FairMOT

JDE

Please refer to JDE