Paper:
Fourier Contour Embedding for Arbitrary-Shaped Text Detection. Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, and Wayne Zhang. CVPR, 2021.
On the CTW1500 dataset, the text detection results are as follows:
Model | Backbone | Configuration | Precision | Recall | Hmean | Download |
---|---|---|---|---|---|---|
FCE | ResNet50_dcn | configs/det/det_r50_vd_dcn_fce_ctw.yml | 88.39% | 82.18% | 85.27% | trained model |
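Hmean in the table is the harmonic mean of precision and recall. The following is a minimal sketch of that relationship; the function name `hmean` is illustrative and not part of the PaddleOCR API.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the FCE row of the table above.
score = hmean(0.8839, 0.8218)
print(f"{score:.2%}")
```

Computed from the rounded table values, this gives about 85.17%, so it may differ slightly from the listed Hmean, which is reproduced here as published.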
Before you begin, set up the environment as described in "prepare the environment" and clone the repository as described in "clone the repo".
The FCE model above is trained on the public CTW1500 text detection dataset. To download the dataset, refer to ocr_datasets.
After the data is downloaded, refer to the Text Detection Training Tutorial for training. PaddleOCR's code is modular, so training a different detection model only requires swapping the configuration file.
First, convert the model saved during FCE text detection training into an inference model. Taking the model with the ResNet50_vd_dcn backbone trained on the CTW1500 English dataset as an example (model download link), you can use the following command to convert it:
python3 tools/export_model.py -c configs/det/det_r50_vd_dcn_fce_ctw.yml -o Global.pretrained_model=./det_r50_dcn_fce_ctw_v2.0_train/best_accuracy Global.save_inference_dir=./inference/det_fce
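When the export step is scripted, the `-o` overrides can be assembled from a dictionary. The sketch below only rebuilds the command shown above; the helper name `export_cmd` is hypothetical, and it assumes the script is run from the root of the cloned repository.

```python
import shlex


def export_cmd(config: str, overrides: dict) -> list:
    """Build the argument list for tools/export_model.py with -o key=value overrides."""
    cmd = ["python3", "tools/export_model.py", "-c", config, "-o"]
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd


cmd = export_cmd(
    "configs/det/det_r50_vd_dcn_fce_ctw.yml",
    {
        "Global.pretrained_model": "./det_r50_dcn_fce_ctw_v2.0_train/best_accuracy",
        "Global.save_inference_dir": "./inference/det_fce",
    },
)
# Print the command as a single shell-ready string for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to perform the export.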
To run inference with the FCE text detection model on non-curved text, use the following command:
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img_10.jpg" --det_model_dir="./inference/det_fce/" --det_algorithm="FCE" --det_fce_box_type=quad
The visualized text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
If you want to perform curved text detection, you can execute the following command:
python3 tools/infer/predict_det.py --image_dir="./doc/imgs_en/img623.jpg" --det_model_dir="./inference/det_fce/" --det_algorithm="FCE" --det_fce_box_type=poly
The visualized text detection results are saved to the ./inference_results folder by default, and the name of the result file is prefixed with 'det_res'. Examples of results are as follows:
Note: Since the CTW1500 dataset has only 1,000 training images, mainly of English scenes, the above model performs very poorly on Chinese or curved text images.
Since the post-processing is not implemented in C++, the FCE text detection model does not support C++ inference.
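The two inference commands above differ only in the input image and in `--det_fce_box_type`. As a rough sketch, the helpers below build either command and list the saved visualizations. The names `predict_cmd` and `list_results` are hypothetical, and the set of valid box types is assumed to be just the two values used above.

```python
from pathlib import Path

VALID_BOX_TYPES = ("quad", "poly")  # assumption: only the two values shown above


def predict_cmd(image: str, model_dir: str, box_type: str) -> list:
    """Build the predict_det.py argument list for FCE with the given box type."""
    if box_type not in VALID_BOX_TYPES:
        raise ValueError(f"box_type must be one of {VALID_BOX_TYPES}, got {box_type!r}")
    return [
        "python3", "tools/infer/predict_det.py",
        f"--image_dir={image}",
        f"--det_model_dir={model_dir}",
        "--det_algorithm=FCE",
        f"--det_fce_box_type={box_type}",
    ]


def list_results(out_dir: str = "./inference_results") -> list:
    """Return files in out_dir whose names start with 'det_res', sorted by path."""
    root = Path(out_dir)
    if not root.is_dir():
        return []  # nothing has been written yet
    return sorted(p for p in root.iterdir()
                  if p.is_file() and p.name.startswith("det_res"))


curved = predict_cmd("./doc/imgs_en/img623.jpg", "./inference/det_fce/", "poly")
print(" ".join(curved))
print(list_results())
```

Run from the repository root, `subprocess.run(curved, check=True)` followed by `list_results()` would show which visualizations were produced.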
Not supported
Not supported
@InProceedings{zhu2021fourier,
  title={Fourier Contour Embedding for Arbitrary-Shaped Text Detection},
  author={Yiqin Zhu and Jianyong Chen and Lingyu Liang and Zhanghui Kuang and Lianwen Jin and Wayne Zhang},
  year={2021},
  booktitle={CVPR}
}