English | 简体中文 | हिन्दी | 日本語 | 한국인 | Pу́сский язы́к
PaddleOCR aims to create multilingual, leading, and practical OCR tools that help users train better models and put them into practice.
💥 Live Playback: Introduction to the PP-StructureV2 optimization strategy. Scan the QR code below using WeChat, follow the PaddlePaddle official account, and fill out the questionnaire to join the WeChat group and get the live link and 20 GB of OCR learning materials (including a PDF2Word application, 10 models for vertical scenarios, etc.).
🔥2022.8.24 Release PaddleOCR release/2.6
🔥2022.8 Release OCR scene application collection
2022.8 Add implementation of 8 cutting-edge algorithms
2022.5.9 Release PaddleOCR release/2.5
PaddleOCR supports a variety of cutting-edge OCR algorithms. On this basis, it provides the industrial featured models/solutions PP-OCR and PP-Structure, and covers the whole process of data production, model training, compression, inference, and deployment.
It is recommended to start with the “quick experience” section of the documentation tutorial.
For international developers, we regard PaddleOCR Discussions as our international community platform. All ideas and questions can be discussed here in English.
For Chinese developers: scan the QR code below with WeChat to join the official technical discussion group. For richer community content, please refer to the 中文README. We look forward to your participation.
Model introduction | Model name | Recommended scene | Detection model | Direction classifier | Recognition model |
---|---|---|---|---|---|
Chinese and English ultra-lightweight PP-OCRv3 model (16.2M) | ch_PP-OCRv3_xx | Mobile & Server | inference model / trained model | inference model / trained model | inference model / trained model |
English ultra-lightweight PP-OCRv3 model (13.4M) | en_PP-OCRv3_xx | Mobile & Server | inference model / trained model | inference model / trained model | inference model / trained model |
Chinese and English ultra-lightweight PP-OCRv2 model (11.6M) | ch_PP-OCRv2_xx | Mobile & Server | inference model / trained model | inference model / trained model | inference model / trained model |
Chinese and English ultra-lightweight PP-OCR model (9.4M) | ch_ppocr_mobile_v2.0_xx | Mobile & Server | inference model / trained model | inference model / trained model | inference model / trained model |
Chinese and English general PP-OCR model (143.4M) | ch_ppocr_server_v2.0_xx | Server | inference model / trained model | inference model / trained model | inference model / trained model |
PP-OCRv3 Chinese model
<img src="doc/imgs_results/PP-OCRv3/ch/PP-OCRv3-pic001.jpg" width="800">
<img src="doc/imgs_results/PP-OCRv3/ch/PP-OCRv3-pic002.jpg" width="800">
<img src="doc/imgs_results/PP-OCRv3/ch/PP-OCRv3-pic003.jpg" width="800">
PP-OCRv3 English model
<img src="doc/imgs_results/PP-OCRv3/en/en_1.png" width="800">
<img src="doc/imgs_results/PP-OCRv3/en/en_2.png" width="800">
PP-OCRv3 Multilingual model
<img src="doc/imgs_results/PP-OCRv3/multi_lang/japan_2.jpg" width="800">
<img src="doc/imgs_results/PP-OCRv3/multi_lang/korean_1.jpg" width="800">
PP-StructureV2
layout analysis + table recognition
SER (Semantic entity recognition)
To request support for a new language, submit a PR containing the following file:
{language}_dict.txt
It should contain a list of all characters in the language. See the other files in that folder for the expected format. If your language has unique elements, please tell me in advance in any way you like, such as useful links, Wikipedia articles, and so on.
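As a rough illustration of how such a dictionary file might be prepared, the sketch below collects the unique characters from a text corpus and writes them out. It assumes the dictionary format is one character per line in UTF-8; check this against the existing dictionary files before submitting. The function names `build_char_dict` and `write_char_dict` are hypothetical helpers and are not part of PaddleOCR.

```python
from pathlib import Path
from typing import Iterable, List


def build_char_dict(lines: Iterable[str]) -> List[str]:
    """Collect the unique non-whitespace characters found in the given text lines."""
    chars = set()
    for line in lines:
        chars.update(ch for ch in line if not ch.isspace())
    # Sort by code point so the output is deterministic.
    return sorted(chars)


def write_char_dict(chars: Iterable[str], path: str) -> None:
    """Write one character per line, UTF-8 encoded (assumed dictionary format)."""
    Path(path).write_text("".join(f"{ch}\n" for ch in chars), encoding="utf-8")
```

For example, `write_char_dict(build_char_dict(open("corpus.txt", encoding="utf-8")), "xx_dict.txt")` would produce a dictionary for a hypothetical language code `xx` from a hypothetical corpus file `corpus.txt`.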
For more details, please refer to the Multilingual OCR Development Plan.
This project is released under the Apache 2.0 license.