s-emanuilov committed
Commit 2642e18 · verified · 1 Parent(s): 3bc9998

Create README.md

  1. README.md +387 -0
README.md ADDED
@@ -0,0 +1,387 @@
1
+ # PP-OCRv5 ONNX Models
2
+
3
+ Fast and accurate multilingual OCR models from PaddleOCR, converted to ONNX format for easy deployment.
4
+
5
+ **Original Models**: [PaddlePaddle PP-OCRv5 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
6
+ **Converted by**: Community contribution
7
+ **Format**: ONNX (optimized for inference)
8
+ **License**: Apache 2.0
9
+
10
+ ---
11
+
12
+ ## 🎯 What's Inside
13
+
14
+ This repository contains **11 production-ready ONNX models**:
15
+
16
+ - **1 Detection Model** - Finds text in images (works with all languages)
17
+ - **7 Recognition Models** - Reads text in 39+ languages
18
+ - **3 Preprocessing Models** - Fixes rotated or distorted documents (optional)
19
+
20
+ **Total Size**: ~258 MB
21
+ **Languages**: English, French, German, Spanish, Italian, Portuguese, Russian, Ukrainian, Korean, Chinese, Japanese, Thai, Greek, and 25+ more!
22
+
23
+ ---
24
+
25
+ ## 🚀 Quick Start
26
+
27
+ ### Installation
28
+
29
+ ```bash
30
+ pip install rapidocr-onnxruntime
31
+ ```
32
+
33
+ That's it! No PaddlePaddle, no CUDA required. Works on CPU out of the box.
34
+
35
+ ### Basic Usage - English
36
+
37
+ ```python
+ from rapidocr_onnxruntime import RapidOCR
+
+ # Initialize OCR
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
+     rec_keys_path="english/ppocrv5_en_dict.txt"
+ )
+
+ # Run OCR; result is None when no text is detected
+ result, elapsed = ocr("your_image.jpg")
+
+ # Print results; each entry is [box, text, score] in rapidocr-onnxruntime 1.x
+ for box, text, confidence in result or []:
+     print(f"{text} (confidence: {float(confidence):.2%})")
+ ```
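The raw result list can be turned into plain records for downstream use. The following is a minimal sketch, assuming each entry has the `[box, text, score]` layout of rapidocr-onnxruntime 1.x; `to_records` and `min_score` are hypothetical names for illustration, not part of the library.

```python
from typing import Any, Dict, List, Optional


def to_records(result: Optional[list], min_score: float = 0.0) -> List[Dict[str, Any]]:
    """Convert raw OCR output into dicts, dropping low-confidence lines."""
    records = []
    for box, text, score in result or []:  # result is None when nothing is found
        score = float(score)  # some releases return the score as a string
        if score >= min_score:
            records.append({"text": text, "score": score, "box": box})
    return records


# Hand-written sample in the assumed layout (not real model output)
sample = [
    [[[0, 0], [10, 0], [10, 5], [0, 5]], "Hello", 0.98],
    [[[0, 6], [10, 6], [10, 11], [0, 11]], "w0rld", "0.41"],
]
print(to_records(sample, min_score=0.5))
```

Filtering on the score is optional; a threshold such as 0.5 is only an illustrative value and should be tuned on your own documents.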
56
+
57
+ ### Other Languages
58
+
59
+ Just change the model paths:
60
+
61
+ ```python
+ # French, German, Spanish, Italian, etc. (32 languages)
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
+     rec_keys_path="latin/ppocrv5_latin_dict.txt"
+ )
+
+ # Russian, Bulgarian, Ukrainian, Belarusian
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="eslav/eslav_PP-OCRv5_mobile_rec.onnx",
+     rec_keys_path="eslav/ppocrv5_eslav_dict.txt"
+ )
+
+ # Korean
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="korean/korean_PP-OCRv5_mobile_rec.onnx",
+     rec_keys_path="korean/ppocrv5_korean_dict.txt"
+ )
+
+ # Chinese / Japanese
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="chinese/PP-OCRv5_server_rec.onnx",
+     rec_keys_path="chinese/ppocrv5_dict.txt"
+ )
+
+ # Thai
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="thai/th_PP-OCRv5_mobile_rec.onnx",
+     rec_keys_path="thai/ppocrv5_th_dict.txt"
+ )
+
+ # Greek
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="greek/el_PP-OCRv5_mobile_rec.onnx",
+     rec_keys_path="greek/ppocrv5_el_dict.txt"
+ )
+ ```
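Since only the recognition model and dictionary change between languages, the paths can be kept in one table. This is a sketch under the assumption that the repository is checked out at `root`; `REC_MODELS` and `ocr_kwargs` are hypothetical helper names, while the file names are the ones listed in this README.

```python
from pathlib import Path

# Directory -> (recognition model file, dictionary file), as listed in this README
REC_MODELS = {
    "english": ("en_PP-OCRv5_mobile_rec.onnx", "ppocrv5_en_dict.txt"),
    "latin": ("latin_PP-OCRv5_mobile_rec.onnx", "ppocrv5_latin_dict.txt"),
    "eslav": ("eslav_PP-OCRv5_mobile_rec.onnx", "ppocrv5_eslav_dict.txt"),
    "korean": ("korean_PP-OCRv5_mobile_rec.onnx", "ppocrv5_korean_dict.txt"),
    "chinese": ("PP-OCRv5_server_rec.onnx", "ppocrv5_dict.txt"),
    "thai": ("th_PP-OCRv5_mobile_rec.onnx", "ppocrv5_th_dict.txt"),
    "greek": ("el_PP-OCRv5_mobile_rec.onnx", "ppocrv5_el_dict.txt"),
}


def ocr_kwargs(model_dir: str, root: str = ".") -> dict:
    """Build the RapidOCR keyword arguments for one recognition model."""
    model_file, dict_file = REC_MODELS[model_dir]  # KeyError for unknown names
    base = Path(root)
    return {
        "det_model_path": (base / "detection" / "PP-OCRv5_server_det.onnx").as_posix(),
        "rec_model_path": (base / model_dir / model_file).as_posix(),
        "rec_keys_path": (base / model_dir / dict_file).as_posix(),
    }


print(ocr_kwargs("korean"))
```

With the helper in place, switching language becomes `RapidOCR(**ocr_kwargs("korean"))`, and the shared detection model is named in one place only.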
104
+
105
+ ---
106
+
107
+ ## 📦 Available Models
108
+
109
+ ### Text Recognition Models
110
+
111
+ | Model | Languages | Accuracy | Size | Best For |
112
+ |-------|-----------|----------|------|----------|
113
+ | **english/** | English | 85.25% | 7.5 MB | English documents |
114
+ | **latin/** | French, German, Spanish, Italian, Portuguese, Dutch, Polish, Czech, + 24 more | 84.7% | 7.5 MB | European documents |
115
+ | **eslav/** | Russian, Bulgarian, Ukrainian, Belarusian, English | 81.6% | 7.5 MB | Cyrillic scripts |
116
+ | **korean/** | Korean, English | 88.0% | 13 MB | Korean documents |
117
+ | **chinese/** | Chinese, Japanese, English | - | 81 MB | CJK documents |
118
+ | **thai/** | Thai, English | 82.68% | 7.5 MB | Thai documents |
119
+ | **greek/** | Greek, English | 89.28% | 7.4 MB | Greek documents |
120
+
121
+ ### Detection Model
122
+
123
+ - **detection/** - Universal text detection (84 MB) - Works with all languages
124
+
125
+ ### Preprocessing Models (Optional)
126
+
127
+ Enhance OCR accuracy on challenging documents:
128
+
129
+ - **preprocessing/doc-orientation/** - Fixes rotated documents (6.5 MB, 99.06% accuracy)
130
+ - **preprocessing/textline-orientation/** - Fixes upside-down text (6.5 MB, 98.85% accuracy)
131
+ - **preprocessing/doc-unwarping/** - Fixes curved/warped pages (30 MB)
132
+
133
+ ---
134
+
135
+ ## 🌍 Supported Languages (39+)
136
+
137
+ ### Latin Model (32 languages)
138
+ English • French • German • Spanish • Italian • Portuguese • Dutch • Polish • Czech • Slovak • Croatian • Bosnian • Serbian (Latin) • Slovenian • Danish • Norwegian • Swedish • Icelandic • Estonian • Lithuanian • Hungarian • Albanian • Welsh • Irish • Turkish • Indonesian • Malay • Afrikaans • Swahili • Tagalog • Uzbek • Latin
139
+
140
+ ### Other Models
141
+ - **English** - English (optimized)
142
+ - **East Slavic** - Russian • Bulgarian • Ukrainian • Belarusian
143
+ - **Korean** - Korean
144
+ - **Chinese/Japanese** - Simplified Chinese • Traditional Chinese • Pinyin • Japanese (Hiragana, Katakana, Kanji)
145
+ - **Thai** - Thai
146
+ - **Greek** - Greek
147
+
148
+ ---
149
+
150
+ ## 📁 Repository Structure
151
+
152
+ ```
+ .
+ ├── detection/                  # Text detection (84 MB)
+ │   ├── PP-OCRv5_server_det.onnx
+ │   └── config.json
+ │
+ ├── english/                    # English (7.5 MB)
+ │   ├── en_PP-OCRv5_mobile_rec.onnx
+ │   ├── ppocrv5_en_dict.txt
+ │   └── config.json
+ │
+ ├── latin/                      # 32 languages (7.5 MB)
+ │   ├── latin_PP-OCRv5_mobile_rec.onnx
+ │   ├── ppocrv5_latin_dict.txt
+ │   └── config.json
+ │
+ ├── eslav/                      # Russian/Ukrainian (7.5 MB)
+ │   ├── eslav_PP-OCRv5_mobile_rec.onnx
+ │   ├── ppocrv5_eslav_dict.txt
+ │   └── config.json
+ │
+ ├── korean/                     # Korean (13 MB)
+ │   ├── korean_PP-OCRv5_mobile_rec.onnx
+ │   ├── ppocrv5_korean_dict.txt
+ │   └── config.json
+ │
+ ├── chinese/                    # Chinese/Japanese (81 MB)
+ │   ├── PP-OCRv5_server_rec.onnx
+ │   ├── ppocrv5_dict.txt
+ │   └── config.json
+ │
+ ├── thai/                       # Thai (7.5 MB)
+ │   ├── th_PP-OCRv5_mobile_rec.onnx
+ │   ├── ppocrv5_th_dict.txt
+ │   └── config.json
+ │
+ ├── greek/                      # Greek (7.4 MB)
+ │   ├── el_PP-OCRv5_mobile_rec.onnx
+ │   ├── ppocrv5_el_dict.txt
+ │   └── config.json
+ │
+ └── preprocessing/              # Optional (43 MB)
+     ├── doc-orientation/
+     ├── textline-orientation/
+     └── doc-unwarping/
+ ```
198
+
199
+ Each model directory contains:
200
+ - **`.onnx`** - The model file
201
+ - **`.txt`** - Character dictionary (recognition models only)
202
+ - **`config.json`** - Model metadata
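After downloading, the layout described above can be checked before any model is loaded. This is a small sketch using only the standard library; `missing_files` is a hypothetical helper, and it assumes the script runs from the repository root.

```python
from pathlib import Path
from typing import List


def missing_files(model_dir: str, needs_dict: bool = True) -> List[str]:
    """Return the problems found in one model directory (empty list if none)."""
    path = Path(model_dir)
    problems = []
    if not list(path.glob("*.onnx")):
        problems.append("no .onnx model file")
    if needs_dict and not list(path.glob("*.txt")):
        problems.append("no .txt character dictionary")
    if not (path / "config.json").is_file():
        problems.append("no config.json")
    return problems


# Recognition directories need a dictionary; the detection directory does not
for name in ["english", "latin", "eslav", "korean", "chinese", "thai", "greek"]:
    print(name, missing_files(name) or "ok")
print("detection", missing_files("detection", needs_dict=False) or "ok")
```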
203
+
204
+ ---
205
+
206
+ ## 💡 Why Use These Models?
207
+
208
+ ### ✅ Advantages
209
+
210
+ 1. **ONNX Format** - Fast inference, works on any platform (CPU/GPU)
211
+ 2. **No PaddlePaddle Required** - Just install `rapidocr-onnxruntime`
212
+ 3. **39+ Languages** - Multilingual support out of the box
213
+ 4. **Production Ready** - All models tested and validated
214
+ 5. **Complete Package** - Detection + Recognition + Dictionaries included
215
+ 6. **Well Documented** - Every model has detailed config and usage info
216
+
217
+ ### 📊 Performance
218
+
219
+ - **Speed**: Fast inference on CPU (~100-300ms per image)
220
+ - **Accuracy**: 30% improvement over PP-OCRv3
221
+ - **Size**: Compact models (7-84 MB each)
222
+
223
+ ---
224
+
225
+ ## 🛠️ Advanced Usage
226
+
227
+ ### With GPU Acceleration
228
+
229
+ ```bash
230
+ pip install onnxruntime-gpu
231
+ ```
232
+
233
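+ With `onnxruntime-gpu` installed and a CUDA-capable GPU available, inference can run on the GPU, which can be up to 10x faster than CPU depending on hardware and image size. Depending on the RapidOCR version, GPU execution may need to be enabled explicitly.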
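Whether the GPU can be used at all depends on the installed ONNX Runtime build. The sketch below checks this with `onnxruntime.get_available_providers()`; the helper name `cuda_provider_available` is hypothetical, and the check says nothing about whether RapidOCR itself is configured to use the provider.

```python
def cuda_provider_available() -> bool:
    """Return True if the installed onnxruntime build exposes the CUDA provider."""
    try:
        import onnxruntime as ort
    except ImportError:
        return False  # onnxruntime is not installed at all
    return "CUDAExecutionProvider" in ort.get_available_providers()


print("CUDA provider available:", cuda_provider_available())
```

If this prints `False` after installing `onnxruntime-gpu`, a CPU-only `onnxruntime` package may still be installed alongside it and taking precedence.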
234
+
235
+ ### Batch Processing
236
+
237
+ ```python
+ from rapidocr_onnxruntime import RapidOCR
+ import glob
+
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="latin/latin_PP-OCRv5_mobile_rec.onnx",
+     rec_keys_path="latin/ppocrv5_latin_dict.txt"
+ )
+
+ # Process all images in a folder
+ for image_path in glob.glob("documents/*.jpg"):
+     result, elapsed = ocr(image_path)
+     # In rapidocr-onnxruntime 1.x, elapsed is a list of per-stage timings
+     # (detection, classification, recognition), or None if nothing was found
+     total = sum(elapsed) if elapsed else 0.0
+     print(f"Processed {image_path} in {total:.2f}s")
+     for box, text, score in result or []:
+         print(f"  {text}")
+ ```
254
+
255
+ ### With Preprocessing (for rotated/distorted documents)
256
+
257
+ ```python
+ # Enable angle classification for rotated text
+ # Keyword names vary between rapidocr-onnxruntime releases; check the
+ # configuration of the version you have installed
+ ocr = RapidOCR(
+     det_model_path="detection/PP-OCRv5_server_det.onnx",
+     rec_model_path="english/en_PP-OCRv5_mobile_rec.onnx",
+     rec_keys_path="english/ppocrv5_en_dict.txt",
+     use_angle_cls=True,
+     angle_cls_model_path="preprocessing/textline-orientation/PP-LCNet_x1_0_textline_ori.onnx"
+ )
+ ```
267
+
268
+ ---
269
+
270
+ ## 📖 Model Details
271
+
272
+ ### How It Works
273
+
274
+ 1. **Detection** - Finds all text regions in the image
275
+ 2. **Recognition** - Reads text from each region using a language-specific model
276
+ 3. **Decoding** - Converts the model output to text using the character dictionary
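The decoding step can be illustrated with greedy CTC decoding, the scheme commonly used for PaddleOCR-style recognition heads. This is a sketch, not the library's implementation: it assumes class 0 is the CTC blank and class `i` maps to line `i` of the dictionary file, and `ctc_greedy_decode` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def ctc_greedy_decode(
    probs: Sequence[Sequence[float]], charset: List[str]
) -> Tuple[str, float]:
    """Greedy CTC decoding: best class per step, merge repeats, drop blanks.

    Assumes class 0 is the blank and class i (i >= 1) maps to charset[i - 1].
    """
    chars, scores = [], []
    previous = 0
    for step in probs:
        best = max(range(len(step)), key=step.__getitem__)  # argmax of this step
        if best != 0 and best != previous:
            chars.append(charset[best - 1])
            scores.append(step[best])
        previous = best
    confidence = sum(scores) / len(scores) if scores else 0.0
    return "".join(chars), confidence


# Toy example: 5 time steps over the classes [blank, "a", "b"]
toy_probs = [
    [0.1, 0.8, 0.1],    # "a"
    [0.1, 0.7, 0.2],    # "a" again, merged with the previous step
    [0.9, 0.05, 0.05],  # blank, separates the two "a" runs
    [0.1, 0.8, 0.1],    # "a"
    [0.2, 0.1, 0.7],    # "b"
]
text, confidence = ctc_greedy_decode(toy_probs, ["a", "b"])
print(text, round(confidence, 3))
```

The blank between the two runs of `a` is what allows a doubled letter to survive the merge step; without it, repeated predictions collapse into one character.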
277
+
278
+ ### Model Specifications
279
+
280
+ - **Framework**: Converted from PaddlePaddle to ONNX
281
+ - **ONNX Opset**: 11
282
+ - **Precision**: FP32
283
+ - **Input**: RGB images (dynamic size)
284
+ - **Output**: Text + confidence scores + bounding boxes
285
+
286
+ ### Accuracy Benchmarks
287
+
288
+ Tested on official PP-OCRv5 datasets:
289
+
290
+ - Greek: 89.28%
291
+ - Korean: 88.0%
292
+ - English: 85.25%
293
+ - Latin: 84.7%
294
+ - Thai: 82.68%
295
+ - East Slavic: 81.6%
296
+
297
+ ---
298
+
299
+ ## 🎯 Use Cases
300
+
301
+ - **Document Digitization** - Scan and extract text from documents
302
+ - **Multilingual OCR** - Process documents in 39+ languages
303
+ - **Mobile Apps** - Lightweight models perfect for mobile deployment
304
+ - **Batch Processing** - Process thousands of documents efficiently
305
+ - **Real-time OCR** - Fast enough for real-time applications
306
+ - **Custom Pipelines** - Integrate into your existing workflows
307
+
308
+ ---
309
+
310
+ ## 📝 Language Selection Guide
311
+
312
+ | Your Document | Use This Model |
313
+ |---------------|----------------|
314
+ | English only | `english/` |
315
+ | French, German, Spanish, Italian, etc. | `latin/` (best choice for European languages) |
316
+ | Russian, Bulgarian, Ukrainian, Belarusian | `eslav/` |
317
+ | Korean | `korean/` |
318
+ | Chinese or Japanese | `chinese/` |
319
+ | Thai | `thai/` |
320
+ | Greek | `greek/` |
321
+ | Mixed European languages | `latin/` (supports 32 languages!) |
322
+
323
+ **Pro Tip**: The `latin/` model is the most versatile - it handles 32 different languages!
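The guide above can be expressed as a lookup. The sketch below is an illustration only: `pick_model`, `DEDICATED`, and `LATIN_LANGS` are hypothetical names, the language lists are copied from this README, and English is treated as covered by every model, as the recognition table states.

```python
from typing import Iterable

# Languages with a dedicated (non-Latin) model directory
DEDICATED = {
    "english": "english", "russian": "eslav", "bulgarian": "eslav",
    "ukrainian": "eslav", "belarusian": "eslav", "korean": "korean",
    "chinese": "chinese", "japanese": "chinese", "thai": "thai", "greek": "greek",
}

# Languages covered by latin/ (English is handled via DEDICATED above)
LATIN_LANGS = {
    "french", "german", "spanish", "italian", "portuguese", "dutch", "polish",
    "czech", "slovak", "croatian", "bosnian", "serbian (latin)", "slovenian",
    "danish", "norwegian", "swedish", "icelandic", "estonian", "lithuanian",
    "hungarian", "albanian", "welsh", "irish", "turkish", "indonesian", "malay",
    "afrikaans", "swahili", "tagalog", "uzbek", "latin",
}


def pick_model(languages: Iterable[str]) -> str:
    """Return the single model directory that covers all given languages."""
    dirs = set()
    for lang in (name.strip().lower() for name in languages):
        if lang in DEDICATED:
            dirs.add(DEDICATED[lang])
        elif lang in LATIN_LANGS:
            dirs.add("latin")
        else:
            raise ValueError(f"unsupported language: {lang}")
    if len(dirs) > 1:
        dirs.discard("english")  # every other model also reads English
    if len(dirs) != 1:
        raise ValueError("no single model covers: " + ", ".join(sorted(dirs)))
    return dirs.pop()


print(pick_model(["English", "French"]))
```

Documents that mix scripts, such as Russian and Greek, raise an error here because no single recognition model in this repository covers both.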
324
+
325
+ ---
326
+
327
+ ## ❓ FAQ
328
+
329
+ **Q: Do I need PaddlePaddle installed?**
330
+ A: No! These are ONNX models. Just install `rapidocr-onnxruntime`.
331
+
332
+ **Q: Can I use GPU?**
333
+ A: Yes! Install `onnxruntime-gpu` instead of `onnxruntime`.
334
+
335
+ **Q: Which model should I use for French?**
336
+ A: Use the `latin/` model - it supports French and 31 other languages.
337
+
338
+ **Q: Are these models free to use?**
339
+ A: Yes! Licensed under Apache 2.0.
340
+
341
+ **Q: How accurate are these models?**
342
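+ A: On the benchmarks listed above, recognition accuracy ranges from 81.6% (East Slavic) to 89.28% (Greek). PP-OCRv5 is reported to be about 30% more accurate than PP-OCRv3.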
343
+
344
+ **Q: Can I use these commercially?**
345
+ A: Yes! Apache 2.0 license allows commercial use.
346
+
347
+ ---
348
+
349
+ ## 🔗 Links
350
+
351
+ - **Original Models**: [PaddlePaddle PP-OCRv5 Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
352
+ - **PaddleOCR GitHub**: [github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
353
+ - **Documentation**: [PaddleOCR Docs](https://paddlepaddle.github.io/PaddleOCR/)
354
+ - **RapidOCR**: [github.com/RapidAI/RapidOCR](https://github.com/RapidAI/RapidOCR)
355
+ - **ONNX Runtime**: [onnxruntime.ai](https://onnxruntime.ai/)
356
+
357
+ ---
358
+
359
+ ## 🙏 Credits
360
+
361
+ - **Original Models**: [PaddlePaddle Team](https://github.com/PaddlePaddle/PaddleOCR)
362
+ - **Conversion**: Community contribution using [paddle2onnx](https://github.com/PaddlePaddle/Paddle2ONNX)
363
+ - **Based on**: [PP-OCRv5 Official Collection](https://huggingface.co/collections/PaddlePaddle/pp-ocrv5-684a5356aef5b4b1d7b85e4b)
364
+
365
+ ---
366
+
367
+ ## 📄 License
368
+
369
+ Apache License 2.0 (inherited from PaddleOCR)
370
+
371
+ You are free to:
372
+ - ✅ Use commercially
+ - ✅ Modify
+ - ✅ Distribute
+ - ✅ Use privately
376
+
377
+ ---
378
+
379
+ ## 🐛 Issues & Support
380
+
381
+ For issues with:
382
+ - **These ONNX models**: Open an issue in this repository
383
+ - **Original PaddleOCR models**: [PaddleOCR Issues](https://github.com/PaddlePaddle/PaddleOCR/issues)
384
+ - **ONNX Runtime**: [onnxruntime Issues](https://github.com/microsoft/onnxruntime/issues)
385
+
386
+ ---
387
+