What is PaddleOCR?
Comprehensive OCR with multiple specialized pipelines for advanced document understanding.
Usage
from upsonic.ocr import OCR
from upsonic.ocr.layer_1.engines import (
    PaddleOCREngine, PPStructureV3Engine,
    PPChatOCRv4Engine, PaddleOCRVLEngine
)
# Also available: from upsonic.ocr import PaddleOCREngine, PPStructureV3Engine, ...

# General OCR (PP-OCRv5)
engine = PaddleOCREngine(lang='en', ocr_version='PP-OCRv5')
ocr = OCR(layer_1_ocr_engine=engine)
text = ocr.get_text('document.pdf')

# Advanced document structure recognition
engine_structure = PPStructureV3Engine(
    use_table_recognition=True,
    use_formula_recognition=True
)
ocr_structure = OCR(layer_1_ocr_engine=engine_structure)
result = ocr_structure.process_file('research_paper.pdf')

# Chat-based document understanding
engine_chat = PPChatOCRv4Engine(
    use_table_recognition=True,
    use_seal_recognition=True
)
ocr_chat = OCR(layer_1_ocr_engine=engine_chat)

# Vision-Language document understanding
engine_vl = PaddleOCRVLEngine(
    use_layout_detection=True,
    use_chart_recognition=True,
    format_block_content=True
)
ocr_vl = OCR(layer_1_ocr_engine=engine_vl)
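When several files need to go through the same engine, a small wrapper around `ocr.get_text` can collect results and keep going if one file fails. The sketch below is one possible way to do that; `extract_texts` is a hypothetical helper (not part of Upsonic) and it accepts any callable that maps a path to text, so it makes no assumptions about the engine beyond the `get_text` call shown above.

```python
from typing import Callable, Dict, Iterable, Tuple


def extract_texts(
    get_text: Callable[[str], str],
    paths: Iterable[str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Call get_text on each path and return (texts, errors), both keyed by path.

    Hypothetical helper for illustration; not part of the Upsonic API.
    """
    texts: Dict[str, str] = {}
    errors: Dict[str, str] = {}
    for path in paths:
        key = str(path)
        try:
            texts[key] = get_text(key)
        except Exception as exc:
            # Record the failure and continue so one bad file does not stop the batch.
            errors[key] = f"{type(exc).__name__}: {exc}"
    return texts, errors
```

With the `ocr` object from the usage example, this could be called as `texts, errors = extract_texts(ocr.get_text, ['document.pdf', 'research_paper.pdf'])`.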
PaddleOCREngine (General OCR)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `lang` | `str` | `'en'` | Language code |
| `ocr_version` | `str` | `'PP-OCRv5'` | OCR version (`'PP-OCRv3'`, `'PP-OCRv4'`, `'PP-OCRv5'`) |
| `use_doc_orientation_classify` | `bool` | `None` | Enable document orientation classification |
| `use_doc_unwarping` | `bool` | `None` | Enable document unwarping |
| `use_textline_orientation` | `bool` | `None` | Enable text line orientation detection |
| `text_det_limit_side_len` | `int` | `None` | Limit on detection input side length |
| `text_rec_score_thresh` | `float` | `None` | Text recognition score threshold |
| `return_word_box` | `bool` | `None` | Return word-level bounding boxes |
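To make the effect of `text_rec_score_thresh` concrete: a recognition score threshold discards recognized lines whose confidence does not clear a cutoff. The sketch below imitates that on hand-written sample data. `filter_by_score` and the sample tuples are hypothetical and are not part of Upsonic or PaddleOCR, and the strict `>` comparison is an assumption about how the engine applies the threshold.

```python
from typing import List, Tuple


def filter_by_score(lines: List[Tuple[str, float]], score_thresh: float) -> List[str]:
    """Keep only the texts whose recognition score is above score_thresh.

    Illustrative only: the real filtering happens inside the OCR engine.
    """
    return [text for text, score in lines if score > score_thresh]


# Hypothetical (text, score) pairs, standing in for raw recognition output.
sample = [("Invoice 2024", 0.98), ("T0tal", 0.41), ("Amount due", 0.87)]
kept = filter_by_score(sample, 0.5)
```

A higher threshold trades recall for precision: fewer misread lines survive, but faint or low-contrast text is more likely to be dropped.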
PPStructureV3Engine (Document Structure)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `use_table_recognition` | `bool` | `None` | Enable table recognition |
| `use_formula_recognition` | `bool` | `None` | Enable formula recognition |
| `use_seal_recognition` | `bool` | `None` | Enable seal text recognition |
| `use_chart_recognition` | `bool` | `None` | Enable chart recognition |
| `layout_threshold` | `float` | `None` | Layout detection score threshold |
| `lang` | `str` | `'en'` | Language code |
PPChatOCRv4Engine (Chat-based OCR)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `use_table_recognition` | `bool` | `None` | Enable table recognition |
| `use_seal_recognition` | `bool` | `None` | Enable seal recognition |
| `mllm_chat_bot_config` | `dict` | `None` | Multimodal LLM configuration |
| `retriever_config` | `dict` | `None` | Retriever configuration for vector search |
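The two `dict` parameters above are not described further on this page. As a rough sketch, the dictionaries might be assembled as below. The key names (`module_name`, `model_name`, `base_url`, `api_type`, `api_key`) are an assumption modeled on the configuration dictionaries in PaddleOCR's own PP-ChatOCRv4 pipeline documentation, and it is also an assumption that Upsonic forwards them unchanged. The model names, URLs, the `make_service_config` helper, and the `OCR_LLM_API_KEY` environment variable are placeholders invented for this example.

```python
import os
from typing import Dict


def make_service_config(
    module_name: str,
    model_name: str,
    base_url: str,
    api_type: str,
    api_key_env: str = "OCR_LLM_API_KEY",  # hypothetical environment variable name
) -> Dict[str, str]:
    """Build a service config dict; the key names are assumed, not confirmed by this page."""
    return {
        "module_name": module_name,
        "model_name": model_name,
        "base_url": base_url,
        "api_type": api_type,
        # Read the key from the environment rather than hard-coding it.
        "api_key": os.environ.get(api_key_env, ""),
    }


# Placeholder values: replace with the model and endpoint you actually use.
mllm_chat_bot_config = make_service_config(
    "chat_bot", "your-multimodal-model", "http://localhost:8080/", "openai"
)
retriever_config = make_service_config(
    "retriever", "your-embedding-model", "http://localhost:8081/", "openai"
)
```

The resulting dictionaries would then be passed as `PPChatOCRv4Engine(mllm_chat_bot_config=mllm_chat_bot_config, retriever_config=retriever_config)`; check the PaddleOCR documentation for the exact keys your version expects.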
PaddleOCRVLEngine (Vision-Language)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `use_layout_detection` | `bool` | `None` | Enable layout detection |
| `use_chart_recognition` | `bool` | `None` | Enable chart recognition |
| `format_block_content` | `bool` | `None` | Format content as Markdown |
| `vl_rec_backend` | `str` | `'local'` | VL recognition backend |
| `temperature` | `float` | `None` | Sampling temperature for VLM |
Supported Languages
PP-OCRv5 supports 40+ languages. PP-OCRv3 offers extensive coverage of Asian, European, and Middle Eastern languages.